Garry Tan highlights a retrieval system that combines keyword matching, graph traversal, and gap analysis, a combination described as not seen elsewhere.
The paper introduces Direct Corpus Interaction (DCI), a novel approach allowing AI agents to query raw text directly using standard terminal tools instead of traditional embedding-based retrieval. By bypassing fixed similarity interfaces and offline indexing, DCI significantly outperforms conventional sparse, dense, and reranking baselines across multiple IR and agentic search benchmarks.
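To make the idea of querying raw text concrete, here is a minimal sketch. It is not the paper's implementation: `grep_corpus` and `demo` are hypothetical names, and the pure-Python scan only stands in for an agent running something like `grep -rn` in a terminal. The point it illustrates is that no embeddings or offline index are built before the query runs.

```python
import re
import tempfile
from pathlib import Path


def grep_corpus(root, pattern, ignore_case=True):
    """Scan raw .txt files under `root`; return (relative path, line number, line) hits.

    Hypothetical stand-in for a terminal search tool: every query is a
    fresh pass over the files, with no index prepared in advance.
    """
    regex = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    hits = []
    for path in sorted(Path(root).rglob("*.txt")):
        with path.open(encoding="utf-8", errors="replace") as handle:
            for line_no, line in enumerate(handle, start=1):
                if regex.search(line):
                    rel = path.relative_to(root).as_posix()
                    hits.append((rel, line_no, line.rstrip("\n")))
    return hits


def demo():
    """Build a tiny throwaway corpus and run one query against it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "a.txt").write_text(
            "Dense retrieval uses embeddings.\nSparse retrieval uses BM25.\n",
            encoding="utf-8",
        )
        (root / "b.txt").write_text("Agents can grep raw text.\n", encoding="utf-8")
        return grep_corpus(root, r"bm25|grep")
```

An agent in this setting would typically issue several such queries in sequence, refining the pattern based on the lines returned, rather than relying on a single similarity lookup.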
A dual-view data synthesis method using polarity reversal boosts instruction-following retrieval performance by 45% on the FollowIR benchmark.