Xetrieval: Mechanistically Explaining Dense Retrieval
Summary
Xetrieval is a mechanistic framework that explains dense retrieval by enhancing sentence embeddings with reasoning information and decomposing them into interpretable sparse features, providing feature-level explanations for retrieval decisions without expensive autoregressive generation.
View Cached Full Text
Cached at: 05/29/26, 03:02 PM
Paper page - Xetrieval: Mechanistically Explaining Dense Retrieval
Source: https://huggingface.co/papers/2605.29507
Abstract
Xetrieval is a mechanistic framework that explains dense retrieval by enhancing sentence embeddings with reasoning information and decomposing them into interpretable sparse features for retrieval decision explanations.
Explaining whydense retrieversassign high relevance scores remains challenging becauseretrieval decisionsare made through opaquehigh-dimensional embeddings. Existing explanations often focus on surface signals, such as lexical matches, token alignments, or post-hoc textual rationales, and thus provide limited insight into the latent factors that shape dense retrieval behavior at the embedding level. We propose Xetrieval, an embedding-level mechanistic framework for explaining dense retrieval. Xetrieval first introduces a lightweightreasoning internalizerthat approximatesChain-of-Thought reasoningdirectly in theembedding spacewith a single forward pass, enriching sentence embeddings with reasoning-oriented information while avoiding expensive autoregressive generation. It then decomposes these reasoning-enhanced embeddings into sparse,human-interpretable features, each associated with a coherent natural language description. By aggregating sparse feature overlaps across multiple document-side views, Xetrieval providesfeature-level explanationsof individualretrieval decisions. Experiments on diverse retrievers and benchmarks show that Xetrieval uncovers coherent interpretable features, yields strongerpair-level intervention effects, and supportstask-level feature steering. The project page and source code are available at https://hihiczx.github.io/Xetrieval .
View arXiv pageView PDFProject pageGitHub12Add to collection
Get this paper in your agent:
hf papers read 2605\.29507
Don’t have the latest CLI?curl \-LsSf https://hf\.co/cli/install\.sh \| bash
Models citing this paper0
No model linking this paper
Cite arxiv.org/abs/2605.29507 in a model README.md to link it from this page.
Datasets citing this paper0
No dataset linking this paper
Cite arxiv.org/abs/2605.29507 in a dataset README.md to link it from this page.
Spaces citing this paper0
No Space linking this paper
Cite arxiv.org/abs/2605.29507 in a Space README.md to link it from this page.
Collections including this paper0
No Collection including this paper
Add this paper to acollectionto link it from this page.
Similar Articles
@_reachsumit: Latent Terms: Dense Retrievers Contain Trivially Extractable BM25-ready Zipfian Vocabularies @bclavie et al. extract in…
The paper proposes Latent Terms, a method using Sparse Autoencoders to extract BM25-ready sparse features from frozen dense retrievers, achieving competitive performance without retrieval-specific training.
Rethinking Reasoning-Intensive Retrieval: Evaluating and Advancing Retrievers in Agentic Search Systems
The paper introduces BRIGHT-Pro, a new benchmark for reasoning-intensive retrieval, and RTriever-Synth, a synthetic corpus used to fine-tune RTriever-4B for improved performance in agentic search systems.
@lateinteraction: The keynote recording is now on YouTube, for everyone who asked us to host it outside X. https://youtube.com/watch?v=Z2…
A keynote recording argues that late interaction retrieval (e.g., ColBERT-style) is the most promising direction in AI-scale information retrieval research, contending that single-vector dense retrieval is fundamentally flawed and that the IR community must raise its ambitions significantly. The talk introduces the LIMIT benchmark as evidence of dense retrieval's generalization failures and calls for a paradigm shift by 2030.
Q-RAG: Long Context Multi-step Retrieval via Value-based Embedder Training
Q-RAG introduces a reinforcement learning-based fine-tuning approach for embedder models to enable efficient multi-step retrieval, achieving state-of-the-art results on long-context benchmarks up to 10M tokens. This method provides a resource-efficient alternative to fine-tuning small LLMs for complex multi-step search tasks.
Retrieval from Within: An Intrinsic Capability of Attention-Based Models
INTRA demonstrates that attention-based models can perform retrieval directly from internal representations, unifying retrieval and generation while improving evidence recall and answer quality.