Tag
DREAM trains dense retrieval embeddings by using autoregressive language model attention to supervise query-document similarity, eliminating the need for labeled data. It consistently outperforms baselines on BEIR and RTEB benchmarks across model scales.
This paper identifies document-side early compression as a failure mode in long-document dense retrieval and introduces the Evidence Dilution Index (EDI) to measure it. The authors propose DICE, a training-free method that splits documents into chunks, encodes them independently, and aggregates them into a single vector, significantly improving retrieval on long documents.
MCompassRAG enhances retrieval-augmented generation by enriching chunk representations with topic metadata and using LLM-teacher distillation, achieving an 8.24% average improvement in information efficiency with over 5× lower latency than strong baselines.
ECI_sem is a training-free method for ranking hard negative sources in dense retrieval using frozen embeddings, achieving strong performance on MS MARCO and BEIR benchmarks.
The LateOn model with 140M parameters achieves strong results, and the community is excited about advances in multi-vector models including new CPU indexes and multilingual support.
The paper proposes Latent Terms, a method using Sparse Autoencoders to extract BM25-ready sparse features from frozen dense retrievers, achieving competitive performance without retrieval-specific training.
CoHyDE introduces an iterative co-training procedure for an LLM rewriter and a dense encoder to improve tool retrieval from large API catalogs. It outperforms single-component baselines, especially on vague queries, by training both components together using InfoNCE and DPO.
Xetrieval is a mechanistic framework that explains dense retrieval by enhancing sentence embeddings with reasoning information and decomposing them into interpretable sparse features, providing feature-level explanations for retrieval decisions without expensive autoregressive generation.
This paper benchmarks Google Embeddings 2 (GE2) against five open-source models for multilingual dense retrieval and RAG, finding that GE2 is the most accurate but slower, with mE5-L as a competitive low-latency alternative.
Raphael released two open-source retrieval models, LateOn (ColBERT multi-vector) and DenseOn (single-vector), each with 149M parameters and both outperforming models 4× larger on BEIR.
Spectral Tempering (SpecTemp) proposes a learning-free method for embedding compression in dense passage retrieval that adaptively determines optimal spectral scaling based on signal-to-noise ratio analysis, outperforming fixed hyperparameter approaches like PCA and whitening.
A keynote recording argues that late interaction retrieval (e.g., ColBERT-style) is the most promising direction in AI-scale information retrieval research, contending that single-vector dense retrieval is fundamentally flawed and that the IR community must raise its ambitions significantly. The talk introduces the LIMIT benchmark as evidence of dense retrieval's generalization failures and calls for a paradigm shift by 2030.
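
To make the contrast between late interaction and single-vector retrieval concrete, here is a minimal sketch of the two scoring rules. It assumes ColBERT-style MaxSim scoring (each query token takes its best-matching document token, and the maxima are summed) and uses mean pooling as a stand-in for a single-vector encoder; that pooling choice is an illustrative assumption, since real dense retrievers learn their pooled vector. All function names are hypothetical and the toy vectors are not taken from the talk.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mean_pool(tokens: Sequence[Vector]) -> List[float]:
    """Collapse token vectors into one vector (illustrative stand-in
    for a single-vector encoder; an assumption, not a real model)."""
    n = len(tokens)
    return [sum(col) / n for col in zip(*tokens)]


def single_vector_score(query_tokens: Sequence[Vector],
                        doc_tokens: Sequence[Vector]) -> float:
    """Score after compressing each side to a single vector."""
    return dot(mean_pool(query_tokens), mean_pool(doc_tokens))


def maxsim_score(query_tokens: Sequence[Vector],
                 doc_tokens: Sequence[Vector]) -> float:
    """Late interaction: sum over query tokens of the best
    similarity against any document token."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


# Toy example: two query tokens, three document tokens.
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

late = maxsim_score(query, doc)
pooled = single_vector_score(query, doc)
print(late, pooled)
```

In this toy case every query token has an exact match in the document, which MaxSim rewards token by token, while pooling averages those matches away before any comparison happens. The cost of late interaction is storing one vector per document token rather than one per document.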