Tag
Single-vector embedding models can be used to extract sparse latent terms, and BM25 can turn this vocabulary into a strong retriever.
The paper proposes Latent Terms, a method using Sparse Autoencoders to extract BM25-ready sparse features from frozen dense retrievers, achieving competitive performance without retrieval-specific training.