Tag
This paper identifies a vocabulary gap as the root cause why advanced encoders like ModernBERT underperform in learned sparse retrieval, and proposes Vocabulary Transfer (VT), a model-agnostic framework that migrates encoders to sparse-friendly vocabularies, achieving state-of-the-art on the BEIR benchmark.