@antoine_chaffin: Reason-ModernColBERT nearly solved BrowseComp-Plus, smashing SOTA and outperforming models 54× bigger Not bad fo…
Summary
Reason-ModernColBERT nearly solves BrowseComp-Plus, surpassing the previous SOTA and outperforming models 54× larger. Agent-ModernColBERT then adds roughly another 10% on top after about 5 minutes of training.
Cached at: 05/12/26, 02:52 PM
Reason-ModernColBERT nearly solved BrowseComp-Plus, smashing SOTA and outperforming models 54× bigger. Not bad for a 1-year-old model not optimized for deep research. What if we actually tried? Introducing Agent-ModernColBERT: adding another 10% on top with 5 minutes of training. https://t.co/yLeItKXwba
Similar Articles
@LightOnIO: Reason-ModernColBERT topped BrowseComp-Plus with just 149M parameters. Now, Agent-ModernColBERT adds ~10% on top. Reach…
LightOn released Agent-ModernColBERT, a 149M parameter open-source retrieval model that achieves performance comparable to GPT-5 combined with Qwen3-Embed-8B by integrating agent reasoning traces into queries.
@AmelieTabatta: ColBERT models continue to embarrass models 54× their size, this is why we trust late interaction @LightOnIO . A 1-ye…
The article highlights how ColBERT models, despite being smaller and older, outperform larger models such as Qwen3-Embed-8B thanks to late interaction and minimal fine-tuning.
@antoine_chaffin: It’s only BEIR but there are almost 10 points gap between v2 and LateOn We also have good evidence that the model gener…
LateOn, a new-generation ColBERT model, improves on v2 by nearly 10 points on BEIR and generalizes well beyond BEIR, while keeping the same usage in PyLate.
@bo_wangbo: We casually trained a lot of SOTA search models internally, shall we make some small release from time to time
Hints at an upcoming low-key release of a powerful open-source multilingual ColBERT search model.
@bo_wangbo: okay maybe it's a good time? We have a small colbert model trained at pplx, it is a continue-training of pplx-embed-0.6…
Perplexity AI releases pplx-embed-v1-late-0.6b, a small ColBERT late-interaction embedding model for retrieval, fine-tuned from their existing embedding model and optimized for MaxSim scoring, now open-source on HuggingFace.
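Several of the posts above mention late interaction and MaxSim scoring without spelling out what that means. As a rough illustration only, here is a minimal pure-Python sketch of MaxSim as it is commonly described for ColBERT-style models: each query token embedding is matched to its most similar document token embedding, and those maxima are summed. The function names (`maxsim_score`, `rank_documents`) and the toy 2-dimensional vectors are hypothetical; this is not code from PyLate or from any of the models named here, and it assumes all token vectors are non-zero.

```python
import math


def _normalize(vec):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def maxsim_score(query_tokens, doc_tokens):
    """Late-interaction score for one query and one document.

    query_tokens and doc_tokens are lists of per-token embedding vectors.
    For every query token, take the highest cosine similarity against any
    document token, then sum those maxima over the query tokens.
    """
    q = [_normalize(v) for v in query_tokens]
    d = [_normalize(v) for v in doc_tokens]
    total = 0.0
    for qv in q:
        total += max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
    return total


def rank_documents(query_tokens, docs):
    """Return document names sorted by descending MaxSim score.

    docs maps a (hypothetical) document name to its list of token vectors.
    """
    scores = {name: maxsim_score(query_tokens, toks) for name, toks in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy example: two query tokens, two candidate documents.
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "doc_a": [[2.0, 0.0], [0.0, 3.0]],  # covers both query tokens
    "doc_b": [[1.0, 0.0]],              # covers only the first query token
}
ranking = rank_documents(query, docs)
```

Because the score is built from per-token matches instead of a single pooled vector, a document that covers more of the query's tokens scores higher, which is the intuition usually given for why small late-interaction models can compete with much larger single-vector embedders.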