encoder-pretraining

A Causal Language Modeling Detour Improves Encoder Continued Pretraining

Hugging Face Daily Papers · 2026-05-12

This paper shows that temporarily switching from Masked Language Modeling (MLM) to Causal Language Modeling (CLM) during continued pretraining of an encoder improves downstream performance on biomedical text. The authors release ModernBERT-bio and ModernCamemBERT-bio, which they present as state-of-the-art biomedical encoders.
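For readers unfamiliar with the two objectives, the sketch below illustrates how training targets differ between masked and causal language modeling. It is a generic illustration, not the authors' training code: the helper names `mlm_example` and `clm_example`, the `[MASK]` string, and the 0.3 masking probability are all assumptions chosen for the example, and the masking is simplified (real MLM recipes often also substitute random or unchanged tokens).

```python
import random

MASK = "[MASK]"   # hypothetical mask token string
IGNORE = None     # marks positions that contribute no loss


def mlm_example(tokens, mask_prob=0.3, seed=0):
    """Masked LM: hide a random subset of tokens and predict only those."""
    rng = random.Random(seed)
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)    # the model sees the mask token
            labels.append(tok)     # and must recover the original token
        else:
            inputs.append(tok)     # visible token, no loss here
            labels.append(IGNORE)
    return inputs, labels


def clm_example(tokens):
    """Causal LM: every position predicts the next token from left context."""
    return tokens[:-1], tokens[1:]


if __name__ == "__main__":
    sentence = ["the", "patient", "received", "aspirin", "daily"]
    print(mlm_example(sentence))
    print(clm_example(sentence))
```

Under MLM only the masked positions carry a training signal, whereas under CLM every position except the last does. That difference in signal density is one commonly cited contrast between the two objectives; the summary above does not say which mechanism the paper credits for the improvement.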
