mid-training

#mid-training

@NielsRogge: What is mid-training? The stage between pre-training and post-training. A base model is continued on a smaller, curated …

X AI KOLs Timeline · 2026-06-02

Explains mid-training as the stage between pre-training and post-training, in which a base model continues training on curated data to strengthen specific capabilities before instruction tuning.

MIRA: Mid-training Rubric Anchoring for Source-Aware Data Selection

Hugging Face Daily Papers · 2026-05-29

MIRA is a data selection framework for the mid-training stage of LLM development. It adaptively constructs quality rubrics for each data source: a teacher model proposes the rubric dimensions, which are then distilled into lightweight scorers. According to the summary, it outperforms full-corpus training while using only half the tokens.

Mid-Training with Self-Generated Data Improves Reinforcement Learning in Language Models

arXiv cs.AI · 2026-05-12

This paper investigates how diverse self-generated data used during mid-training improves the effectiveness of reinforcement learning in large language models, particularly on reasoning tasks.

Mid-Training with Self-Generated Data Improves Reinforcement Learning in Language Models

Hugging Face Daily Papers · 2026-05-08

This paper proposes mid-training language models on diverse self-generated reasoning traces before reinforcement learning. Exposing models to multiple valid solution approaches improves RL performance on math benchmarks.
