mid-training


Mid-Training with Self-Generated Data Improves Reinforcement Learning in Language Models

arXiv cs.AI · 2026-05-12

This paper investigates how mid-training on diverse self-generated data improves the effectiveness of reinforcement learning in large language models, particularly on reasoning tasks.
