Tag
LEAD adapts reasoning efficiency during training through online calibration of the correctness-efficiency trade-off and adaptive, problem-specific length targets, improving mathematical reasoning accuracy while reducing output length.
Researchers introduce x1, a family of reasoning models that adaptively select the optimal reasoning language for each instance, demonstrating that language choice affects reasoning quality on multilingual and cultural tasks.