UniSD: Towards a Unified Self-Distillation Framework for Large Language Models
Summary
This paper introduces UniSD, a unified self-distillation framework for adapting large language models that integrates mechanisms for supervision reliability, representation alignment, and training stability. Experimental results show that UniSD improves performance over base models and existing baselines across multiple benchmarks.
Source: https://huggingface.co/papers/2605.06597
Abstract
Self-distillation (SD) offers a promising path for adapting large language models (LLMs) without relying on stronger external teachers. However, SD in autoregressive LLMs remains challenging because self-generated trajectories are free-form, correctness is task-dependent, and plausible rationales can still provide unstable or unreliable supervision. Existing methods mainly examine isolated design choices, leaving their effectiveness, roles, and interactions unclear. In this paper, we propose UniSD, a unified framework to systematically study self-distillation. UniSD integrates complementary mechanisms that address supervision reliability, representation alignment, and training stability, including multi-teacher agreement, EMA teacher stabilization, token-level contrastive learning, feature matching, and divergence clipping. Across six benchmarks and six models from three model families, UniSD reveals when self-distillation improves over static imitation, which components drive the gains, and how these components interact across tasks. Guided by these insights, we construct UniSD-full, an integrated pipeline that combines complementary components and achieves the strongest overall performance, improving over the base model by +5.4 points and over the strongest baseline by +2.8 points. Extensive evaluation highlights self-distillation as a practical and steerable approach for efficient LLM adaptation without stronger external teachers.
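To make two of the named mechanisms concrete, below is a minimal PyTorch sketch of an EMA-stabilized teacher update and a divergence-clipped distillation loss. This is an illustrative reading of the abstract, not the paper's implementation: the function names, the decay rate, and the clipping threshold are all hypothetical placeholders.

```python
# Illustrative sketch of two UniSD-style mechanisms (assumed, not official):
# an EMA teacher and per-token divergence clipping.
import torch
import torch.nn.functional as F

@torch.no_grad()
def update_ema_teacher(teacher, student, decay=0.999):
    # EMA teacher stabilization: the teacher tracks a slow exponential
    # moving average of the student's weights rather than a frozen
    # checkpoint, smoothing the supervision signal over training.
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(decay).add_(s_p, alpha=1.0 - decay)

def clipped_kd_loss(student_logits, teacher_logits, clip=5.0):
    # Divergence clipping: compute per-token KL(teacher || student)
    # and cap it, so a few unreliable self-generated tokens cannot
    # dominate the gradient.
    log_p_s = F.log_softmax(student_logits, dim=-1)   # (B, T, V)
    log_p_t = F.log_softmax(teacher_logits, dim=-1)   # (B, T, V)
    kl = F.kl_div(log_p_s, log_p_t, log_target=True, reduction="none")
    per_token_kl = kl.sum(dim=-1)                     # (B, T)
    return per_token_kl.clamp(max=clip).mean()

# Example usage with tiny random logits:
s = torch.randn(2, 8, 50)
t = torch.randn(2, 8, 50)
loss = clipped_kd_loss(s, t)
```

In this sketch, clamping the per-token KL bounds how much any single unreliable token can contribute to the loss, which is one plausible way a divergence-clipping mechanism could serve the training-stability goal the abstract describes.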
Get this paper in your agent:
hf papers read 2605.06597
Don't have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash
Similar Articles
ReAD: Reinforcement-Guided Capability Distillation for Large Language Models
This paper introduces ReAD, a reinforcement-guided capability distillation framework that optimizes token budgets by accounting for cross-capability transfer in large language models. It demonstrates improved downstream utility and reduced harmful spillover compared to existing baselines.
Self-Distillation Zero: Self-Revision Turns Binary Rewards into Dense Supervision
Self-Distillation Zero (SD-Zero) is a training method that converts sparse binary rewards into dense token-level supervision through dual-role training, where a model acts as both generator and reviser. It achieves 10%+ improvements on math and code reasoning benchmarks with higher sample efficiency than RL approaches.
The Many Faces of On-Policy Distillation: Pitfalls, Mechanisms, and Fixes
This paper presents a comprehensive empirical study on on-policy distillation for large language models, identifying failure mechanisms like distribution mismatch and optimization instability, and proposing fixes such as stop-gradient objectives and RLVR-adapted teachers.
A Systematic Study of Training-Free Methods for Trustworthy Large Language Models
A systematic study evaluating training-free methods for improving trustworthiness in large language models, categorizing approaches into input, internal, and output-level interventions while analyzing trade-offs between trustworthiness, utility, and robustness.
Mid-Training with Self-Generated Data Improves Reinforcement Learning in Language Models
This paper investigates how using diverse self-generated data during mid-training improves the effectiveness of Reinforcement Learning in Large Language Models, particularly for reasoning tasks.