Introducing AwesomeOPD, a curated list of open-source code and papers on On-Policy Distillation (OPD) and Self-Distillation in the training of LLMs, VLMs, and agents. Each resource is categorized and tagged by teacher source, supervision signal, rollout usage, and training stage.
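To make the four tagging dimensions concrete, here is a hypothetical example of how a single entry might be encoded; the field names and values are illustrative only and do not reflect the list's actual schema.

```python
# Hypothetical tag record for a single list entry. The field names and
# example values below are illustrative, not the list's actual schema.
entry = {
    "title": "Example OPD paper",
    "teacher_source": "self",           # e.g. "self", "larger_model", "ensemble"
    "supervision_signal": "token_kl",   # e.g. "token_kl", "reward", "features"
    "rollout_usage": "on_policy",       # e.g. "on_policy", "off_policy", "mixed"
    "training_stage": "sft",            # e.g. "pretrain", "sft", "rl"
}
```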
This paper introduces D-OPSD, a novel training paradigm for step-distilled diffusion models that enables on-policy self-distillation during supervised fine-tuning. It allows models to learn new concepts or styles without compromising their efficient few-step inference capabilities.
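The paper's exact formulation isn't reproduced here, so the following is a minimal PyTorch sketch of the general idea: during SFT, the few-step student also distills from a frozen pre-fine-tuning copy of itself on its own rollouts. The noise schedule, sampler, model signature `model(x, t, cond)`, and the loss weight `beta` are all simplifying assumptions.

```python
import torch
import torch.nn.functional as F

def add_noise(x0, noise, t, num_steps):
    # Toy linear schedule (assumption): interpolate clean data toward noise.
    alpha = 1.0 - t.float().div(num_steps).view(-1, 1, 1, 1)
    return alpha.sqrt() * x0 + (1 - alpha).sqrt() * noise

@torch.no_grad()
def sample_few_step(model, cond, shape, num_steps, device):
    # Toy few-step sampler (assumption): iteratively remove predicted noise.
    x = torch.randn(shape, device=device)
    for step in reversed(range(num_steps)):
        t = torch.full((shape[0],), step, device=device)
        x = x - model(x, t, cond) / num_steps
    return x

def opsd_step(student, teacher, optimizer, x0, cond, num_steps=4, beta=0.5):
    device = x0.device
    # SFT term: standard denoising loss on the new-concept data.
    t = torch.randint(0, num_steps, (x0.size(0),), device=device)
    noise = torch.randn_like(x0)
    sft_loss = F.mse_loss(student(add_noise(x0, noise, t, num_steps), t, cond), noise)

    # On-policy self-distillation term: the student generates its own
    # few-step samples; the frozen teacher supervises predictions on them.
    rollout = sample_few_step(student, cond, x0.shape, num_steps, device)
    t_r = torch.randint(0, num_steps, (x0.size(0),), device=device)
    xr = add_noise(rollout, torch.randn_like(rollout), t_r, num_steps)
    with torch.no_grad():
        teacher_pred = teacher(xr, t_r, cond)
    distill_loss = F.mse_loss(student(xr, t_r, cond), teacher_pred)

    loss = sft_loss + beta * distill_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# The teacher is a frozen snapshot of the student taken before fine-tuning:
# teacher = copy.deepcopy(student).eval().requires_grad_(False)
```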
This paper introduces Self-Distillation Fine-Tuning (SDFT) as a recovery mechanism for LLMs whose performance has degraded due to catastrophic forgetting, quantization, or pruning. The authors give a theoretical justification via Centered Kernel Alignment (CKA), showing that self-distillation realigns the student model's representation space with the teacher's and thereby recovers the lost capabilities.
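For reference, linear CKA has a compact closed form; the sketch below pairs it with a standard temperature-scaled KL self-distillation loss. This mirrors common practice rather than the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def linear_cka(X, Y):
    # Linear Centered Kernel Alignment between two activation matrices
    # X, Y of shape (n_samples, dim). Returns a scalar in [0, 1]; higher
    # means the student and teacher representations are better aligned.
    X = X - X.mean(dim=0, keepdim=True)   # center each feature dimension
    Y = Y - Y.mean(dim=0, keepdim=True)
    hsic = (Y.T @ X).norm() ** 2          # ||Y^T X||_F^2
    return hsic / ((X.T @ X).norm() * (Y.T @ Y).norm())

def self_distillation_loss(student_logits, teacher_logits, tau=2.0):
    # Temperature-scaled KL between teacher and student distributions,
    # the usual distillation objective (the paper's loss may differ).
    s = F.log_softmax(student_logits / tau, dim=-1)
    t = F.softmax(teacher_logits / tau, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * tau ** 2
```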
This paper investigates how supervised fine-tuning (SFT) increases hallucinations in LLMs by degrading pre-trained knowledge, and proposes a self-distillation-based method that mitigates the issue while preserving pre-existing factual knowledge. The authors identify semantic interference among overlapping representations as the primary mechanism behind SFT-induced hallucinations and evaluate mitigations including parameter freezing and self-distillation.
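As a rough sketch of the two mitigations (not the paper's exact procedure), one can freeze lower layers and regularize SFT with a KL term toward a frozen pre-SFT copy of the model. The GPT-2-style layer layout, the `freeze_lower_layers` helper, the weight `lam`, and the Hugging Face-style `model(**batch)` interface are all assumptions.

```python
import torch
import torch.nn.functional as F

def freeze_lower_layers(model, num_frozen):
    # Hypothetical helper: freeze the embeddings and the first `num_frozen`
    # blocks (assumes a GPT-2-style `transformer.wte` / `transformer.h` layout).
    for p in model.transformer.wte.parameters():
        p.requires_grad = False
    for block in model.transformer.h[:num_frozen]:
        for p in block.parameters():
            p.requires_grad = False

def sft_with_self_distillation(model, ref_model, batch, lam=0.1):
    # `batch` includes labels, so `out.loss` is the usual SFT cross-entropy
    # (Hugging Face-style causal LM interface assumed).
    out = model(**batch)
    with torch.no_grad():
        ref_logits = ref_model(**batch).logits
    # Self-distillation: keep the fine-tuned distribution close to the
    # pre-SFT model's to protect pre-existing factual knowledge.
    kl = F.kl_div(F.log_softmax(out.logits, dim=-1),
                  F.softmax(ref_logits, dim=-1),
                  reduction="batchmean")
    return out.loss + lam * kl

# ref_model is a frozen copy of the model taken before SFT:
# ref_model = copy.deepcopy(model).eval().requires_grad_(False)
```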
MARCO introduces a compact, fast model for semantic correspondence that achieves state-of-the-art accuracy and generalization to unseen keypoints using a coarse-to-fine objective and self-distillation framework with DINOv2.
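MARCO's actual architecture and self-distillation training are not reproduced here; the sketch below only illustrates the coarse-to-fine matching idea on top of frozen DINOv2 patch features. The pooling factor, the window search, and the single-keypoint interface are assumptions.

```python
import torch
import torch.nn.functional as F

# Frozen DINOv2 backbone from torch.hub (ViT-S/14; patch size 14).
dinov2 = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14").eval()

@torch.no_grad()
def patch_features(img):
    # img: (1, 3, H, W) with H and W divisible by 14. Returns an (h, w, d)
    # grid of L2-normalized patch features.
    feats = dinov2.forward_features(img)["x_norm_patchtokens"]
    h, w = img.shape[-2] // 14, img.shape[-1] // 14
    return F.normalize(feats, dim=-1).view(h, w, -1)

@torch.no_grad()
def match_keypoint(src_feat, tgt_feats, pool=4):
    # Coarse stage: match the query feature against spatially pooled
    # target features (assumes h and w are divisible by `pool`).
    h, w, d = tgt_feats.shape
    coarse = F.avg_pool2d(tgt_feats.permute(2, 0, 1).unsqueeze(0), pool)
    coarse = F.normalize(coarse.squeeze(0).permute(1, 2, 0), dim=-1)
    cy, cx = divmod(torch.einsum("d,hwd->hw", src_feat, coarse)
                    .flatten().argmax().item(), coarse.shape[1])
    # Fine stage: re-search full-resolution patches inside the coarse cell.
    window = tgt_feats[cy * pool:(cy + 1) * pool, cx * pool:(cx + 1) * pool]
    fy, fx = divmod(torch.einsum("d,hwd->hw", src_feat, window)
                    .flatten().argmax().item(), window.shape[1])
    return cy * pool + fy, cx * pool + fx   # target patch coordinates
```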
Self-Distillation Zero (SD-Zero) is a training method that converts sparse binary rewards into dense token-level supervision through dual-role training, where a model acts as both generator and reviser. It achieves 10%+ improvements on math and code reasoning benchmarks with higher sample efficiency than RL approaches.
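The exact prompts, verifier, and loss are not spelled out above, so the sketch below is only a schematic of the dual-role loop using a Hugging Face-style causal LM: generate a draft, revise it if the binary verifier rejects it, then train with dense token-level cross-entropy on the resulting text. The prompt templates and the `verify` callback are assumptions.

```python
def sd_zero_step(model, tok, optimizer, problem, verify, max_new=256):
    device = next(model.parameters()).device
    # Role 1: generator produces an on-policy draft solution.
    prompt = tok(f"Solve: {problem}\n", return_tensors="pt").to(device)
    draft_ids = model.generate(**prompt, max_new_tokens=max_new, do_sample=True)
    draft = tok.decode(draft_ids[0, prompt.input_ids.shape[1]:],
                       skip_special_tokens=True)

    if verify(draft):      # binary reward says the draft is already correct
        target = draft
    else:
        # Role 2: the same model acts as reviser on its failed rollout.
        # (A real pipeline would verify the revision too before training on it.)
        rev = tok(f"Solve: {problem}\nDraft: {draft}\nRevised solution:\n",
                  return_tensors="pt").to(device)
        rev_ids = model.generate(**rev, max_new_tokens=max_new, do_sample=True)
        target = tok.decode(rev_ids[0, rev.input_ids.shape[1]:],
                            skip_special_tokens=True)

    # Dense supervision: token-level cross-entropy on the accepted/revised
    # solution, with the prompt tokens masked out of the loss
    # (this sketch ignores tokenizer boundary effects at the join).
    full = tok(f"Solve: {problem}\n{target}", return_tensors="pt").to(device)
    labels = full.input_ids.clone()
    labels[:, :prompt.input_ids.shape[1]] = -100
    loss = model(**full, labels=labels).loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```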