teacher-student

#teacher-student

@TheTuringPost: https://x.com/TheTuringPost/status/2068474648925216861

X AI KOLs Timeline · 4d ago

An educational overview of knowledge distillation covering its history, core concepts such as softmax and temperature, distillation types, scaling laws, and practical examples including DeepSeek-R1.

SG-OPD: Sign-Gated On-Policy Distillation via Sign-Consistency Gating and Phased Teacher Sampling

Hugging Face Daily Papers · 2026-06-08

Sign-Gated On-Policy Distillation (SG-OPD) enhances standard on-policy distillation by using a binary verifier as a trust signal for teacher supervision, improving performance on competition-level math reasoning benchmarks.

Beyond Scalar Rewards by Internalizing Reasoning into Score Distributions

Hugging Face Daily Papers · 2026-06-08

Z-Reward is a teacher-student framework that decouples complex reasoning from efficient reward deployment for text-to-image training. It achieves 89.6% human preference accuracy with a 27B teacher and 88.6% with a 9B student, outperforming prior methods.

Prompt-Level Distillation: A Non-Parametric Alternative to Model Fine-Tuning for Efficient Reasoning

Hugging Face Daily Papers · 2026-06-02

Prompt-Level Distillation (PLD) extracts reasoning patterns from teacher models into structured instructions for student model system prompts, improving performance on reasoning tasks without fine-tuning overhead.

Distribution Corrected Offline Data Distillation for Large Language Models

arXiv cs.CL · 2026-05-15

This paper proposes a principled offline reasoning distillation framework that corrects teacher-student distribution drift, improving reasoning accuracy on math benchmarks without requiring online rollouts.

How to Fine-Tune a Reasoning Model? A Teacher-Student Cooperation Framework to Synthesize Student-Consistent SFT Data

Hugging Face Daily Papers · 2026-03-23

This paper introduces TESSY, a teacher-student cooperative framework for fine-tuning reasoning models. It generates on-policy SFT data by decoupling generation into capability tokens (from the teacher) and style tokens (from the student), which addresses the catastrophic forgetting that can arise when training on off-policy teacher data.

Teacher–student curriculum learning

OpenAI Blog · 2017-07-01

OpenAI proposes Teacher–Student Curriculum Learning (TSCL), a framework where a Teacher algorithm automatically selects subtasks for a Student to learn complex tasks, optimizing based on learning curve slope and preventing forgetting. The approach matches or surpasses hand-crafted curricula on decimal addition and Minecraft navigation tasks, enabling solutions previously impossible with direct training.
