ar-to-dlm

Data-Efficient Autoregressive-to-Diffusion Language Models via On-Policy Distillation

arXiv cs.CL · 4d ago

The paper introduces OPDLM, a method that converts autoregressive language models into diffusion language models via on-policy distillation. It reports needing 15x to 7000x fewer training tokens while retaining the original model's knowledge.
