Tag
This paper proposes DRIFT, a framework that combines offline trajectories with importance-weighted supervised fine-tuning to efficiently achieve multi-turn interactive learning performance comparable to reinforcement learning.