offline-trajectories


DRIFT: Decoupled Rollouts and Importance-Weighted Fine-Tuning for Efficient Multi-Turn Optimization

Hugging Face Daily Papers · 6d ago

This paper proposes DRIFT, a framework that combines offline trajectories with importance-weighted supervised fine-tuning to efficiently reach performance on multi-turn interactive tasks comparable to that of reinforcement learning.
