Tag
This paper proposes DRIFT, a framework that combines offline trajectories with importance-weighted supervised fine-tuning to efficiently match the performance of reinforcement learning in multi-turn interactive learning.
TILT introduces a novel objective for unsupervised domain adaptation under covariate shift that penalizes an auxiliary component on unlabeled target data, implicitly achieving self-localized importance weighting with bounded estimands. The method comes with theoretical guarantees, and experiments on shifted CIFAR-100 show improved target performance over baselines.