#interactive-rl

From Static Context to Calibrated Interactive RL: Mitigating Distribution Shift in Multi-turn Dialogue with Aligned Simulator

arXiv cs.AI · 2026-05-27

This paper theoretically identifies context distribution shift in multi-turn dialogue RL and proposes Calibrated Interactive RL to mitigate it. The method couples interactive RL with simulator alignment to reduce the sim-to-real gap, and the authors report state-of-the-art performance.
