This paper identifies 'state inertia' in full-duplex spoken language models, where the model's internal predictive focus lags during user interruptions, and proposes a training-free activation steering method to improve interruption handling.
This paper introduces InterRS, a method for real-time speech generation that interleaves reasoning steps into the natural pauses in speech, improving performance on math and logic benchmarks while keeping responses fluent and immediate.