Tag
This paper introduces Audio-Interaction, a unified streaming audio model that combines offline task execution with real-time audio instruction following in a single end-to-end framework. It proposes SoundFlow for the perceive-decide-respond loop and reports competitive performance across benchmarks.