Tag
This paper proposes LEAF, a retrospective tree-based reinforcement learning method for post-training speech-aware large language models that improves credit assignment without online branching. LEAF outperforms GRPO on speech question answering and speech translation benchmarks.