PRISM is a multi-agent framework that decouples speech perception, response generation, and speech synthesis to improve empathetic spoken dialogue by integrating prosodic cues with LLM reasoning and external knowledge tools.
This paper presents a systematic study of integrating experiential knowledge into LLM tool calling and proposes the KATE framework, which combines knowledge-augmented data, width-expanded inference, and knowledge-aware training, and achieves consistent improvements on the BFCL-V3 and AppWorld benchmarks.