Tag
This paper investigates whether the factual recall mechanisms learned in text-based language models transfer to the speech modality in multimodal speech-language models. Applying causal mediation analysis to SpiritLM, it finds that these mechanisms carry over only partially, highlighting differences between how text and speech are processed.
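
For readers unfamiliar with causal mediation analysis, the following is a minimal sketch of the general idea, not the paper's actual setup. It assumes a toy two-layer NumPy network rather than SpiritLM, and the names `forward`, `indirect_effects`, `x_clean`, and `x_corrupt` are hypothetical. The model is run on a clean input and a corrupted input; then each hidden unit from the clean run is copied into the corrupted run, and the change in the target output's probability is taken as that unit's indirect effect.

```python
import numpy as np

def forward(x, W1, W2, patch=None):
    """Toy two-layer network (an assumption, standing in for a real model).

    `patch` is an optional (index, value) pair that overwrites one hidden unit.
    Returns the hidden state and the softmax output distribution.
    """
    h = np.tanh(W1 @ x)
    if patch is not None:
        idx, value = patch
        h = h.copy()
        h[idx] = value
    logits = W2 @ h
    exp = np.exp(logits - logits.max())  # subtract max for numerical stability
    return h, exp / exp.sum()

def indirect_effects(x_clean, x_corrupt, W1, W2, target):
    """Patch each clean hidden unit into the corrupted run, one at a time.

    Returns the per-unit change in P(target) relative to the corrupted run,
    and the total effect (clean minus corrupted probability of the target).
    """
    h_clean, p_clean = forward(x_clean, W1, W2)
    _, p_corrupt = forward(x_corrupt, W1, W2)
    effects = np.zeros(h_clean.shape[0])
    for i in range(h_clean.shape[0]):
        _, p_patched = forward(x_corrupt, W1, W2, patch=(i, h_clean[i]))
        effects[i] = p_patched[target] - p_corrupt[target]
    return effects, p_clean[target] - p_corrupt[target]

# Hypothetical random weights and inputs, for illustration only.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(5, 4))
x_clean = rng.normal(size=3)
x_corrupt = x_clean + rng.normal(scale=1.0, size=3)

effects, total = indirect_effects(x_clean, x_corrupt, W1, W2, target=2)
print("indirect effect per hidden unit:", effects)
print("total effect:", total)
```

In a study of the kind summarized above, the patched states would be the model's internal activations at particular layers and token positions, and comparing the resulting effect patterns for text and speech inputs is one way to ask whether the same components mediate recall in both modalities.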