Tag
This paper proposes a practical evaluation method for long-form simultaneous speech-to-speech translation. It combines automatic speech recognition (ASR), forced alignment, and sentence-embedding alignment to compute latency and quality metrics on continuous speech, overcoming limitations of prior approaches.
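To make the alignment step concrete, here is a minimal Python sketch of one way sentence-level alignment and latency could be computed. It is an illustration under stated assumptions, not the paper's implementation. The ASR transcript of the system output is assumed to be already split into sentences with end times (as forced alignment would provide). Each reference sentence is assumed to carry the end time of its source sentence. A toy bag-of-words vector stands in for a real sentence-embedding model. The names `embed`, `cosine`, `monotonic_align`, and `sentence_latencies` are hypothetical.

```python
import math
from collections import Counter


def embed(sentence):
    # Toy stand-in for a sentence-embedding model (assumption): word counts.
    return Counter(sentence.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def monotonic_align(refs, hyps):
    # Order-preserving one-to-one alignment that maximizes total similarity,
    # found by dynamic programming. Sentences may be left unaligned.
    n, m = len(refs), len(hyps)
    sim = [[cosine(embed(r), embed(h)) for h in hyps] for r in refs]
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j],                          # skip reference i
                best[i][j - 1],                          # skip hypothesis j
                best[i - 1][j - 1] + sim[i - 1][j - 1],  # align the pair
            )
    # Trace back through the table to recover the aligned index pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j]:
            i -= 1
        elif best[i][j] == best[i][j - 1]:
            j -= 1
        else:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
    return pairs[::-1]


def sentence_latencies(refs, hyps):
    # refs: (reference text, end time of the source sentence in seconds)
    # hyps: (ASR text of output speech, end time from forced alignment)
    pairs = monotonic_align([t for t, _ in refs], [t for t, _ in hyps])
    return [hyps[j][1] - refs[i][1] for i, j in pairs]


# Made-up example data, for illustration only.
refs = [("the cat sat on the mat", 2.0), ("it was raining outside", 4.5)]
hyps = [("a cat sat on the mat", 3.0), ("outside it was raining", 6.0)]
latencies = sentence_latencies(refs, hyps)
mean_latency = sum(latencies) / len(latencies)
```

Once sentences are paired this way, a quality metric could be computed on the same pairs. A real pipeline would replace `embed` with a trained sentence-embedding model, possibly a multilingual one, and would take timestamps from an actual forced aligner.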