Tag
Introduces Vividh-ASR, a complexity-tiered benchmark for Hindi and Malayalam ASR, identifies studio-bias in fine-tuning, and proposes R-MFT to improve spontaneous speech performance efficiently.
MUSCAT is a new multilingual, scientific conversation benchmark dataset for evaluating ASR systems on challenging multilingual scenarios including code-switching, domain-specific vocabulary, and mixed language input. The dataset consists of bilingual discussions on scientific papers between speakers using different languages, with results showing current state-of-the-art systems struggle with these multilingual challenges.