A speech company trained a model that cancels noise and identifies the primary speaker, halving the word error rate of leading ASR models in noisy environments.
This paper critiques the use of single-reference ground truth in ASR evaluation, arguing it causes epistemic injustice for speakers with aphasia. It proposes a new metric, Epistemic Injustice Distance, and advocates for WER-Range to account for diverse transcription conventions.
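To make the multi-reference idea concrete, here is a minimal sketch in Python. It assumes, as an illustration only, that WER-Range means the minimum and maximum word error rate of one ASR hypothesis scored against several acceptable reference transcripts; the summary above does not give the paper's exact definition. The function names and the sample transcripts are hypothetical, and Epistemic Injustice Distance is not sketched because its definition is not stated here.

```python
def word_errors(reference, hypothesis):
    """Word-level Levenshtein distance between two lists of words."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Standard WER: word edit distance divided by reference length."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference transcript must not be empty")
    return word_errors(ref_words, hypothesis.split()) / len(ref_words)


def wer_range(references, hypothesis):
    """Assumed reading of WER-Range: (min, max) WER over all references."""
    scores = [wer(ref, hypothesis) for ref in references]
    return min(scores), max(scores)


# Hypothetical example: two transcription conventions for the same audio,
# one keeping the filled pause "um" (verbatim) and one dropping it (clean).
references = ["i want um water", "i want water"]
hypothesis = "i want water"
low, high = wer_range(references, hypothesis)
print(f"WER-Range: {low:.2f} to {high:.2f}")
```

With a single verbatim reference the hypothesis would be charged one deletion out of four words, while the clean reference scores it as perfect; reporting both ends of the range shows how much of the error is due to the transcription convention rather than the recognizer.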