Vik Paruchuri announces research-driven safeguards that reduce OCR hallucinations to near-zero in their benchmark, with word-level bounding boxes and confidence scores for any remaining errors.
A novel method for multilingual word-level forced alignment combines self-supervised representations from MMS and a phoneme boundary detector with a learned dynamic programming decoder, outperforming existing aligners on English and unseen languages without further training.
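For readers unfamiliar with forced alignment, the dynamic programming step can be illustrated with the classic fixed (not learned) Viterbi-style monotonic decoder. This is a minimal sketch under stated assumptions, not the method summarized above: it assumes per-frame log-probabilities over a small token vocabulary are already available, and the names `forced_align`, `log_probs`, and `tokens` are hypothetical.

```python
from typing import List, Tuple


def forced_align(log_probs: List[List[float]],
                 tokens: List[int]) -> List[Tuple[int, int, int]]:
    """Monotonic forced alignment by dynamic programming (illustrative only).

    log_probs[t][c] is an assumed log-probability of class c at frame t.
    tokens is the known transcript as class indices.
    Returns one (token, start_frame, end_frame_exclusive) triple per token.
    """
    T, N = len(log_probs), len(tokens)
    if N == 0 or T < N:
        raise ValueError("need at least one frame per token")

    neg_inf = float("-inf")
    # score[t][j]: best total log-prob of any path that puts frame t on token j.
    score = [[neg_inf] * N for _ in range(T)]
    # back[t][j]: 1 if the best path advanced from token j-1, 0 if it stayed on j.
    back = [[0] * N for _ in range(T)]
    score[0][0] = log_probs[0][tokens[0]]

    for t in range(1, T):
        for j in range(min(t + 1, N)):
            stay = score[t - 1][j]
            move = score[t - 1][j - 1] if j > 0 else neg_inf
            if move > stay:
                best, back[t][j] = move, 1
            else:
                best, back[t][j] = stay, 0
            score[t][j] = best + log_probs[t][tokens[j]]

    # Backtrace from the last frame on the last token to recover the path.
    path = [0] * T
    j = N - 1
    for t in range(T - 1, -1, -1):
        path[t] = j
        if t > 0 and back[t][j] == 1:
            j -= 1

    # Collapse the frame-level path into one contiguous segment per token.
    segments = []
    start = 0
    for t in range(1, T + 1):
        if t == T or path[t] != path[t - 1]:
            segments.append((tokens[path[start]], start, t))
            start = t
    return segments
```

Each frame either stays on the current token or advances to the next one, so the recovered segments are contiguous and ordered; multiplying frame indices by the frame hop would give timestamps. A learned decoder, as described in the summary, would replace these fixed stay/advance rules with trained scoring.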