Tag
This paper extends optimal-transport-based hallucination detection to all decoder layers in neural machine translation (NMT) and abstractive summarization. It finds that detection is concentrated in the early layers and that the geometric signal transfers poorly to summarization, because faithfulness failures there are not detectable via attention concentration.
This paper presents parallel and monolingual corpora for scientific machine translation across Spanish-English, French-English, and Portuguese-English, targeting four domains: Cancer Research, Energy Research, Neuroscience, and Transportation. The corpora are used to fine-tune neural machine translation systems, addressing the specialized vocabulary and syntax of scientific text.