semantic-understanding

When Similar Means Different: Evaluating LLMs on Arabic-Hebrew Cognates

arXiv cs.CL · 2026-06-12

This paper introduces SemCog Bench, a curated benchmark of 1,858 Arabic-Hebrew word pairs with sentence-level annotations, to evaluate LLMs' ability to distinguish true cognates from false friends and loanwords. Results show high accuracy on true cognates but sharp drops on false friends, highlighting a key limitation in cross-lingual semantic reasoning.

Safety-Oriented Evaluation of Language Understanding Systems for Air Traffic Control

arXiv cs.CL · 2026-05-13

This paper proposes a safety-oriented, consequence-aware evaluation framework for large language models in Air Traffic Control. It finds that high aggregate accuracy masks significant reliability problems in the handling of high-risk semantic errors.
