Tag
Researchers present SemanticQA, a benchmark for evaluating language models on semantic phrase processing tasks including idioms, noun compounds, and verbal constructions, revealing significant performance variation across model architectures and scales on semantic reasoning tasks.