multilingual-benchmark

CulturALL: Benchmarking Multilingual and Multicultural Competence of LLMs on Grounded Tasks

arXiv cs.CL · 2026-04-22

CulturALL introduces a 2,610-sample benchmark spanning 14 languages and 51 regions to evaluate LLMs on real-world, culturally grounded tasks. The top-scoring model reaches only 44.48%, leaving substantial room for improvement.

MORPHOGEN: A Multilingual Benchmark for Evaluating Gender-Aware Morphological Generation

arXiv cs.CL · 2026-04-22

Researchers introduce MORPHOGEN, a multilingual benchmark testing LLMs’ ability to rewrite first-person sentences in the opposite gender while preserving meaning across French, Arabic, and Hindi.
