cognitive-evaluation


Almieyar-Oryx-BloomBench: A Bilingual Multimodal Benchmark for Cognitively Informed Evaluation of Vision-Language Models

Hugging Face Daily Papers · 2026-06-04

BloomBench is a cognitively grounded bilingual (English-Arabic) multimodal benchmark for Vision-Language Models, systematically evaluating six cognitive levels based on Bloom's Taxonomy. Experiments reveal significant cognitive asymmetries and cross-lingual performance gaps in current models.



Uneven Evolution of Cognition Across Generations of Generative AI Models

Reddit r/singularity · 2026-05-11

This paper introduces a psychometric framework and the AIQ Benchmark to evaluate the cognitive profiles of generative AI models. It finds that cognition has evolved unevenly across model generations: verbal skills are strong, while perceptual reasoning has stagnated.

