cognitive-evaluation

#cognitive-evaluation

Almieyar-Oryx-BloomBench: A Bilingual Multimodal Benchmark for Cognitively Informed Evaluation of Vision-Language Models

Hugging Face Daily Papers · 2026-06-04

BloomBench is a cognitively grounded bilingual (English-Arabic) multimodal benchmark for Vision-Language Models, systematically evaluating six cognitive levels based on Bloom's Taxonomy. Experiments reveal significant cognitive asymmetries and cross-lingual performance gaps in current models.


Uneven Evolution of Cognition Across Generations of Generative AI Models

Reddit r/singularity · 2026-05-11

This paper introduces a psychometric framework and the AIQ Benchmark to evaluate the cognitive profiles of generative AI models. It finds that cognition has evolved unevenly across model generations: verbal skills are strong, while perceptual reasoning has stagnated.
