Tag
This paper introduces LEVANTE-bench, a benchmark that systematically evaluates vision-language models (VLMs) on six cognitive tasks and compares their performance to that of children aged 5 to 12, finding that current VLMs align only partially with children's cognitive abilities.