animal-knowledge

BAGEL: Benchmarking Animal Knowledge Expertise in Language Models

arXiv cs.CL · 2026-04-20

BAGEL is a new benchmark for evaluating animal-related knowledge in large language models, constructed from diverse scientific sources and covering taxonomy, morphology, habitat, behavior, and species interactions through closed-book question-answer pairs. The benchmark enables fine-grained analysis across taxonomic groups and knowledge categories, providing insights into model strengths and failure modes for biodiversity applications.
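
As a rough illustration of the fine-grained analysis described above, the sketch below groups closed-book results by taxonomic group or by knowledge category and reports accuracy per group. Everything in it is an assumption made for illustration: the record fields (`taxon`, `category`, `answer`, `prediction`), the helper name `accuracy_by`, and the case-insensitive exact-match scoring are hypothetical and are not taken from BAGEL, and the toy records are invented placeholders rather than benchmark data.

```python
from collections import defaultdict


def accuracy_by(records, key):
    """Return {group: accuracy}, grouping records by the field named `key`.

    Hypothetical helper: scoring is case-insensitive exact match after
    stripping whitespace, which is an assumed rule, not BAGEL's metric.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for record in records:
        group = record[key]
        totals[group] += 1
        if record["prediction"].strip().lower() == record["answer"].strip().lower():
            correct[group] += 1
    return {group: correct[group] / totals[group] for group in totals}


# Invented placeholder records (not BAGEL data); answers are dummy labels.
records = [
    {"taxon": "Aves", "category": "habitat", "answer": "A", "prediction": "a"},
    {"taxon": "Aves", "category": "behavior", "answer": "B", "prediction": "C"},
    {"taxon": "Mammalia", "category": "habitat", "answer": "C", "prediction": "C "},
    {"taxon": "Mammalia", "category": "morphology", "answer": "D", "prediction": "D"},
]

by_taxon = accuracy_by(records, "taxon")
by_category = accuracy_by(records, "category")
print(by_taxon)
print(by_category)
```

Grouping the same records along two different keys is what makes it possible to see, for example, whether errors cluster in one taxonomic group or in one knowledge category.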
