engineering-reasoning

Do VLMs Reason Like Engineers? A Benchmark and a Stage-wise Evaluation

arXiv cs.AI · yesterday

This paper introduces EngVQA, a multimodal benchmark for evaluating engineering reasoning in vision-language models, along with an 8-stage automatic evaluation framework that enables fine-grained analysis of reasoning failures. It reveals substantial limitations in current VLMs' engineering reasoning capabilities.
