dataset-repair

#dataset-repair

Identifying and Resolving Pitfalls of Knowledge-Based VQA Benchmarks: Auditing, Repairing, and Augmenting

arXiv cs.CL · 2d ago

This paper audits knowledge-based VQA benchmarks and finds systematic violations of their underlying assumptions, violations that make accuracy a misleading metric. It introduces a repair protocol and a multi-entity augmentation to restore answer derivability and question clarity, and shows that the corrected settings yield markedly different model rankings.
