Tag
WildTableBench introduces the first question-answering benchmark for real-world table images, revealing that existing multimodal foundation models struggle significantly with structural perception and numerical reasoning, with only one model exceeding 50% accuracy.