A 4B-parameter open-source model beats Mythos 5 on the CharXiv chart-understanding benchmark, showing that a small, freely available model can perform strongly on this task.
MIT researchers developed ChartNet, a dataset of over a million charts, to train vision-language models to interpret charts more accurately. Their open-source models outperform much larger commercial models on chart understanding tasks.
HakushoBench is a Japanese chart and table VQA benchmark built from government white papers to evaluate how well vision-language models understand complex visual data. It proves challenging for open-weight models, which reach 58.6% accuracy, 34.9 points behind proprietary models.