eyebench


Opus 4.8 is still very much blind - EyeBench-V3 visual benchmark (similar to IBench)

Reddit r/singularity · 2026-06-01

The EyeBench-V3 visual benchmark, which is similar to IBench, evaluates Claude Opus 4.8 and finds that it still fails basic vision tasks. The benchmark was introduced in a Twitter thread by Adonis Singh.
