ABACUS is a unified vision-language model that handles multiple counting tasks and count-faithful image generation without benchmark-specific training, achieving state-of-the-art results across seven benchmarks.