mllms


SynCred-Bench: Benchmarking Synthetic Credibility in AI-Generated Visual Misinformation

Hugging Face Daily Papers · 2d ago

Introduces SynCred-Bench, a benchmark of 600 AI-generated misinformation images spanning six credible-form categories. Existing detectors (MLLMs, open-source AIGC detectors, and commercial APIs) perform poorly on it, and human annotators also struggle.


ESI-Bench: Towards Embodied Spatial Intelligence that Closes the Perception-Action Loop

Hugging Face Daily Papers · 2026-05-18

Introduces ESI-Bench, a comprehensive benchmark for embodied spatial intelligence built on OmniGibson, covering 10 task categories and 29 subcategories. Experiments show that active exploration substantially outperforms passive approaches and that failures stem mainly from action blindness rather than perception, revealing a metacognitive gap between models and humans.
