multimodal-qa

PhysBrain 1.0 Technical Report

Hugging Face Daily Papers · 2026-05-14

The PhysBrain 1.0 technical report presents a method that uses human egocentric video to generate physical commonsense supervision for vision-language-action models. It reports state-of-the-art results on embodied reasoning and control benchmarks, including ERQA, PhysBench, SimplerEnv-WidowX, LIBERO, and RoboCasa.
