view-planning


@ManlingLi_: Planning with the views: Can VLMs predict how each camera move changes the view, and plan many such moves ahead? We int…

X AI KOLs Following · yesterday

Introduces ViewSuite, a benchmark with 6DoF camera control and ~165K tasks for evaluating whether VLMs can plan camera moves. It finds a planning gap: models can track how individual moves change the view but cannot compose them into multi-step plans. The proposed View Graph Distillation (RL-Graph-SFT) raises success from 2.5% to 47.8%.
