interaction-trajectories


Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining

arXiv cs.CL · 2026-05-15

Proposes Video2GUI, a framework that automatically extracts GUI interaction trajectories from unlabeled instructional videos, and uses it to build the WildGUI dataset of 12M trajectories across 1,500+ apps. Pre-training on this data yields 5-20% improvements on GUI grounding and action benchmarks.
