Tag
Introduces Teach VLM, a model that extracts step-by-step operational knowledge from mobile screen demonstrations, and the Teach-and-Repeat paradigm, which uses this knowledge to guide GUI agents; the approach achieves state-of-the-art performance on a new benchmark.