keyframe-extraction


Teach-and-Repeat: Accurately Extracting Operational Knowledge from Mobile Screen Demonstrations to Empower GUI Agents

arXiv cs.AI · 2026-06-12

Introduces Teach VLM, a model that extracts step-by-step operational knowledge from mobile screen demonstrations, and the Teach-and-Repeat paradigm, which uses this knowledge to guide GUI agents. The paper reports state-of-the-art performance on a new benchmark.
