ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents
Summary
ClawGUI is an open-source framework for training GUI agents with reinforcement learning, evaluating them on standardized benchmarks, and deploying them across Android, iOS, and HarmonyOS.
Cached at: 05/08/26, 09:06 AM
Source: https://huggingface.co/papers/2604.11784
Abstract
ClawGUI presents an open-source framework that addresses key challenges in GUI agent development through unified reinforcement learning, standardized evaluation, and cross-platform deployment capabilities.
GUI agents drive applications through their visual interfaces instead of programmatic APIs, interacting with arbitrary software via taps, swipes, and keystrokes, reaching a long tail of applications that CLI-based agents cannot. Yet progress in this area is bottlenecked less by modeling capacity than by the absence of a coherent full-stack infrastructure: online RL training suffers from environment instability and closed pipelines, evaluation protocols drift silently across works, and trained agents rarely reach real users on real devices. We present ClawGUI, an open-source framework addressing these three gaps within a single harness. ClawGUI-RL provides the first open-source GUI agent RL infrastructure with validated support for both parallel virtual environments and real physical devices, integrating GiGPO with a Process Reward Model for dense step-level supervision. ClawGUI-Eval enforces a fully standardized evaluation pipeline across 6 benchmarks and 11+ models, achieving 95.8% reproduction against official baselines. ClawGUI-Agent brings trained agents to Android, HarmonyOS, and iOS through 12+ chat platforms with hybrid CLI-GUI control and persistent personalized memory. Trained end to end within this pipeline, ClawGUI-2B achieves 17.1% Success Rate on MobileWorld GUI-Only, outperforming the same-scale MAI-UI-2B baseline by 6.0%.
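The abstract pairs a group-based policy optimization method (GiGPO) with a Process Reward Model for dense step-level supervision. The sketch below is not GiGPO and not ClawGUI's implementation; it is a simplified illustration of the general idea, assuming that rollouts of the same task are normalized against each other and that per-step scores then redistribute credit within one rollout. The function names, the `weight` parameter, and all numbers are hypothetical.

```python
from statistics import mean, pstdev


def group_relative_advantages(returns, eps=1e-8):
    """Normalize each rollout's return against the group of rollouts
    collected for the same task (zero mean, roughly unit scale)."""
    mu = mean(returns)
    sigma = pstdev(returns)
    return [(r - mu) / (sigma + eps) for r in returns]


def step_advantages(episode_adv, step_scores, weight=0.5):
    """Blend one episode-level advantage with per-step scores.

    Step scores are centered on their own mean, so they shift credit
    between steps without changing the episode's average advantage.
    """
    centre = mean(step_scores)
    return [episode_adv + weight * (s - centre) for s in step_scores]


# Four hypothetical rollouts of one task: 1.0 = success, 0.0 = failure.
outcomes = [1.0, 0.0, 0.0, 1.0]
episode_advs = group_relative_advantages(outcomes)

# Hypothetical step-level scores for the first (successful) rollout.
prm_scores = [0.2, 0.8, 0.5]
per_step = step_advantages(episode_advs[0], prm_scores)
print(per_step)
```

With a sparse success signal alone, every step of a rollout would receive the same advantage; the step-level term is what lets a higher-scored step be reinforced more than a lower-scored one within the same episode.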
Similar Articles
OpenClaw has outgrown chat, hear me out
The author discusses the limitations of managing AI agent workflows via chat interfaces like Telegram with OpenClaw, advocating for dedicated dashboards and standardized UIs. They highlight emerging tools like Paperclip and Multica that aim to solve agent management issues.
ClawEnvKit: Automatic Environment Generation for Claw-Like Agents
ClawEnvKit is an automated pipeline that generates diverse, verified environments for claw-like agents from natural language descriptions, enabling the construction of Auto-ClawEval, a large-scale benchmark with 1,040 environments at 13,800x lower cost than human curation. The system supports continuous, on-demand evaluation and adaptive training environment generation across multiple model families and agent frameworks.
We turned Cursor.ai into an OpenClaw-style multi-agent control panel
Developers built an open-source web UI on top of the Cursor CLI that turns it into a multi-agent control panel, allowing users to run multiple Cursor agent sessions with separate workspaces, scheduling, and MCP config management from a browser-based cockpit.
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning
UI-TARS-2 is a native GUI-centered agent model that addresses data scalability, multi-turn RL, and environment stability challenges, achieving state-of-the-art results on GUI benchmarks (88.2 on Online-Mind2Web, 47.5 on OSWorld, 50.6 on WindowsAgentArena, 73.3 on AndroidWorld) and outperforming Claude and OpenAI agents.
@garrytan: Clawvisor is going to be one of the most important parts of helping make the agent world especially OpenClaw/Hermes Age…
Garry Tan highlights Clawvisor as a key tool for making AI agent frameworks like OpenClaw/Hermes Agent secure and enterprise-ready, comparing the current AI moment to the Apple I era on the cusp of broader adoption.