ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents

Papers with Code Trending 04/13/26, 12:00 AM Tools

gui-agents open-source reinforcement-learning framework mobile-ai evaluation-benchmarks

Summary

ClawGUI is an open-source framework for training, evaluating, and deploying GUI agents using reinforcement learning, featuring standardized benchmarks and cross-platform deployment to Android, iOS, and HarmonyOS.

GUI agents drive applications through their visual interfaces instead of programmatic APIs, interacting with arbitrary software via taps, swipes, and keystrokes, reaching a long tail of applications that CLI-based agents cannot. Yet progress in this area is bottlenecked less by modeling capacity than by the absence of a coherent full-stack infrastructure: online RL training suffers from environment instability and closed pipelines, evaluation protocols drift silently across works, and trained agents rarely reach real users on real devices. We present ClawGUI, an open-source framework addressing these three gaps within a single harness. ClawGUI-RL provides the first open-source GUI agent RL infrastructure with validated support for both parallel virtual environments and real physical devices, integrating GiGPO with a Process Reward Model for dense step-level supervision. ClawGUI-Eval enforces a fully standardized evaluation pipeline across 6 benchmarks and 11+ models, achieving 95.8% reproduction against official baselines. ClawGUI-Agent brings trained agents to Android, HarmonyOS, and iOS through 12+ chat platforms with hybrid CLI-GUI control and persistent personalized memory. Trained end to end within this pipeline, ClawGUI-2B achieves 17.1% Success Rate on MobileWorld GUI-Only, outperforming the same-scale MAI-UI-2B baseline by 6.0%.
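The two headline metrics in the summary above, Success Rate and reproduction against official baselines, can be made concrete with a short sketch. This is only an illustration under stated assumptions: the `Result` record, the `tolerance` threshold, and the rule that a (model, benchmark) pair counts as reproduced when the scores differ by at most `tolerance` points are all hypothetical and are not taken from ClawGUI-Eval. The numbers in `toy` are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """One (model, benchmark) pair; a hypothetical record, not ClawGUI's schema."""
    model: str
    benchmark: str
    official: float    # score reported by the original authors, in percent
    reproduced: float  # score obtained by re-running the evaluation, in percent


def success_rate(outcomes: list[bool]) -> float:
    """Percentage of episodes that ended with the task completed."""
    if not outcomes:
        raise ValueError("need at least one episode")
    return 100.0 * sum(outcomes) / len(outcomes)


def reproduction_rate(results: list[Result], tolerance: float = 2.0) -> float:
    """Percentage of pairs whose reproduced score is within `tolerance` points.

    Both the tolerance value and the matching rule are assumptions.
    """
    if not results:
        raise ValueError("need at least one result")
    matched = sum(abs(r.reproduced - r.official) <= tolerance for r in results)
    return 100.0 * matched / len(results)


# Toy, made-up numbers for illustration only.
toy = [
    Result("model-a", "bench-1", official=50.0, reproduced=49.2),
    Result("model-a", "bench-2", official=30.0, reproduced=31.5),
    Result("model-b", "bench-1", official=60.0, reproduced=55.0),
    Result("model-b", "bench-2", official=40.0, reproduced=40.0),
]

if __name__ == "__main__":
    print(reproduction_rate(toy))
    print(success_rate([True, False, False, False]))
```

Fixing the matching rule in code is one way to keep an evaluation protocol from drifting silently between works, because every reported figure is then computed by the same function.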


Cached at: 05/08/26, 09:06 AM


Source: https://huggingface.co/papers/2604.11784

Abstract

ClawGUI presents an open-source framework that addresses key challenges in GUI agent development through unified reinforcement learning, standardized evaluation, and cross-platform deployment capabilities.

GUI agents drive applications through their visual interfaces instead of programmatic APIs, interacting with arbitrary software via taps, swipes, and keystrokes, reaching a long tail of applications that CLI-based agents cannot. Yet progress in this area is bottlenecked less by modeling capacity than by the absence of a coherent full-stack infrastructure: online RL training suffers from environment instability and closed pipelines, evaluation protocols drift silently across works, and trained agents rarely reach real users on real devices. We present ClawGUI, an open-source framework addressing these three gaps within a single harness. ClawGUI-RL provides the first open-source GUI agent RL infrastructure with validated support for both parallel virtual environments and real physical devices, integrating GiGPO with a Process Reward Model for dense step-level supervision. ClawGUI-Eval enforces a fully standardized evaluation pipeline across 6 benchmarks and 11+ models, achieving 95.8% reproduction against official baselines. ClawGUI-Agent brings trained agents to Android, HarmonyOS, and iOS through 12+ chat platforms with hybrid CLI-GUI control and persistent personalized memory. Trained end to end within this pipeline, ClawGUI-2B achieves 17.1% Success Rate on MobileWorld GUI-Only, outperforming the same-scale MAI-UI-2B baseline by 6.0%.
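To illustrate what "dense step-level supervision" can mean when a group-based RL method is paired with a Process Reward Model, here is a minimal sketch. It is not the ClawGUI-RL implementation and it is not GiGPO itself: it only combines a group-normalized episode-level advantage with a weighted per-step PRM score, and it does not reproduce GiGPO's step-level grouping. The function names, the `prm_weight` parameter, and the additive mixing rule are all assumptions made for illustration.

```python
import statistics


def group_normalize(values, eps=1e-8):
    """Center and scale scores within one group of rollouts of the same task."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / (std + eps) for v in values]


def step_advantages(outcomes, prm_scores, prm_weight=0.5):
    """Give every step of every rollout its own training signal.

    outcomes:   one final task reward per rollout (e.g. 1.0 success, 0.0 failure)
    prm_scores: one list of per-step scores per rollout (hypothetical PRM output)
    prm_weight: assumed mixing weight between the sparse and dense terms
    """
    if len(outcomes) != len(prm_scores):
        raise ValueError("need one list of step scores per rollout")
    episode_adv = group_normalize(outcomes)
    # Each step inherits its rollout's episode-level advantage and adds
    # a dense term derived from that step's own PRM score.
    return [
        [adv + prm_weight * score for score in scores]
        for adv, scores in zip(episode_adv, prm_scores)
    ]


if __name__ == "__main__":
    outcomes = [1.0, 0.0]              # two rollouts of the same task
    prm_scores = [[0.2, 0.8], [0.1]]   # made-up per-step scores
    for trajectory in step_advantages(outcomes, prm_scores):
        print(trajectory)
```

With only the final outcome, every step in a rollout would receive the same signal; the per-step term is what lets steps within a single trajectory be credited differently.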


Get this paper in your agent:

hf papers read 2604.11784

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash


