Fara-7B: An Efficient Agentic Model for Computer Use


Summary

The paper introduces FaraGen, a synthetic data generation system for computer use agents, and Fara-7B, a small, efficient model that outperforms comparably sized models on web task benchmarks and is competitive with much larger frontier models. The model is released open-weight on Microsoft Foundry and HuggingFace.

Progress in computer use agents (CUAs) has been constrained by the absence of large and high-quality datasets that capture how humans interact with a computer. While LLMs have thrived on abundant textual data, no comparable corpus exists for CUA trajectories. To address these gaps, we introduce FaraGen, a novel synthetic data generation system for multi-step web tasks. FaraGen can propose diverse tasks from frequently used websites, generate multiple solution attempts, and filter successful trajectories using multiple verifiers. It achieves high throughput, yield, and diversity for multi-step web tasks, producing verified trajectories at approximately $1 each. We use this data to train Fara-7B, a native CUA model that perceives the computer using only screenshots, executes actions via predicted coordinates, and is small enough to run on-device. We find that Fara-7B outperforms other CUA models of comparable size on benchmarks like WebVoyager, Online-Mind2Web, and WebTailBench -- our novel benchmark that better captures under-represented web tasks in pre-existing benchmarks. Furthermore, Fara-7B is competitive with much larger frontier models, illustrating key benefits of scalable data generation systems in advancing small efficient agentic models. We are making Fara-7B open-weight on Microsoft Foundry and HuggingFace, and we are releasing WebTailBench.
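The propose, attempt, and verify loop described above can be illustrated with a minimal Python sketch. This is not FaraGen's code: every name here (`Trajectory`, `generate_verified`, `toy_solve`, the two toy verifiers) is hypothetical, the solver is a deterministic stub rather than a real agent, and requiring all verifiers to agree is an assumption, since the abstract says only that multiple verifiers are used.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trajectory:
    task: str
    steps: List[str]
    final_answer: str


# A verifier is any function that accepts or rejects a trajectory.
Verifier = Callable[[Trajectory], bool]


def filter_trajectories(attempts: List[Trajectory],
                        verifiers: List[Verifier]) -> List[Trajectory]:
    """Keep attempts accepted by every verifier (assumption: unanimous vote)."""
    return [t for t in attempts if all(v(t) for v in verifiers)]


def generate_verified(tasks: List[str],
                      solve: Callable[[str, int], Trajectory],
                      verifiers: List[Verifier],
                      attempts_per_task: int = 3) -> List[Trajectory]:
    """Run several solution attempts per task and keep the verified ones."""
    verified: List[Trajectory] = []
    for task in tasks:
        attempts = [solve(task, i) for i in range(attempts_per_task)]
        verified.extend(filter_trajectories(attempts, verifiers))
    return verified


def toy_solve(task: str, attempt: int) -> Trajectory:
    # Hypothetical stand-in for an agent rollout: attempt 0 "fails"
    # by returning an empty answer, later attempts "succeed".
    steps = [f"open page for {task}", "click(120, 340)", "type('query')"]
    answer = "" if attempt == 0 else f"result for {task}"
    return Trajectory(task, steps, answer)


def has_answer(t: Trajectory) -> bool:
    return bool(t.final_answer)


def is_short(t: Trajectory) -> bool:
    return len(t.steps) <= 10


tasks = ["find a flight", "compare two laptops"]
verified = generate_verified(tasks, toy_solve, [has_answer, is_short])
total_attempts = len(tasks) * 3
print(f"{len(verified)} of {total_attempts} attempts verified")
```

Dividing the total generation cost by the number of verified trajectories gives a per-trajectory cost of the kind the abstract reports (approximately $1 each); the sketch leaves cost out because the abstract gives no breakdown.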

Cached at: 06/15/26, 12:56 AM


Source: https://huggingface.co/papers/2511.19663

Published on Nov 24, 2025

Submitted by taesiri (https://huggingface.co/taesiri) on Nov 26, 2025

Abstract

FaraGen creates synthetic datasets for computer use agents, enabling the training of efficient, high-performing models such as Fara-7B, which outperforms comparably sized models on web task benchmarks and is competitive with much larger ones.

Progress in computer use agents (CUAs) has been constrained by the absence of large and high-quality datasets that capture how humans interact with a computer. While LLMs have thrived on abundant textual data, no comparable corpus exists for CUA trajectories. To address these gaps, we introduce FaraGen, a novel synthetic data generation system for multi-step web tasks. FaraGen can propose diverse tasks from frequently used websites, generate multiple solution attempts, and filter successful trajectories using multiple verifiers. It achieves high throughput, yield, and diversity for multi-step web tasks, producing verified trajectories at approximately $1 each. We use this data to train Fara-7B, a native CUA model that perceives the computer using only screenshots, executes actions via predicted coordinates, and is small enough to run on-device. We find that Fara-7B outperforms other CUA models of comparable size on benchmarks like WebVoyager, Online-Mind2Web, and WebTailBench -- our novel benchmark that better captures under-represented web tasks in pre-existing benchmarks. Furthermore, Fara-7B is competitive with much larger frontier models, illustrating key benefits of scalable data generation systems in advancing small efficient agentic models. We are making Fara-7B open-weight on Microsoft Foundry and HuggingFace, and we are releasing WebTailBench.
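The abstract states that Fara-7B executes actions via predicted coordinates on screenshots. As a rough illustration of what a harness around such a model might have to do, the Python sketch below maps a click predicted on a resized screenshot back to the real browser viewport. The `ClickAction` type, the `to_viewport` helper, and the idea that the screenshot is resized before the model sees it are all assumptions made for illustration; the abstract does not describe Fara-7B's actual action format or preprocessing.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ClickAction:
    """Hypothetical action: a click at pixel (x, y)."""
    x: int
    y: int


def to_viewport(action: ClickAction,
                model_size: Tuple[int, int],
                viewport_size: Tuple[int, int]) -> ClickAction:
    """Rescale a click from model-input pixels to viewport pixels.

    model_size and viewport_size are (width, height) pairs. The result is
    clamped so the click always lands inside the viewport.
    """
    model_w, model_h = model_size
    view_w, view_h = viewport_size
    if min(model_w, model_h, view_w, view_h) <= 0:
        raise ValueError("sizes must be positive")
    x = round(action.x * view_w / model_w)
    y = round(action.y * view_h / model_h)
    # Clamp to valid pixel indices: 0 .. width-1 and 0 .. height-1.
    x = min(max(x, 0), view_w - 1)
    y = min(max(y, 0), view_h - 1)
    return ClickAction(x, y)


# A click at the centre of a 1280x720 model input maps to the centre
# of a 1920x1080 viewport.
predicted = ClickAction(640, 360)
print(to_viewport(predicted, (1280, 720), (1920, 1080)))
```

Clamping is a defensive choice for this sketch: a prediction on the far edge of the screenshot would otherwise scale to a pixel just outside the viewport.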


Get this paper in your agent:

hf papers read 2511.19663

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper: 7

microsoft/Fara-7B • Image-Text-to-Text • 8B • Updated 26 days ago • 7.06k • 610

AlexKitipov/Fara-7B • Image-Text-to-Text • 8B • Updated 15 days ago • 16 • 1

XythicK/microsoft_Fara-7B-GGUF • Image-Text-to-Text • 8B • Updated Dec 26, 2025 • 113

Prince-1/Fara-7B-Onnx • Image-Text-to-Text • Updated 7 days ago • 19

Datasets citing this paper: 2

microsoft/WebTailBench • Preview • Updated May 12 • 291 • 16

Archi-001/WebTailBench • Preview • Updated 28 days ago • 53

Spaces citing this paper: 5

Collections including this paper: 2

Similar Articles

microsoft/Fara-7B

Hugging Face Models Trending

Microsoft released Fara-7B, an efficient 7-billion-parameter agentic small language model (SLM) for computer use tasks that achieves state-of-the-art performance within its size class and is competitive with larger systems.