@gregpr07: Browser Use Beta just achieved SOTA on our hardest internal web agent benchmark. Fable is genuinely amazing for optimiz…
Summary
Browser Use Beta achieved state-of-the-art results on a difficult internal web agent benchmark, using Fable for optimization and analysis.
Cached at: 06/12/26, 08:57 AM
Browser Use Beta just achieved SOTA on our hardest internal web agent benchmark.
Fable is genuinely amazing for optimizing and analyzing eval runs. It can find super high-level heuristics of the model in a run and find WHY those edge cases happen in an absolutely massive Rust codebase.
This feels next level. I have been playing with autoresearch loops for months, and this is the first one that really understands stuff at a high level!
(also it’s crazy that it just one-shots this image haha)
Similar Articles
@rsalakhu: Congrats to the @browser_use team for taking the #1 spot on Odysseys, a highly challenging benchmark for long-horizon w…
The browser_use team achieved the #1 spot on the Odysseys benchmark, a challenging evaluation for long-horizon web agents, outperforming models like Opus 4.6 and GPT-5.4.
@ms_aifrontiers: Along with MagenticLite, we're introducing Fara1.5: a family of small browser agents at 4B, 9B, and 27B. It scores 63% …
Microsoft introduces the Fara1.5 family of small browser agents (4B, 9B, 27B) that achieve state-of-the-art performance on computer use benchmarks, scoring 63% on Online-Mind2Web and beating larger models like Operator and Gemini.
@browser_use: BrowserCode is incredibly good at long-running tasks It orders pizza for us
BrowserCode achieves #1 spot on Odysseys benchmark for long-horizon web agents, demonstrating strong performance in multi-hour web workflows.
@browser_use: Introducing Browser Use 0.13.0 [beta] > The old Browser Use was built for GPT-4. > This one was built for SOTA models. …
Browser Use 0.13.0 is a complete rewrite in Rust, providing custom LLM and browser harnesses optimized for state-of-the-art models, replacing the previous GPT-4-centric version.
@browser_use: Introducing B, a browser agent template! Built on Eve by @vercel. Give any agent a real Browser Use Cloud browser. Watc…
Introducing B, an open-source browser agent template built on Eve by Vercel that uses Browser Use Cloud to give any AI agent a real web browser. It includes a chat UI and live browser viewing.