@browser_use: BrowserCode is incredibly good at long-running tasks. It orders pizza for us.
Summary
BrowserCode takes the #1 spot on Odysseys, a benchmark for long-horizon web agents that evaluates realistic, multi-hour web workflows.
Cached at: 06/17/26, 07:48 AM
BrowserCode is incredibly good at long-running tasks
It orders pizza for us https://t.co/6c7aBxJqfL
Russ Salakhutdinov (@rsalakhu): Congrats to the @browser_use team for taking the #1 spot on Odysseys, a highly challenging benchmark for long-horizon web agents:
https://t.co/dRYnBSGsLG
Odysseys evaluates realistic, multi-hour web workflows that require sustained planning, memory, reasoning, and verification.