A developer built a self-contained browser-use agent that runs entirely in WASM/WebGPU at zero server cost, enabling full webpage control via natural language prompts.
The only cost is electricity! I built this in a few weeks since I couldn't find anything else like it.

Demo: [https://pdufour.github.io/browser-use-wasm/](https://pdufour.github.io/browser-use-wasm/)

Source Code: [https://github.com/pdufour/browser-use-wasm](https://github.com/pdufour/browser-use-wasm)

One thing I've wanted to do for a while was add a widget to my page that lets me control the complete webpage, just like any of the browser-use agents can. The key distinction is that I wanted it to be fully self-contained, with no server involved. After a few weeks of tinkering I have a fairly good browser-use model running entirely via Snapdom / WASM / WebGPU / Wllama / ShowUi-2b and a little JS to tie it all together.

**The browser use library I developed can handle all this:**

* Typing into fields
* Clicking links
* Multi-turn actions (click on input, type something into it, click submit button) - all from one prompt - works 50% of the time
* Changing dropdown options

**Some lessons I learned making things others might find helpful:**

1. Tests are your friend. Finding Mind2Web [https://github.com/OSU-NLP-Group/Mind2Web](https://github.com/OSU-NLP-Group/Mind2Web) and MiniWoB++ [https://github.com/Farama-Foundation/miniwob-plusplus](https://github.com/Farama-Foundation/miniwob-plusplus) helped me continuously improve the accuracy of the browser-use actions.
2. Browser use is very, very hard. I've only supported a limited set of actions, and even getting to that point was quite hard. To handle complex queries you need some kind of interaction loop, but then you run into problems like figuring out when to end the loop.
3. Accuracy matters. For the longest time my click actions were off by a few px, and I was finally able to track the issue down to the snapdom library. When a click is off by a few px, that could mean it's clicking in blank space rather than on a button. I'm so glad this is fixed - [https://github.com/zumerlab/snapdom/issues/421](https://github.com/zumerlab/snapdom/issues/421).

This code is super super alpha and a lot of stuff is probably broken, but I thought I would share it with Reddit to ask for feedback and see if people had any ideas on how to develop this further. I'm open to any ideas!
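For the loop-termination problem mentioned in lesson 2, one possible approach is to combine three stop conditions: an explicit "done" action from the model, a repeated-action check, and a hard step limit. The sketch below is not taken from the browser-use-wasm source. The `Action` type, `runAgentLoop`, `decide`, and `execute` are all hypothetical names chosen for illustration, and the repeated-action check is only a rough heuristic.

```typescript
// Hypothetical action shape; the real library's types may differ.
type Action =
  | { kind: "click"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "select"; value: string }
  | { kind: "done" };

type StopReason = "done" | "repeated_action" | "max_steps";

interface LoopResult {
  reason: StopReason;
  history: Action[]; // actions that were actually executed, in order
}

// decide: asks the model for the next action, given the actions taken so far.
// execute: applies one action to the page.
// Both are injected so the loop itself has no DOM or model dependency.
async function runAgentLoop(
  decide: (history: Action[]) => Promise<Action>,
  execute: (action: Action) => Promise<void>,
  maxSteps = 8,
): Promise<LoopResult> {
  const history: Action[] = [];

  for (let step = 0; step < maxSteps; step++) {
    const action = await decide(history);

    // Stop condition 1: the model says the task is finished.
    if (action.kind === "done") {
      return { reason: "done", history };
    }

    // Stop condition 2: the model proposes exactly the same action as the
    // previous step, which often means it is stuck. This is a heuristic and
    // relies on both objects having the same key order.
    const previous = history[history.length - 1];
    if (previous && JSON.stringify(previous) === JSON.stringify(action)) {
      return { reason: "repeated_action", history };
    }

    await execute(action);
    history.push(action);
  }

  // Stop condition 3: hard cap so the loop can never run forever.
  return { reason: "max_steps", history };
}
```

In a page, `decide` would presumably wrap the screenshot and model call, and `execute` would dispatch the click, typing, or select change. Returning the stop reason lets the caller tell a finished task apart from a stuck or runaway one.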
Introducing B, an open-source browser agent template built on Eve by Vercel that uses Browser Use Cloud to give any AI agent a real web browser. It includes a chat UI and live browser viewing.
Discusses architectural issues with current browser agents using headless Chrome + AI layer, and presents Opera Neon's CLI as an alternative where AI is integrated into the browser, reducing token overhead and improving understanding.
browser-wall is a new cloud browser interface that allows controlling multiple browsers simultaneously via a single CDP URL, with support for profiles and proxies.
Santiago (@svpino) discusses the challenges of running AI agents inside browsers, and @ego_agent announces 'ego lite,' a kernel-level rebuild aimed at making AI agents faster and more reliable.