I gave an LLM a real browser and a goal instead of a script: it fills forms and returns structured JSON

Reddit r/AI_Agents Tools

Summary

An open-source agent that uses an LLM to drive a real browser, fill forms, and extract structured data, calling the LLM roughly once per page (~1,200 tokens/site).

I built an open-source agent that takes an intent (`"find the pricing"`, `"enrich this lead"`, `"fill this form"`) and drives a real Chrome browser to carry it out, with no selectors and no predefined steps. The LLM is only called at junctions (~1 call/page), either to decide the next action or to extract data, which keeps it cheap (~1,200 tokens/site). The agentic bit I'm proudest of: point it at a government records form with no API, hand it a profile JSON, and it reads the labels, maps the profile to fields, picks dropdown values, submits, and reads the results page back as JSON. It got "Page 1 of 815" off a real Maryland estate form. It also ships as an **MCP server**, so you can drop a `read_page(url, goal)` tool straight into Claude Desktop / Claude Code / any MCP client. MIT, local, your own key. Would love feedback on the action-selection loop from people building agents.
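For anyone who wants something concrete to react to, here is a rough sketch of what a "one decision per page" loop could look like. It is not the project's code: `Page`, `Action`, `run_agent`, `make_stub_decider`, and the fake `SITE` are all hypothetical names invented for illustration, the browser is replaced by an in-memory dict of pages, and the LLM is replaced by a rule-based stub so the example runs offline.

```python
"""Hypothetical sketch of a one-decision-per-page agent loop (not the project's code)."""
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Page:
    url: str
    text: str = ""
    links: dict = field(default_factory=dict)    # link text -> URL
    labels: list = field(default_factory=list)   # visible form field labels
    submit_to: Optional[str] = None              # URL reached after submitting the form


@dataclass
class Action:
    kind: str                                    # "click", "fill", or "extract"
    target: str = ""                             # link text, used by "click"
    values: dict = field(default_factory=dict)   # label -> value, used by "fill"
    data: dict = field(default_factory=dict)     # structured result, used by "extract"


def run_agent(goal, start_url, site, decide, max_pages=5):
    """Walk pages until the decider extracts; decide() is called once per page."""
    url, calls = start_url, 0
    for _ in range(max_pages):
        page = site[url]
        action = decide(goal, page)              # the only "LLM" call for this page
        calls += 1
        if action.kind == "extract":
            return action.data, calls
        if action.kind == "click":
            url = page.links[action.target]
        elif action.kind == "fill":
            missing = [label for label in page.labels if label not in action.values]
            if missing:
                raise ValueError(f"unfilled fields: {missing}")
            url = page.submit_to
        else:
            raise ValueError(f"unknown action kind: {action.kind}")
    raise RuntimeError("page budget exhausted before extraction")


def make_stub_decider(profile):
    """Stand-in for the LLM: maps form labels to profile keys by normalized name."""
    def decide(goal, page):
        if page.labels:
            values = {label: profile[label.lower().replace(" ", "_")]
                      for label in page.labels}
            return Action("fill", values=values)
        if "Page " in page.text:
            return Action("extract", data={"pagination": page.text})
        return Action("click", target=next(iter(page.links)))
    return decide


# Made-up three-page site standing in for a real browser session.
SITE = {
    "/home": Page("/home", links={"Search records": "/search"}),
    "/search": Page("/search", labels=["Last Name", "County"], submit_to="/results"),
    "/results": Page("/results", text="Page 1 of 3"),
}

if __name__ == "__main__":
    decider = make_stub_decider({"last_name": "Smith", "county": "Example"})
    print(run_agent("find estate records", "/home", SITE, decider))
```

In a real agent the stub would be replaced by a model call that sees the goal plus a compact page summary, and the interesting design questions are what goes into that summary and how a bad decision gets detected before the page budget runs out.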