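@GitHub_Daily: AI agents that automate browsers or scrape data often get blocked by anti-scraping mechanisms and stall outright on CAPTCHAs and human-verification checks. The BrowserAct team recently open-sourced a Skill: a browser automation command-line tool designed for AI agents. It provides three layers of anti-blocking, from…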

X AI KOLs Timeline 2026/06/05 04:00 Tools

browser-automation ai-agent anti-bot open-source cli-tool captcha-solving web-scraping

Summary

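The BrowserAct team has open-sourced a browser automation command-line tool designed for AI agents. It provides three layers of anti-blocking (fingerprint spoofing, CAPTCHA solving, human takeover), supports parallel multi-browser runs with account isolation, and uses an output format optimized to save tokens.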

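AI agents that automate browsers or scrape data often get blocked by anti-scraping mechanisms and stall outright on CAPTCHAs and human-verification checks.

The BrowserAct team recently open-sourced a Skill: a browser automation command-line tool designed for AI agents.

It provides three layers of anti-blocking: fingerprint spoofing, automatic CAPTCHA solving, and, when the AI cannot get through, a generated link that lets a human take over at any time. The stages hand off to one another smoothly.

GitHub: http://github.com/browser-act/skills…

When running multiple browsers in parallel, each task's cookies, fingerprint, and proxy are fully isolated, so sites cannot link different accounts.

The output format is also optimized for large language models, using several times fewer tokens than conventional HTML or JSON.

It also ships with Skill Forge, which lets the AI explore a site's structure automatically and generate a reusable scraping script, so later batch runs need no re-exploration.

If you use tools such as Claude Code or Cursor for browser automation and keep running into anti-scraping blocks, this project is worth a try.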

Cached at: 2026/06/05 07:10

browser-act/skills

Source: https://github.com/browser-act/skills

BrowserAct Skills

Browser automation CLI built for AI agents. Get past anti-bot walls, hand off to humans across platforms when stuck, run parallel tasks without cross-contamination, and isolate multiple accounts in independent browsers.

Why BrowserAct

The browser an AI agent needs has to reach places standard tools can’t, let a human seamlessly take over when the agent is stuck, keep parallel tasks from cross-contaminating, and be designed for LLM reasoning — not human-written scripts. A browser for agents must get four things right.

1. Break through blocks — three progressive layers

Environment layer — stealth fingerprint spoofing, TLS rotation, proxy switching. The vast majority of blocks never trigger.
Execution layer — solve-captcha auto-solves CAPTCHAs; stealth-extract pulls protected pages in one command.
Human layer — remote-assist generates a live URL; the user takes over from any device, and the agent continues seamlessly when done.
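
The three layers read as an escalation ladder: try the cheapest option first and move up only when it fails. Below is a minimal shell sketch of that pattern, assuming each layer can be expressed as a command that exits non-zero on failure. The helper name `try_in_order` is hypothetical and not part of BrowserAct. The README names `stealth-extract`, `solve-captcha`, and `remote-assist`, but it does not show the arguments or exit codes of the latter two here, so the sketch takes the steps as arguments rather than guessing their syntax.

```shell
#!/usr/bin/env bash
# try_in_order: hypothetical helper, not part of BrowserAct.
# Runs each argument as a command line, in order, and stops at the
# first one that exits 0. Returns 1 if every step fails.
try_in_order() {
  local step
  for step in "$@"; do
    # Each step is a full command line passed as one string; eval is
    # used so quoting inside the string is honored. Only pass trusted
    # strings here.
    if eval "$step"; then
      return 0
    fi
  done
  return 1
}
```

A first step could be `browser-act stealth-extract https://example.com` from the Quick Start, followed by whatever `solve-captcha` and `remote-assist` invocations the full documentation specifies. Whether those commands signal a block through their exit status is an assumption to verify before relying on this.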

2. Three browser modes — by real-world scenario

Mode	Scenario	Key trait
`chrome`	Reuse local Chrome login state	Profile import or CDP attach
`stealth` privacy mode	Frictionless batch scraping without login	Fresh fingerprint per session + proxy rotation, zero residue
`stealth` fixed identity	Logged-in accounts · multi-browser parallel	Stable fingerprint + stable IP, stable account identity, not flagged as bots

3. Zero-interference concurrency — every agent in its own lane

Cross-browser parallel — independent cookies, fingerprints, proxies. Sites cannot correlate them.
Same-browser multi-session — shared login state, independent execution, tasks don’t block each other.
Privacy mode — fresh fingerprint and empty profile per session, zero residue when done.
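
A rough sketch of how two tasks might run side by side with explicit session names, using only the `--session`, `browser open`, and `state` commands shown in the Quick Start. The function names, the `BROWSER_ACT` override variable, and the session names `task-a` and `task-b` are assumptions made for illustration. Passing two different browser IDs would correspond to the cross-browser case, and passing the same ID twice to same-browser multi-session, if the CLI behaves as described above.

```shell
#!/usr/bin/env bash
# BROWSER_ACT lets the binary be swapped out, for example with `echo`
# for a dry run. This variable is an assumption of the sketch.
BROWSER_ACT="${BROWSER_ACT:-browser-act}"

# open_and_snapshot SESSION BROWSER_ID URL
# Opens URL in the given browser under a named session, then lists the
# indexed elements.
open_and_snapshot() {
  local session="$1" browser_id="$2" url="$3"
  "$BROWSER_ACT" --session "$session" browser open "$browser_id" "$url" &&
    "$BROWSER_ACT" --session "$session" state
}

# run_parallel BROWSER_ID_A URL_A BROWSER_ID_B URL_B
# Starts two tasks as background jobs, each with its own session name,
# waits for both, and returns non-zero if either one failed.
run_parallel() {
  open_and_snapshot task-a "$1" "$2" &
  local pid_a=$!
  open_and_snapshot task-b "$3" "$4" &
  local pid_b=$!
  local status=0
  wait "$pid_a" || status=1
  wait "$pid_b" || status=1
  return "$status"
}
```

Setting `BROWSER_ACT=echo` before calling `run_parallel` prints the commands instead of running them, which is a cheap way to check the session names and arguments first.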

4. Designed for agent reasoning — not human scripts

Compact text output — indexed text format, several times more token-efficient than JSON or HTML.
Indexed interaction — state returns an indexed list; click 3 / input 2 "...". No DOM parsing required.
Semantic memory — every browser carries a desc, matched to tasks by meaning.
Concurrency-safe — session ownership + explicit naming. Multi-agent operation never conflicts.

Security: confirmation gating — sensitive operations (browser create / delete, Profile import, proxy changes, security and privacy toggles) require explicit user approval. Prior approvals do not carry over. Enforced at the Skill layer, not a configuration toggle.

And More

Better headless — Default headless without disrupting users; stealth headless that isn’t detected.
Cross-platform remote handoff — Any device opens the link to take over, and the agent continues seamlessly.

Install

Tell your AI agent:

Install browser-act. Skill source: https://github.com/browser-act/skills/tree/main/browser-act . Verify it works after installation.

Installation details →

Quick Start

# Extract protected page content (zero config)
browser-act stealth-extract https://example.com

# Full browser automation
browser-act --session my-task browser open <id> https://example.com
browser-act --session my-task state          # See clickable elements
browser-act --session my-task click 3        # Click by index
browser-act --session my-task input 2 "hi"   # Type into a field

More examples and workflows →

The agent runs get-skills at the start of each session — gets environment state, browser list, and commands in one call:

browser-act get-skills core --skill-version 2.0.2

How agents discover and use BrowserAct →

Compatibility

OS: Windows, macOS, Linux

Agents: Claude Code · Cursor · VS Code · OpenCode · OpenClaw · Codex · Gemini CLI — works with any agent that can execute shell commands and load Skills.

Documentation

Full documentation covers anti-blocking, browser modes, sessions and concurrency, headless and remote handoff, agent design, the Skills system, and the complete command reference.

Read the full documentation →

Also From BrowserAct

Skill Forge — Your Personal Scraping Engineer

Need to extract data from the same website repeatedly at scale? Don’t write scrapers by hand. Skill Forge explores a site once, discovers its APIs and data patterns, generates a deploy-ready Skill package, then runs reliably without re-exploration — 500 or 5,000 records through the same stable path.

Any website. Any data. One command to start:

Install browser-act-skill-forge. Skill source: https://github.com/browser-act/skills/tree/main/browser-act-skill-forge . Verify it works after installation.

Then tell your agent what you need:

“Forge a Skill that extracts job listings from LinkedIn — title, company, salary, URL. I’ll run 300 keywords later.”

Skill Forge documentation →

Solutions Catalog

30+ pre-built Skills already generated by Skill Forge, ready to install and run. Covers Amazon, Google Maps, YouTube, Reddit, WeChat, Zhihu, and more.

Browse the full Solutions Catalog →

Build Your Own

Can’t find what you need above? Generate a custom Skill for any website in minutes — no coding required. Just describe what data you want or what action to perform, and Skill Forge handles the rest.

💖 Support the Project

BrowserAct Skills is free and open source. If it saves you time, please give us a ⭐ Star — it keeps the project alive and helps us ship more skills.

🎁 Bonus: Once you star the repository, you can join our Discord and post in the #claim-500-credits channel to receive 500 free credits!

🤝 Community & Support

Built with ❤️ by the BrowserAct Team

Similar articles

BrowserAct

Product Hunt

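BrowserAct is a web browser automation tool designed for AI agents that enables automated web interaction.

@Jason23818126: If you often use AI to look things up, bookmark this site. Hermes can now connect to http://Browse.sh, an open-source directory of hundreds of browser Skills; a single CLI command gives an agent new internet capabilities. I browsed through it, and it covers…

X AI KOLs Timeline

Browse.sh is an open-source directory of hundreds of browser Skills. A single CLI command gives an AI agent new internet capabilities, covering scenarios such as house hunting, flights, movies, and jobs.

@SunNeverSetsX: Sharing four pieces of infrastructure from my collection that make an agent loop run faster and better. Make sure your agent uses them! 1. http://agent-browser.dev lets an agent click around in a real browser and verify UI changes, enabling true closed-loop testing …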

X AI KOLs Timeline

agent-browser is a CLI tool for browser automation designed for AI agents, using compact text output and ref-based element selection to minimize token usage. The post also highlights three other tools—portless, emulate, and ai-cli—for improving agent loop efficiency.

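@quant_sheep: I had an agent use Chrome to find a place to stay on Airbnb and complete the booking for me. It even took the initiative to ask the host whether there was a kitchen. If you need your agent to operate a browser like a human, whether for testing web pages or automatically booking Airbnb stays, anything involving web operations…

X AI KOLs Timeline

Showcases an open-source tool called open-browser-use that lets an AI agent operate the Chrome browser like a human, completing the full flow of finding and booking a stay on Airbnb, including proactively messaging the host.

@Jolyne_AI: An open-source AI web automation tool: Nanobrowser. An open-source alternative to OpenAI Operator that runs locally in the browser and supports multi-agent collaboration. Free, privacy-focused, flexible in LLM choice, and fully open source, it makes web operations smarter and more efficient. GitHub: htt…

X AI KOLs Timeline

Nanobrowser is an open-source AI web automation tool. Positioned as a free alternative to OpenAI Operator, it runs in the local browser, supports multi-agent collaboration, emphasizes privacy, and offers flexible LLM choice.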
