@jakevin7: OpenCLI v1.8.0 终于发布了 熬到凌晨 4 点,1.7 到 1.8 这一程算是落地了。回顾这段时间确实做了太多了! ## Browser Agent Runtime 完善了 browser 能力,一次性到位。本质是把"agen…
摘要
OpenCLI v1.8.0 发布,增强了浏览器代理运行时,通过 accessibility tree 和 CDP 原生输入实现了更可靠的浏览器操控,并新增了微信读书、12306、闲鱼等多个站点适配,提升了稳定性和安全性。
查看缓存全文
缓存时间: 2026/05/19 22:52
OpenCLI v1.8.0 终于发布了
熬到凌晨 4 点,1.7 到 1.8 这一程算是落地了。回顾这段时间确实做了太多了!
Browser Agent Runtime
完善了 browser 能力,一次性到位。本质是把“agent 操控浏览器“从靠选择器拼字符串的盲摸路线,换成「accessibility tree + 语义定位 + CDP 原生输入」的 agent-native 路线。CDP 输入 + AX 快照 + 语义定位(–role / –name / –label / –testid)+ hover / focus / dblclick / check / upload / drag / wait download / annotated screenshot
opencli browser
新站点 / 新场景
weread-official — 接入微信读书官方 Agent Gateway,纯 HTTP + Bearer key,跟原来 cookie 版的 weread 并存 12306 — 全套读取:车次 / 价格 / 我的订单 / 常用乘客 xianyu — 闲鱼 inbox / messages / reply,对话回得动了 suno — 音乐生成 linkedin — Sales Navigator + people-search + 站内信 / safe-send / thread-snapshot 全打通 linkedin-learning / rednote / booking / ctrip 酒店 + 机票 / DuckDuckGo / Brave / Yahoo 搜索
Twitter 继续打磨
list-create / device-follow / quoted_tweet / card binding_values / bio / UserMedia 游标翻页全补 写操作对称:unlike / retweet / unretweet / quote bookmarks / bookmark-folder / list-tweets 现在都带媒体
Reddit / Zhihu 读得更深
reddit:subscribed / whoami / home / subreddit-info / –expand-more 展评论楼 / listing 暴露 post_hint+url+preview+gallery zhihu:answer-detail / answer-comments / 答案翻页 (很重要)
稳定性 & 安全
download 路径穿越修复,远端字段(视频标题这种)再也不能用 ../ 越狱出输出目录
Page.goto stale-identity 自愈,CDP 报 -32000 自动重试
undici 8.x 把 Node 20 顶崩了,回退到 6.x 救回 Node 20 全量用户
youtube 字幕跨视频污染修复(SPA 切视频之前会拿到上一支的字幕)
ChatGPT 网页生图回归(现在能识别 CSS background + canvas,不只是 )
typed-error 大扫除:Douyin / Jike / WeRead / Apple Podcasts / Reddit / Gitee / lesswrong / xhs / YouTube 的 silent-sentinel 和 silent-empty 全部换成 EmptyResultError / AuthRequiredError,agent 终于不用瞎猜空数组到底是真空还是站点改版
Trace & Observation
adapter 失败现在带 trace artifact,summary.md 是诊断入口 browser console / browser network –failed / –follow,agent 终于能看见浏览器内部发生了什么
README 砍了 20%
中文 455 → 371,英文 410 → 326。Built-in Commands 收到 11 个常用站,CLI Hub 砍成名字枚举。看着不再劝退。
目标没变:AI agent 的 read & action 基建。
jackwener/OpenCLI
Source: https://github.com/jackwener/OpenCLI
OpenCLI
Convert any website into a CLI & run Browser Use on your logged-in Chrome. Turn websites, browser sessions, Electron apps, and local tools into deterministic interfaces for humans and AI agents. Or run Browser Use against any page — navigate, fill forms, click, extract, automate.
OpenCLI gives you one surface for three different kinds of automation:
- Use built-in adapters for sites like Bilibili, Zhihu, Xiaohongshu, Reddit, HackerNews, Twitter/X, and many more.
- Let AI Agents operate any website — install the
opencli-browserskill in your AI agent (Claude Code, Cursor, etc.), and it can navigate, click, type/fill, extract, and inspect any page through your logged-in browser viaopencli browserprimitives. - Write new adapters end-to-end with
opencli browser+ theopencli-adapter-authorskill, which guides from first recon through field decoding, code, andopencli browser verify.
It also works as a CLI hub for local tools such as gh, docker, longbridge, tg, discord, wx, ntn (Notion), and other binaries you register yourself, plus desktop app adapters for Electron apps like Cursor, Codex, Antigravity, and ChatGPT.
Quick Start
1. Install OpenCLI
OpenCLI requires Node.js >= 21.
node --version
npm install -g @jackwener/opencli
2. Install the Browser Bridge Extension
OpenCLI connects to Chrome/Chromium through a lightweight Browser Bridge extension plus a small local daemon. The daemon auto-starts when needed.
Option A — Chrome Web Store (recommended): Install OpenCLI from the Chrome Web Store.
Option B — Manual install:
- Download the latest
opencli-extension-v{version}.zipfrom the GitHub Releases page. - Unzip it, open
chrome://extensions, and enable Developer mode. - Click Load unpacked and select the unzipped folder.
3. Verify the setup
opencli doctor
4. Optional: name your Chrome profile
Each Chrome profile runs its own OpenCLI extension instance. If you use multiple Chrome profiles, list the connected profiles and assign local aliases:
opencli profile list
opencli profile rename <contextId> work
opencli profile use work
opencli --profile work browser state
With only one connected profile, OpenCLI uses it automatically. With multiple connected profiles and no default, OpenCLI asks you to choose instead of guessing.
5. Run your first commands
opencli list
opencli hackernews top --limit 5
opencli bilibili hot --limit 5
For Humans
Use OpenCLI directly when you want a reliable command instead of a live browser session:
opencli listshows every registered command.opencli <site> <command>runs a built-in or generated adapter.opencli external register mycliexposes a local CLI through the same discovery surface.opencli doctorhelps diagnose browser connectivity.
Extending OpenCLI
If you want to add your own commands, start with the Extending OpenCLI guide. README keeps this short; the guide covers the directory layout, source-control model, and install commands.
| Need | Recommended path |
|---|---|
| Keep personal website commands in your own Git repo | opencli plugin create + opencli plugin install file://... |
| Quickly draft a private local adapter | opencli browser init <site>/<command> in ~/.opencli/clis/ |
| Modify an official adapter locally | opencli adapter eject <site> + opencli adapter reset <site> |
| Publish or install third-party commands | opencli plugin install github:user/repo |
| Wrap an existing local binary | opencli external register <name> |
For AI Agents
OpenCLI’s browser commands are designed to be used by AI Agents — not run manually. Install skills into your AI agent (Claude Code, Cursor, etc.), and the agent operates websites on your behalf using your logged-in Chrome session.
Install skills (also refreshes existing installs)
npx skills add jackwener/opencli
Or install only what you need:
npx skills add jackwener/opencli --skill opencli-adapter-author
npx skills add jackwener/opencli --skill opencli-autofix
npx skills add jackwener/opencli --skill opencli-browser
npx skills add jackwener/opencli --skill opencli-usage
Which skill to use
| Skill | When to use | Example prompt to your AI agent |
|---|---|---|
| opencli-adapter-author | Write a reusable adapter for a new site or add a command to an existing site | “Write an adapter for douyin trending” / “Make a command that grabs the top posts from this page” |
| opencli-autofix | Repair a broken adapter when a built-in command fails | “opencli zhihu hot is returning empty — fix it” |
| opencli-browser | Drive a real Chrome page ad-hoc — navigate, fill forms, click, extract | “Help me check my Xiaohongshu notifications” / “Help me fill out this form” / “Use browser commands to scrape this page” |
| opencli-usage | Quick reference for all OpenCLI commands and sites | “What commands does OpenCLI have for Twitter?” |
How it works
Once opencli-browser is installed, your AI agent can:
- Navigate to any URL using your logged-in browser
- Read page content via structured DOM snapshots (not screenshots)
- Interact — click buttons, fill forms, select options, press keys
- Extract data from the page or intercept network API responses
- Wait for elements, text, or page transitions
The agent handles all the opencli browser commands internally — you just describe what you want done in natural language.
Skill references:
skills/opencli-browser/SKILL.md— drive Chrome ad-hoc (navigate, fill forms, click, extract)skills/opencli-adapter-author/SKILL.md— write a new adapter end-to-endskills/opencli-autofix/SKILL.md— repair broken adaptersskills/opencli-usage/SKILL.md— command and site reference
Available browser commands include open, state, click, type, fill, select, keys, wait, get, find, extract, frames, screenshot, scroll, back, eval, network, tab list, tab new, tab select, tab close, init, verify, and close.
opencli browser commands require a <session> positional immediately after browser. opencli browser work open <url> and opencli browser work tab new [url] both return a target ID. Use opencli browser work tab list to inspect target IDs, then pass --tab <targetId> to route a command to a specific tab. tab new creates a new tab without changing the default browser target; only tab select <targetId> promotes that tab to the default target for later untargeted commands in the same session.
Writing a new adapter
When the site you need is not yet covered, use the opencli-adapter-author skill end-to-end:
- Recon the site and pick a pattern (SPA / SSR / JSONP / Token / Streaming).
- Discover the right endpoint — network inspection, initial state, bundle search, token trace, or interceptor fallback.
- Pick auth —
PUBLIC/COOKIE/INTERCEPT/UI/LOCAL. - Decode response fields and design output columns.
opencli browser recon analyze <url>→opencli browser recon init <site>/<name>→ write adapter →opencli browser recon verify <site>/<name>.- Site knowledge persists to
~/.opencli/sites/<site>/so the next adapter for the same site starts from context.
Prerequisites
- Node.js: >= 21.0.0 (required for the standard npm install path)
- Bun: >= 1.0 (optional alternative runtime)
- Chrome or Chromium running and logged into the target site for browser-backed commands
Important: Browser-backed commands reuse your Chrome/Chromium login session. If you get empty data or permission-like failures, first confirm the site is already open and authenticated in Chrome/Chromium.
Configuration
| Variable | Default | Description |
|---|---|---|
OPENCLI_DAEMON_PORT | 19825 | HTTP port for the daemon-extension bridge |
OPENCLI_PROFILE | — | Browser Bridge profile alias/contextId to use when multiple Chrome profiles are connected |
OPENCLI_WINDOW | command default | Set to foreground or background to override Browser Bridge window placement. Browser-backed commands also accept --window <foreground|background>. |
OPENCLI_BROWSER_CONNECT_TIMEOUT | 30 | Seconds to wait for browser connection |
OPENCLI_BROWSER_COMMAND_TIMEOUT | 60 | Seconds to wait for a single browser command |
OPENCLI_CDP_ENDPOINT | — | Chrome DevTools Protocol endpoint for remote browser or Electron apps |
OPENCLI_CDP_TARGET | — | Filter CDP targets by URL substring (e.g. detail.1688.com) |
OPENCLI_VERBOSE | false | Enable verbose logging (-v flag also works) |
DEBUG_SNAPSHOT | — | Set to 1 for DOM snapshot debug output |
opencli browser * requires an explicit <session> positional, uses a foreground browser window by default, and keeps that session’s tab lease until opencli browser <session> close or idle cleanup. Browser-backed adapters use a background adapter window and release one-shot tab leases by default. Interactive adapters can declare siteSession: 'persistent' to keep a stable site tab for continuity; pass --site-session ephemeral for a one-shot tab.
Built-in Commands
| Site | Commands |
|---|---|
| xiaohongshu | search note comments feed user download publish notifications creator-notes creator-notes-summary creator-note-detail creator-profile creator-stats |
| bilibili | hot search history feed ranking download comments dynamic favorite following me subtitle summary video user-videos |
| zhihu | hot search question download follow like favorite comment answer |
| hackernews | top new best ask show jobs search user |
connect inbox safe-send search sent-invitations thread-snapshot timeline salesnav-search salesnav-inbox salesnav-message salesnav-thread | |
hot frontpage popular search subreddit read user user-posts user-comments upvote upvoted save saved comment subscribe | |
trending search timeline tweets lists list-tweets list-add list-remove bookmarks post download profile article like likes notifications reply reply-dm thread follow unfollow followers following block unblock bookmark unbookmark delete hide-reply accept | |
| claude | ask send new status read history detail |
| gemini | new ask image deep-research deep-research-result |
| notebooklm | status list open current get history summary note-list notes-get source-list source-get source-fulltext source-guide |
| amazon | bestsellers search product offer discussion movers-shakers new-releases rankings |
Curated highlights — → see all 100+ supported sites & commands (douyin / weibo / spotify / 1688 / quark / nowcoder / google-scholar / hupu / xianyu / weread / weread-official / xiaoyuzhou / and more).
CLI Hub
Unified passthrough for your existing command-line tools. Run opencli <tool> ... for any of:
gh · docker · vercel · wrangler · obsidian · longbridge · lark-cli · ntn(notion) · dws(DingTalk Workspace) · wecom-cli(企业微信) · tg(tg-cli) · discord(discord-cli) · wx(wx-cli)
Register your own with opencli external register <name>; list everything with opencli external list.
Desktop app adapters (Electron, via CDP): Cursor / Codex / Antigravity / ChatGPT App / ChatWise / Discord / Doubao — see docs/adapters/desktop/.
Download Support
OpenCLI supports downloading images, videos, and articles from supported platforms.
| Platform | Content Types | Notes |
|---|---|---|
| xiaohongshu | Images, Videos | Downloads all media from a note |
| rednote | Images, Videos | Downloads all media from a signed rednote note URL |
| bilibili | Videos | Requires yt-dlp installed |
| Images, Videos | From user media tab or single tweet | |
| douban | Images | Poster / still image lists |
| pixiv | Images | Original-quality illustrations, multi-page |
| 1688 | Images, Videos | Downloads page-visible product media from item pages |
| xiaoyuzhou | Audio, Transcript | Downloads episode audio and transcript JSON/text with local credentials |
| zhihu | Articles (Markdown) | Exports with optional image download |
| weixin | Articles (Markdown) | WeChat Official Account articles |
For video downloads, install yt-dlp first: brew install yt-dlp
opencli xiaohongshu download "https://www.xiaohongshu.com/search_result/<id>?xsec_token=..." --output ./xhs
opencli xiaohongshu download "https://xhslink.com/..." --output ./xhs
opencli rednote download "https://www.rednote.com/search_result/<id>?xsec_token=..." --output ./rednote
opencli bilibili download BV1xxx --output ./bilibili
opencli twitter download elonmusk --limit 20 --output ./twitter
opencli 1688 download 841141931191 --output ./1688-downloads
opencli xiaoyuzhou download 69b3b675772ac2295bfc01d0 --output ./xiaoyuzhou
opencli xiaoyuzhou transcript 69dd0c98e2c8be31551f6a33 --output ./xiaoyuzhou-transcripts
opencli xiaoyuzhou download and transcript require local Xiaoyuzhou credentials in ~/.opencli/xiaoyuzhou.json.
Output Formats
All built-in commands support --format / -f with table (default), json, yaml, md, and csv.
opencli bilibili hot -f json # Pipe to jq or LLMs
opencli bilibili hot -f csv # Spreadsheet-friendly
opencli bilibili hot -v # Verbose: show pipeline debug steps
Exit Codes
opencli follows Unix sysexits.h so CI / scripts can branch on failure mode: 0 success, 66 empty result, 69 Browser Bridge down, 75 timeout, 77 auth required, 78 config error, 130 Ctrl-C. Full reference: docs/guide/exit-codes.md.
Plugins
Extend OpenCLI with community-contributed adapters:
opencli plugin install github:user/opencli-plugin-my-tool
opencli plugin list
opencli plugin update --all
opencli plugin uninstall my-tool
| Plugin | Type | Description |
|---|---|---|
| opencli-plugin-github-trending | JS | GitHub Trending repositories |
| opencli-plugin-hot-digest | JS | Multi-platform trending aggregator |
| opencli-plugin-juejin | JS | 稀土掘金 (Juejin) hot articles |
| opencli-plugin-vk | JS | VK (VKontakte) wall, feed, and search |
See Plugins Guide for creating your own plugin.
Testing
See TESTING.md for how to run and write tests.
Troubleshooting
- “Extension not connected” — Ensure the Browser Bridge extension is installed from the Chrome Web Store and enabled in
chrome://extensions. - “attach failed: Cannot access a chrome-extension:// URL” — Another extension may be interfering. Try disabling other extensions temporarily.
- Empty data or ‘Unauthorized’ error — Your Chrome/Chromium login session may have expired. Navigate to the target site and log in again.
- Node API errors / missing
fetch/ startup crash on old Node — OpenCLI requires Node.js >= 21. Runnode --version, upgrade Node if needed, then retry. - Daemon issues — Check status:
curl localhost:19825/status· View logs:curl localhost:19825/logs
Star History
License
相似文章
@jakevin7: OpenCLI 2W star 了 ,https://github.com/jackwener/OpenCLI… 不过现在对 star 数已经没什么感觉了。 最开心的时候,还是和用户们交流使用&分享心得经验的时候。 后面还会持续改进,让 …
OpenCLI 项目 GitHub 星标突破 2 万,该工具可将网站、浏览器会话及 Electron 应用转化为确定性 CLI 接口,支持 AI Agent 直接操作浏览器并自动化工作流。
@jakevin7: OpenCLI 又多了个很好用的能力 —— 一条命令把任意网页转成 Markdown。 opencli web read --url <any-url> 浏览器里先用 DOM 启发式抽正文(article → role="main" → …
OpenCLI adds a one-command feature to convert any webpage to Markdown using DOM heuristics and Turndown.
@jakevin7: OpenCLI 一键打通 Agent 的 Twitter 世界! 发推、回复、引用转推、监控通知、关键词触发自动回 DM、把整条 thread 抓下来总结、批量整理 list、定期备份某个人的所有图和视频——你想到的客户端操作,Agent…
OpenCLI 是一个工具,允许 AI Agent 通过复用浏览器登录态(无需 X API key)无缝操作 Twitter,实现发推、回复、监控、抓取总结等 36 种客户端命令,将 Twitter 变成 Agent 的完整生活场景。
@jakevin7: OpenCLI 的APP 今天完全用一个组件库重新重构了,折腾了好久。 MakeAgent 我也准备让 Agent 把 UI 完全重构掉,这个 UI 可能是最后一面了。 https://github.com/jackwener/maka-…
Maka 是一个本地优先的桌面 AI 工作台,基于 Electron 构建,支持多种模型连接、工具调用、权限控制和隐私保护,并集成了机器人入口、本地记忆等功能。项目作者同时提到重构了 OpenCLI 的 APP 和计划重构 MakeAgent 的 UI。
@jakevin7: OpenCLI 现在可以读取微信、Telegram、Discord 的内容了!!! wx-cli / tg-cli / discord-cli 全部接入。 群消息、聊天记录、朋友圈、收藏夹——全部可以用 CLI 直接拿到。 以前 Agen…
OpenCLI 更新支持通过 wx-cli、tg-cli 和 discord-cli 读取微信、Telegram 和 Discord 的内容。该工具允许 AI 代理和开发者通过命令行接口自动化浏览器操作并聚合个人消息流。