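@jakevin7: OpenCLI v1.8.0 is finally out. I stayed up until 4 a.m., and the run from 1.7 to 1.8 has finally landed. Looking back, a great deal got done in this stretch! Browser Agent Runtime: the browser capabilities are now rounded out in one go. In essence, it moves "agen…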

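Summary

OpenCLI v1.8.0 has been released. It strengthens the browser agent runtime, making browser control more reliable through the accessibility tree and CDP-native input, adds adapters for several sites including WeRead, 12306, and Xianyu, and improves stability and security.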

Cached at: 2026/05/19 22:52

OpenCLI v1.8.0 is finally out

I stayed up until 4 a.m., and the run from 1.7 to 1.8 has finally landed. Looking back, a great deal got done in this stretch!

Browser Agent Runtime

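The browser capabilities are now rounded out in one go. In essence, "agent drives the browser" moves from a blind approach built on string-concatenated selectors to an agent-native approach: accessibility tree + semantic locators + CDP-native input.

  • CDP input + AX snapshots + semantic locators (--role / --name / --label / --testid) + hover / focus / dblclick / check / upload / drag / wait download / annotated screenshot
  • opencli browser <session> click 5 clicks a ref directly, with no more guessing at selectors.
  • Custom dropdowns from Radix / shadcn / Material UI that previously could not be opened now respond to clicks.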

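New sites / new scenarios

  • weread-official: connects to the official WeRead Agent Gateway over pure HTTP + Bearer key, and coexists with the original cookie-based weread adapter
  • 12306: full read coverage, including trains / prices / my orders / frequent passengers
  • xianyu: Xianyu inbox / messages / reply, so conversations can now be answered
  • suno: music generation
  • linkedin: Sales Navigator + people-search + in-site messages / safe-send / thread-snapshot, all connected end to end
  • linkedin-learning / rednote / booking / ctrip hotels + flights / DuckDuckGo / Brave / Yahoo search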

Twitter: continued polish

  • list-create / device-follow / quoted_tweet / card binding_values / bio / UserMedia cursor pagination are all filled in
  • Write operations are now symmetric: unlike / retweet / unretweet / quote
  • bookmarks / bookmark-folder / list-tweets now all include media

Reddit / Zhihu: deeper reads

  • reddit: subscribed / whoami / home / subreddit-info / --expand-more to expand comment threads / listings now expose post_hint + url + preview + gallery
  • zhihu: answer-detail / answer-comments / answer pagination (very important)

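Stability & security

  • download path traversal fix: remote fields (such as video titles) can no longer use ../ to escape the output directory
  • Page.goto stale-identity self-healing: when CDP returns -32000, the call is retried automatically
  • undici 8.x crashed on Node 20; rolling back to 6.x rescues all Node 20 users
  • youtube subtitle cross-video contamination fix (previously, switching videos inside the SPA could return the previous video's subtitles)
  • ChatGPT web image generation is back (it now recognizes CSS background + canvas, not just <img>)
  • typed-error sweep: the silent-sentinel and silent-empty returns in Douyin / Jike / WeRead / Apple Podcasts / Reddit / Gitee / lesswrong / xhs / YouTube are all replaced with EmptyResultError / AuthRequiredError, so agents no longer have to guess whether an empty array is genuinely empty or the site has changed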
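
The path traversal fix described above can be illustrated with a minimal Python sketch. This is not OpenCLI's actual code (OpenCLI runs on Node.js); the function name `safe_output_path` and its sanitizing rules are hypothetical, and it only shows one common way to keep a remote-supplied name, such as a video title, inside the output directory.

```python
from pathlib import Path


def safe_output_path(output_dir: str, remote_name: str, fallback: str = "download") -> Path:
    """Map an untrusted remote name to a path that stays inside output_dir."""
    base = Path(output_dir).resolve()
    # Treat both slash styles as separators and keep only the last component,
    # so "../../etc/passwd" collapses to "passwd".
    leaf = remote_name.replace("\\", "/").split("/")[-1].strip()
    # "", "." and ".." carry no usable file name (hypothetical fallback name).
    if leaf in ("", ".", ".."):
        leaf = fallback
    candidate = (base / leaf).resolve()
    # Defense in depth: refuse anything that still resolves outside base,
    # for example through a symlink.
    if base not in candidate.parents:
        raise ValueError(f"unsafe file name: {remote_name!r}")
    return candidate
```

The final containment check is deliberately kept even though the name is already reduced to a single component, because a symlink inside the output directory could otherwise still point elsewhere.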

Trace & Observation

  • adapter failures now carry a trace artifact; summary.md is the diagnostic entry point
  • browser console / browser network --failed / --follow: agents can finally see what is happening inside the browser

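README trimmed by 20%

Chinese: 455 → 371; English: 410 → 326. Built-in Commands is narrowed to 11 commonly used sites, and CLI Hub is cut down to a list of names. It no longer looks intimidating.

The goal has not changed: read & action infrastructure for AI agents.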


jackwener/OpenCLI

Source: https://github.com/jackwener/OpenCLI

OpenCLI

Convert any website into a CLI & run Browser Use on your logged-in Chrome. Turn websites, browser sessions, Electron apps, and local tools into deterministic interfaces for humans and AI agents. Or run Browser Use against any page — navigate, fill forms, click, extract, automate.

OpenCLI gives you one surface for three different kinds of automation:

  • Use built-in adapters for sites like Bilibili, Zhihu, Xiaohongshu, Reddit, HackerNews, Twitter/X, and many more.
  • Let AI Agents operate any website — install the opencli-browser skill in your AI agent (Claude Code, Cursor, etc.), and it can navigate, click, type/fill, extract, and inspect any page through your logged-in browser via opencli browser primitives.
  • Write new adapters end-to-end with opencli browser + the opencli-adapter-author skill, which guides from first recon through field decoding, code, and opencli browser verify.

It also works as a CLI hub for local tools such as gh, docker, longbridge, tg, discord, wx, ntn (Notion), and other binaries you register yourself, plus desktop app adapters for Electron apps like Cursor, Codex, Antigravity, and ChatGPT.

Quick Start

1. Install OpenCLI

OpenCLI requires Node.js >= 21.

node --version
npm install -g @jackwener/opencli

2. Install the Browser Bridge Extension

OpenCLI connects to Chrome/Chromium through a lightweight Browser Bridge extension plus a small local daemon. The daemon auto-starts when needed.

Option A — Chrome Web Store (recommended): Install OpenCLI from the Chrome Web Store.

Option B — Manual install:

  1. Download the latest opencli-extension-v{version}.zip from the GitHub Releases page.
  2. Unzip it, open chrome://extensions, and enable Developer mode.
  3. Click Load unpacked and select the unzipped folder.

3. Verify the setup

opencli doctor

4. Optional: name your Chrome profile

Each Chrome profile runs its own OpenCLI extension instance. If you use multiple Chrome profiles, list the connected profiles and assign local aliases:

opencli profile list
opencli profile rename <contextId> work
opencli profile use work
opencli --profile work browser state

With only one connected profile, OpenCLI uses it automatically. With multiple connected profiles and no default, OpenCLI asks you to choose instead of guessing.
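
The selection rule above can be modelled in a few lines of Python. This is a sketch of the described behaviour, not OpenCLI's implementation; `resolve_profile` and `AmbiguousProfileError` are hypothetical names, and rejecting a default that is not connected is an assumption.

```python
from typing import Optional, Sequence


class AmbiguousProfileError(Exception):
    """Several profiles are connected and none was selected as default."""


def resolve_profile(connected: Sequence[str], default: Optional[str] = None) -> str:
    """Pick the profile to use, following the rule described in the README."""
    if not connected:
        raise LookupError("no Chrome profile is connected")
    if default is not None:
        # Assumption: a default that is not currently connected is an error.
        if default not in connected:
            raise LookupError(f"profile {default!r} is not connected")
        return default
    if len(connected) == 1:
        # A single connected profile is used automatically.
        return connected[0]
    # Multiple profiles and no default: ask the user instead of guessing.
    raise AmbiguousProfileError("choose one of: " + ", ".join(connected))
```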

5. Run your first commands

opencli list
opencli hackernews top --limit 5
opencli bilibili hot --limit 5

For Humans

Use OpenCLI directly when you want a reliable command instead of a live browser session:

  • opencli list shows every registered command.
  • opencli <site> <command> runs a built-in or generated adapter.
  • opencli external register mycli exposes a local CLI through the same discovery surface.
  • opencli doctor helps diagnose browser connectivity.

Extending OpenCLI

If you want to add your own commands, start with the Extending OpenCLI guide. The README keeps this short; the guide covers the directory layout, source-control model, and install commands.

| Need | Recommended path |
| --- | --- |
| Keep personal website commands in your own Git repo | opencli plugin create + opencli plugin install file://... |
| Quickly draft a private local adapter | opencli browser init <site>/<command> in ~/.opencli/clis/ |
| Modify an official adapter locally | opencli adapter eject <site> + opencli adapter reset <site> |
| Publish or install third-party commands | opencli plugin install github:user/repo |
| Wrap an existing local binary | opencli external register <name> |

For AI Agents

OpenCLI’s browser commands are designed to be used by AI Agents — not run manually. Install skills into your AI agent (Claude Code, Cursor, etc.), and the agent operates websites on your behalf using your logged-in Chrome session.

Install skills (also refreshes existing installs)

npx skills add jackwener/opencli

Or install only what you need:

npx skills add jackwener/opencli --skill opencli-adapter-author
npx skills add jackwener/opencli --skill opencli-autofix
npx skills add jackwener/opencli --skill opencli-browser
npx skills add jackwener/opencli --skill opencli-usage

Which skill to use

| Skill | When to use | Example prompt to your AI agent |
| --- | --- | --- |
| opencli-adapter-author | Write a reusable adapter for a new site or add a command to an existing site | “Write an adapter for douyin trending” / “Make a command that grabs the top posts from this page” |
| opencli-autofix | Repair a broken adapter when a built-in command fails | “opencli zhihu hot is returning empty — fix it” |
| opencli-browser | Drive a real Chrome page ad-hoc: navigate, fill forms, click, extract | “Help me check my Xiaohongshu notifications” / “Help me fill out this form” / “Use browser commands to scrape this page” |
| opencli-usage | Quick reference for all OpenCLI commands and sites | “What commands does OpenCLI have for Twitter?” |

How it works

Once opencli-browser is installed, your AI agent can:

  1. Navigate to any URL using your logged-in browser
  2. Read page content via structured DOM snapshots (not screenshots)
  3. Interact — click buttons, fill forms, select options, press keys
  4. Extract data from the page or intercept network API responses
  5. Wait for elements, text, or page transitions

The agent handles all the opencli browser commands internally — you just describe what you want done in natural language.

Available browser commands include open, state, click, type, fill, select, keys, wait, get, find, extract, frames, screenshot, scroll, back, eval, network, tab list, tab new, tab select, tab close, init, verify, and close.

opencli browser commands require a <session> positional immediately after browser. opencli browser work open <url> and opencli browser work tab new [url] both return a target ID. Use opencli browser work tab list to inspect target IDs, then pass --tab <targetId> to route a command to a specific tab. tab new creates a new tab without changing the default browser target; only tab select <targetId> promotes that tab to the default target for later untargeted commands in the same session.
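
As a rough illustration of the argument order described above, the following Python helper assembles a command line with the `<session>` positional directly after `browser`. The helper name `browser_argv` is hypothetical, and appending `--tab` at the end of the command is an assumption; the README only states that the flag is passed to route a command to a specific tab.

```python
from typing import List, Optional


def browser_argv(session: str, command: List[str], tab: Optional[str] = None) -> List[str]:
    """Build an `opencli browser <session> ...` argument list."""
    if not session:
        raise ValueError("opencli browser needs a <session> name")
    # The session name comes immediately after `browser`.
    argv = ["opencli", "browser", session, *command]
    if tab is not None:
        # Assumption: the flag is accepted after the subcommand arguments.
        argv += ["--tab", tab]
    return argv
```

For example, `browser_argv("work", ["tab", "list"])` yields the same words as `opencli browser work tab list`, and a target ID taken from that listing can then be passed as `tab`.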

Writing a new adapter

When the site you need is not yet covered, use the opencli-adapter-author skill end-to-end:

  1. Recon the site and pick a pattern (SPA / SSR / JSONP / Token / Streaming).
  2. Discover the right endpoint — network inspection, initial state, bundle search, token trace, or interceptor fallback.
  3. Pick auth: PUBLIC / COOKIE / INTERCEPT / UI / LOCAL.
  4. Decode response fields and design output columns.
  5. opencli browser recon analyze <url> → opencli browser recon init <site>/<name> → write adapter → opencli browser recon verify <site>/<name>.
  6. Site knowledge persists to ~/.opencli/sites/<site>/ so the next adapter for the same site starts from context.

Prerequisites

  • Node.js: >= 21.0.0 (required for the standard npm install path)
  • Bun: >= 1.0 (optional alternative runtime)
  • Chrome or Chromium running and logged into the target site for browser-backed commands

Important: Browser-backed commands reuse your Chrome/Chromium login session. If you get empty data or permission-like failures, first confirm the site is already open and authenticated in Chrome/Chromium.

Configuration

| Variable | Default | Description |
| --- | --- | --- |
| OPENCLI_DAEMON_PORT | 19825 | HTTP port for the daemon-extension bridge |
| OPENCLI_PROFILE | - | Browser Bridge profile alias/contextId to use when multiple Chrome profiles are connected |
| OPENCLI_WINDOW | command default | Set to foreground or background to override Browser Bridge window placement. Browser-backed commands also accept --window <foreground\|background>. |
| OPENCLI_BROWSER_CONNECT_TIMEOUT | 30 | Seconds to wait for browser connection |
| OPENCLI_BROWSER_COMMAND_TIMEOUT | 60 | Seconds to wait for a single browser command |
| OPENCLI_CDP_ENDPOINT | - | Chrome DevTools Protocol endpoint for remote browser or Electron apps |
| OPENCLI_CDP_TARGET | - | Filter CDP targets by URL substring (e.g. detail.1688.com) |
| OPENCLI_VERBOSE | false | Enable verbose logging (-v flag also works) |
| DEBUG_SNAPSHOT | - | Set to 1 for DOM snapshot debug output |
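
For scripts that wrap OpenCLI, the documented defaults can be mirrored as follows. This is a sketch only: `BridgeConfig` and `load_config` are hypothetical names, it covers a subset of the variables, and treating "1" or "true" as the truthy values of OPENCLI_VERBOSE is an assumption about how OpenCLI parses that variable.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BridgeConfig:
    daemon_port: int
    connect_timeout_s: int
    command_timeout_s: int
    profile: Optional[str]
    verbose: bool


def load_config(env: Mapping[str, str] = os.environ) -> BridgeConfig:
    """Read the OPENCLI_* variables, falling back to the documented defaults."""
    return BridgeConfig(
        daemon_port=int(env.get("OPENCLI_DAEMON_PORT", "19825")),
        connect_timeout_s=int(env.get("OPENCLI_BROWSER_CONNECT_TIMEOUT", "30")),
        command_timeout_s=int(env.get("OPENCLI_BROWSER_COMMAND_TIMEOUT", "60")),
        # An empty or missing value means no profile alias was chosen.
        profile=env.get("OPENCLI_PROFILE") or None,
        # Assumption: "1" and "true" (any case) enable verbose logging.
        verbose=env.get("OPENCLI_VERBOSE", "false").lower() in ("1", "true"),
    )
```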

opencli browser * requires an explicit <session> positional, uses a foreground browser window by default, and keeps that session’s tab lease until opencli browser <session> close or idle cleanup. Browser-backed adapters use a background adapter window and release one-shot tab leases by default. Interactive adapters can declare siteSession: 'persistent' to keep a stable site tab for continuity; pass --site-session ephemeral for a one-shot tab.

Built-in Commands

| Site | Commands |
| --- | --- |
| xiaohongshu | search note comments feed user download publish notifications creator-notes creator-notes-summary creator-note-detail creator-profile creator-stats |
| bilibili | hot search history feed ranking download comments dynamic favorite following me subtitle summary video user-videos |
| zhihu | hot search question download follow like favorite comment answer |
| hackernews | top new best ask show jobs search user |
| linkedin | connect inbox safe-send search sent-invitations thread-snapshot timeline salesnav-search salesnav-inbox salesnav-message salesnav-thread |
| reddit | hot frontpage popular search subreddit read user user-posts user-comments upvote upvoted save saved comment subscribe |
| twitter | trending search timeline tweets lists list-tweets list-add list-remove bookmarks post download profile article like likes notifications reply reply-dm thread follow unfollow followers following block unblock bookmark unbookmark delete hide-reply accept |
| claude | ask send new status read history detail |
| gemini | new ask image deep-research deep-research-result |
| notebooklm | status list open current get history summary note-list notes-get source-list source-get source-fulltext source-guide |
| amazon | bestsellers search product offer discussion movers-shakers new-releases rankings |

These are curated highlights. See the full list for all 100+ supported sites & commands (douyin / weibo / spotify / 1688 / quark / nowcoder / google-scholar / hupu / xianyu / weread / weread-official / xiaoyuzhou / and more).

CLI Hub

Unified passthrough for your existing command-line tools. Run opencli <tool> ... for any of:

gh · docker · vercel · wrangler · obsidian · longbridge · lark-cli · ntn(notion) · dws(DingTalk Workspace) · wecom-cli(企业微信) · tg(tg-cli) · discord(discord-cli) · wx(wx-cli)

Register your own with opencli external register <name>; list everything with opencli external list.

Desktop app adapters (Electron, via CDP): Cursor / Codex / Antigravity / ChatGPT App / ChatWise / Discord / Doubao — see docs/adapters/desktop/.

Download Support

OpenCLI supports downloading images, videos, and articles from supported platforms.

| Platform | Content Types | Notes |
| --- | --- | --- |
| xiaohongshu | Images, Videos | Downloads all media from a note |
| rednote | Images, Videos | Downloads all media from a signed rednote note URL |
| bilibili | Videos | Requires yt-dlp installed |
| twitter | Images, Videos | From user media tab or single tweet |
| douban | Images | Poster / still image lists |
| pixiv | Images | Original-quality illustrations, multi-page |
| 1688 | Images, Videos | Downloads page-visible product media from item pages |
| xiaoyuzhou | Audio, Transcript | Downloads episode audio and transcript JSON/text with local credentials |
| zhihu | Articles (Markdown) | Exports with optional image download |
| weixin | Articles (Markdown) | WeChat Official Account articles |

For video downloads, install yt-dlp first: brew install yt-dlp

opencli xiaohongshu download "https://www.xiaohongshu.com/search_result/<id>?xsec_token=..." --output ./xhs
opencli xiaohongshu download "https://xhslink.com/..." --output ./xhs
opencli rednote download "https://www.rednote.com/search_result/<id>?xsec_token=..." --output ./rednote
opencli bilibili download BV1xxx --output ./bilibili
opencli twitter download elonmusk --limit 20 --output ./twitter
opencli 1688 download 841141931191 --output ./1688-downloads
opencli xiaoyuzhou download 69b3b675772ac2295bfc01d0 --output ./xiaoyuzhou
opencli xiaoyuzhou transcript 69dd0c98e2c8be31551f6a33 --output ./xiaoyuzhou-transcripts

opencli xiaoyuzhou download and transcript require local Xiaoyuzhou credentials in ~/.opencli/xiaoyuzhou.json.

Output Formats

All built-in commands support --format / -f with table (default), json, yaml, md, and csv.

opencli bilibili hot -f json    # Pipe to jq or LLMs
opencli bilibili hot -f csv     # Spreadsheet-friendly
opencli bilibili hot -v         # Verbose: show pipeline debug steps
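
A script that consumes the JSON format might look like the sketch below. It assumes that `-f json` prints a JSON array of row objects on stdout, which the README does not spell out; `parse_rows` and `run_json` are hypothetical helper names, and the field names of each row depend on the adapter.

```python
import json
import subprocess
from typing import Any, Dict, List


def parse_rows(stdout: str) -> List[Dict[str, Any]]:
    """Parse command output, assuming a JSON array of row objects."""
    data = json.loads(stdout)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of rows")
    return data


def run_json(argv: List[str]) -> List[Dict[str, Any]]:
    """Run an opencli command with `-f json` appended and parse its stdout."""
    result = subprocess.run(
        [*argv, "-f", "json"],
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError on a non-zero exit code
    )
    return parse_rows(result.stdout)
```

A call such as `run_json(["opencli", "bilibili", "hot"])` requires a working OpenCLI installation; `parse_rows` can be exercised on its own with any JSON string.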

Exit Codes

opencli follows Unix sysexits.h so CI / scripts can branch on failure mode: 0 success, 66 empty result, 69 Browser Bridge down, 75 timeout, 77 auth required, 78 config error, 130 Ctrl-C. Full reference: docs/guide/exit-codes.md.
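
The exit codes listed above lend themselves to a small lookup in wrapper scripts. In this sketch the numeric values come from the README, while the member names, the `classify` helper, and the choice of which codes count as retryable are hypothetical.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    SUCCESS = 0
    EMPTY_RESULT = 66
    BRIDGE_DOWN = 69
    TIMEOUT = 75
    AUTH_REQUIRED = 77
    CONFIG_ERROR = 78
    INTERRUPTED = 130


# Hypothetical policy: only transient failures are worth retrying.
RETRYABLE = {ExitCode.BRIDGE_DOWN, ExitCode.TIMEOUT}


def classify(returncode: int) -> str:
    """Turn a process return code into a coarse action for a script."""
    try:
        code = ExitCode(returncode)
    except ValueError:
        return "unknown"  # not one of the documented codes
    if code is ExitCode.SUCCESS:
        return "ok"
    if code is ExitCode.EMPTY_RESULT:
        return "empty"
    return "retry" if code in RETRYABLE else "fail"
```

Keeping "empty" separate from "fail" reflects the distinction the release notes draw between a genuinely empty result and an authentication or site problem.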

Plugins

Extend OpenCLI with community-contributed adapters:

opencli plugin install github:user/opencli-plugin-my-tool
opencli plugin list
opencli plugin update --all
opencli plugin uninstall my-tool

| Plugin | Type | Description |
| --- | --- | --- |
| opencli-plugin-github-trending | JS | GitHub Trending repositories |
| opencli-plugin-hot-digest | JS | Multi-platform trending aggregator |
| opencli-plugin-juejin | JS | 稀土掘金 (Juejin) hot articles |
| opencli-plugin-vk | JS | VK (VKontakte) wall, feed, and search |

See Plugins Guide for creating your own plugin.

Testing

See TESTING.md for how to run and write tests.

Troubleshooting

  • “Extension not connected” — Ensure the Browser Bridge extension is installed from the Chrome Web Store and enabled in chrome://extensions.
  • “attach failed: Cannot access a chrome-extension:// URL” — Another extension may be interfering. Try disabling other extensions temporarily.
  • Empty data or ‘Unauthorized’ error — Your Chrome/Chromium login session may have expired. Navigate to the target site and log in again.
  • Node API errors / missing fetch / startup crash on old Node — OpenCLI requires Node.js >= 21. Run node --version, upgrade Node if needed, then retry.
  • Daemon issues — Check status: curl localhost:19825/status · View logs: curl localhost:19825/logs

License

Apache-2.0
