@jakevin7: OpenCLI v1.8.0 finally released. Stayed up until 4am, the journey from 1.7 to 1.8 is now complete. Looking back, there's really been a lot done! ## Browser Agent Runtime The browser capability has been fully polished in one go. Essentially, it shifts the "agent controlling the browser" from blindly guessing with selectors to an agent-native approach using "accessibility tree + semantic targeting + CDP native input". CDP input + AX snapshot + semantic targeting (--role / --name / --label / --testid) + hover / focus / dblclick / check / upload / drag / wait download / annotated screenshot. opencli browser <session> click 5 now directly uses references instead of blindly guessing with selectors. Custom dropdowns like Radix / shadcn / Material UI that previously couldn't be clicked are now clickable.
Summary
OpenCLI v1.8.0 released, enhancing browser agent runtime with more reliable browser control via accessibility tree and CDP native input, adding support for WeRead, 12306, Xianyu, and other sites, improving stability and security.
View Cached Full Text
Cached at: 05/19/26, 10:52 PM
OpenCLI v1.8.0 is finally out. Stayed up until 4 AM, the journey from 1.7 to 1.8 is now complete. Looking back, I’ve done way too much during this period!
Browser Agent Runtime
The browser capabilities have been fully fleshed out, done in one go. Essentially, it shifts “agent controlling the browser” from the blind-guessing approach of selector-string concatenation to an agent-native path of “accessibility tree + semantic targeting + CDP native input.” CDP input + AX snapshot + semantic targeting (--role / --name / --label / --testid) + hover / focus / dblclick / check / upload / drag / wait download / annotated screenshot. opencli browser click 5 now points directly with a ref, no longer relying on selector guessing. Custom dropdowns like Radix, shadcn, Material UI that previously couldn’t be clicked — they all work now.
New Sites / New Scenarios
- weread-official — Access to WeRead Official Agent Gateway, pure HTTP + Bearer key, coexists with the original cookie-based weread
- 12306 — Full reading: train number / price / my orders / frequent passengers
- xianyu — Xianyu inbox / messages / reply, conversations are now reachable
- suno — Music generation
- linkedin — Sales Navigator + people-search + internal messages / safe-send / thread-snapshot fully integrated
- linkedin-learning / rednote / booking / ctrip hotels + flights / DuckDuckGo / Brave / Yahoo search
Twitter Continued Polishing
- list-create / device-follow / quoted_tweet / card binding_values / bio / UserMedia cursor pagination all completed
- Write operations symmetric: unlike / retweet / unretweet / quote bookmark / bookmark-folder / list-tweets now all carry media
Reddit / Zhihu Deeper Reading
- reddit: subscribed / whoami / home / subreddit-info /
--expand-moreto expand comment threads / listing exposes post_hint+url+preview+gallery - zhihu: answer-detail / answer-comments / answer pagination (very important)
Stability & Security
- Download path traversal fix: remote fields (like video titles) can no longer use
../to escape the output directory - Page.goto stale-identity self-healing: CDP -32000 automatically retries
- undici 8.x broke Node 20 entirely, rolled back to 6.x to salvage all Node 20 users
- YouTube subtitle cross-video pollution fixed (SPA now doesn’t pick up the previous video’s subtitles before switching)
- ChatGPT web image generation restored (now recognizes CSS background + canvas, not just
<img>) - typed-error cleanup: Douyin / Jike / WeRead / Apple Podcasts / Reddit / Gitee / lesswrong / xhs / YouTube — all silent-sentinel and silent-empty replaced with
EmptyResultError/AuthRequiredError. Agents no longer have to guess if an empty array means truly empty or a site redesign
Trace & Observation
- Adapter failures now come with a trace artifact,
summary.mdis the diagnostic entry point - browser console / browser network
--failed/--follow— agents can finally see what’s happening inside the browser
README Cut Down by 20%
- Chinese: 455 → 371, English: 410 → 326
- Built-in Commands shrank to 11 commonly used sites, CLI Hub cut to name enumeration. No longer overwhelming to look at.
Goal unchanged: the read & action infrastructure for AI agents.
jackwener/OpenCLI
Source: https://github.com/jackwener/OpenCLI
OpenCLI
Convert any website into a CLI & run Browser Use on your logged-in Chrome. Turn websites, browser sessions, Electron apps, and local tools into deterministic interfaces for humans and AI agents. Or run Browser Use against any page — navigate, fill forms, click, extract, automate.
npm (https://www.npmjs.com/package/@jackwener/opencli) Node.js Version (https://nodejs.org) License
OpenCLI gives you one surface for three different kinds of automation:
- Use built-in adapters for sites like Bilibili, Zhihu, Xiaohongshu, Reddit, HackerNews, Twitter/X, and many more.
- Let AI Agents operate any website — install the
opencli-browserskill in your AI agent (Claude Code, Cursor, etc.), and it can navigate, click, type/fill, extract, and inspect any page through your logged-in browser viaopencli browserprimitives. - Write new adapters end-to-end with
opencli browser+ theopencli-adapter-authorskill, which guides from first recon through field decoding, code, andopencli browser verify.
It also works as a CLI hub for local tools such as gh, docker, longbridge, tg, discord, wx, ntn (Notion), and other binaries you register yourself, plus desktop app adapters for Electron apps like Cursor, Codex, Antigravity, and ChatGPT.
Quick Start
1. Install OpenCLI
OpenCLI requires Node.js >= 21.
node --version
npm install -g @jackwener/opencli
2. Install the Browser Bridge Extension
OpenCLI connects to Chrome/Chromium through a lightweight Browser Bridge extension plus a small local daemon. The daemon auto-starts when needed.
Option A — Chrome Web Store (recommended): Install OpenCLI from the Chrome Web Store (https://chromewebstore.google.com/detail/opencli/ildkmabpimmkaediidaifkhjpohdnifk).
Option B — Manual install:
- Download the latest
opencli-extension-v{version}.zipfrom the GitHub Releases page (https://github.com/jackwener/opencli/releases). - Unzip it, open
chrome://extensions, and enable Developer mode. - Click Load unpacked and select the unzipped folder.
3. Verify the setup
opencli doctor
4. Optional: name your Chrome profile
Each Chrome profile runs its own OpenCLI extension instance. If you use multiple Chrome profiles, list the connected profiles and assign local aliases:
opencli profile list
opencli profile rename work
opencli profile use work
opencli --profile work browser state
With only one connected profile, OpenCLI uses it automatically. With multiple connected profiles and no default, OpenCLI asks you to choose instead of guessing.
5. Run your first commands
opencli list
opencli hackernews top --limit 5
opencli bilibili hot --limit 5
For Humans
Use OpenCLI directly when you want a reliable command instead of a live browser session:
opencli listshows every registered command.opencli <command>runs a built-in or generated adapter.opencli external register mycliexposes a local CLI through the same discovery surface.opencli doctorhelps diagnose browser connectivity.
Extending OpenCLI
If you want to add your own commands, start with the Extending OpenCLI guide. README keeps this short; the guide covers the directory layout, source-control model, and install commands.
| Need | Recommended path |
|---|---|
| Keep personal website commands in your own Git repo | opencli plugin create + opencli plugin install file://... |
| Quickly draft a private local adapter | opencli browser init <site>/ in ~/.opencli/clis/ |
| Modify an official adapter locally | opencli adapter eject <adapter> + opencli adapter reset <adapter> |
| Publish or install third-party commands | opencli plugin install github:user/repo |
| Wrap an existing local binary | opencli external register <binary> |
For AI Agents
OpenCLI’s browser commands are designed to be used by AI Agents — not run manually. Install skills into your AI agent (Claude Code, Cursor, etc.), and the agent operates websites on your behalf using your logged-in Chrome session.
Install skills (also refreshes existing installs)
npx skills add jackwener/opencli
Or install only what you need:
npx skills add jackwener/opencli --skill opencli-adapter-author
npx skills add jackwener/opencli --skill opencli-autofix
npx skills add jackwener/opencli --skill opencli-browser
npx skills add jackwener/opencli --skill opencli-usage
Which skill to use
| Skill | When to use | Example prompt to your AI agent |
|---|---|---|
| opencli-adapter-author | Write a reusable adapter for a new site or add a command to an existing site | “Write an adapter for douyin trending” / “Make a command that grabs the top posts from this page” |
| opencli-autofix | Repair a broken adapter when a built-in command fails | “opencli zhihu hot is returning empty — fix it” |
| opencli-browser | Drive a real Chrome page ad-hoc — navigate, fill forms, click, extract | “Help me check my Xiaohongshu notifications” / “Help me fill out this form” / “Use browser commands to scrape this page” |
| opencli-usage | Quick reference for all OpenCLI commands and sites | “What commands does OpenCLI have for Twitter?” |
How it works
Once opencli-browser is installed, your AI agent can:
- Navigate to any URL using your logged-in browser
- Read page content via structured DOM snapshots (not screenshots)
- Interact — click buttons, fill forms, select options, press keys
- Extract data from the page or intercept network API responses
- Wait for elements, text, or page transitions
The agent handles all the opencli browser commands internally — you just describe what you want done in natural language.
Skill references:
skills/opencli-browser/SKILL.md— drive Chrome ad-hoc (navigate, fill forms, click, extract)skills/opencli-adapter-author/SKILL.md— write a new adapter end-to-endskills/opencli-autofix/SKILL.md— repair broken adaptersskills/opencli-usage/SKILL.md— command and site reference
Available browser commands include open, state, click, type, fill, select, keys, wait, get, find, extract, frames, screenshot, scroll, back, eval, network, tab list, tab new, tab select, tab close, init, verify, and close.
opencli browser commands require a <profile> positional immediately after browser. opencli browser work open <url> and opencli browser work tab new [url] both return a target ID. Use opencli browser work tab list to inspect target IDs, then pass --tab <id> to route a command to a specific tab. tab new creates a new tab without changing the default browser target; only tab select <id> promotes that tab to the default target for later untargeted commands in the same session.
Writing a new adapter
When the site you need is not yet covered, use the opencli-adapter-author skill end-to-end:
- Recon the site and pick a pattern (SPA / SSR / JSONP / Token / Streaming).
- Discover the right endpoint — network inspection, initial state, bundle search, token trace, or interceptor fallback.
- Pick auth —
PUBLIC/COOKIE/INTERCEPT/UI/LOCAL. - Decode response fields and design output columns.
opencli browser recon analyze <url>→opencli browser recon init <site>/→ write adapter →opencli browser recon verify <site>/.- Site knowledge persists to
~/.opencli/sites/<site>/so the next adapter for the same site starts from context.
Prerequisites
- Node.js: >= 21.0.0 (required for the standard npm install path)
- Bun: >= 1.0 (optional alternative runtime)
- Chrome or Chromium running and logged into the target site for browser-backed commands
Important: Browser-backed commands reuse your Chrome/Chromium login session. If you get empty data or permission-like failures, first confirm the site is already open and authenticated in Chrome/Chromium.
Configuration
| Variable | Default | Description |
|---|---|---|
OPENCLI_DAEMON_PORT | 19825 | HTTP port for the daemon-extension bridge |
OPENCLI_PROFILE | — | Browser Bridge profile alias/contextId to use when multiple Chrome profiles are connected |
OPENCLI_WINDOW | command default | Set to foreground or background to override Browser Bridge window placement. Browser-backed commands also accept --window <mode>. |
OPENCLI_BROWSER_CONNECT_TIMEOUT | 30 | Seconds to wait for browser connection |
OPENCLI_BROWSER_COMMAND_TIMEOUT | 60 | Seconds to wait for a single browser command |
OPENCLI_CDP_ENDPOINT | — | Chrome DevTools Protocol endpoint for remote browser or Electron apps |
OPENCLI_CDP_TARGET | — | Filter CDP targets by URL substring (e.g. detail.1688.com) |
OPENCLI_VERBOSE | false | Enable verbose logging (-v flag also works) |
DEBUG_SNAPSHOT | — | Set to 1 for DOM snapshot debug output |
opencli browser * requires an explicit <profile> positional, uses a foreground browser window by default, and keeps that session’s tab lease until opencli browser close or idle cleanup. Browser-backed adapters use a background adapter window and release one-shot tab leases by default. Interactive adapters can declare siteSession: 'persistent' to keep a stable site tab for continuity; pass --site-session ephemeral for a one-shot tab.
Built-in Commands
| Site | Commands |
|---|---|
| xiaohongshu | search note comments feed user download publish notifications creator-notes creator-notes-summary creator-note-detail creator-profile creator-stats |
| bilibili | hot search history feed ranking download comments dynamic favorite following me subtitle summary video user-videos |
| zhihu | hot search question download follow like favorite comment answer |
| hackernews | top new best ask show jobs search user |
connect inbox safe-send search sent-invitations thread-snapshot timeline salesnav-search salesnav-inbox salesnav-message salesnav-thread | |
hot frontpage popular search subreddit read user user-posts user-comments upvote upvoted save saved comment subscribe | |
trending search timeline tweets lists list-tweets list-add list-remove bookmarks post download profile article like likes notifications reply reply-dm thread follow unfollow followers following block unblock bookmark unbookmark delete hide-reply accept | |
| claude | ask send new status read history detail |
| gemini | new ask image deep-research deep-research-result |
| notebooklm | status list open current get history summary note-list notes-get source-list source-get source-fulltext source-guide |
| amazon | bestsellers search product offer discussion movers-shakers new-releases rankings |
Curated highlights — → see all 100+ supported sites & commands (douyin / weibo / spotify / 1688 / quark / nowcoder / google-scholar / hupu / xianyu / weread / weread-official / xiaoyuzhou / and more).
CLI Hub
Unified passthrough for your existing command-line tools. Run opencli <command> ... for any of:
gh · docker · vercel · wrangler · obsidian · longbridge · lark-cli · ntn(notion) · dws(DingTalk Workspace) · wecom-cli(企业微信) · tg(tg-cli) · discord(discord-cli) · wx(wx-cli)
Register your own with opencli external register <binary>; list everything with opencli external list.
Desktop app adapters (Electron, via CDP): Cursor / Codex / Antigravity / ChatGPT App / ChatWise / Discord / Doubao — see docs/adapters/desktop/.
Download Support
OpenCLI supports downloading images, videos, and articles from supported platforms.
| Platform | Content Types | Notes |
|---|---|---|
| xiaohongshu | Images, Videos | Downloads all media from a note |
| rednote | Images, Videos | Downloads all media from a signed rednote note URL |
| bilibili | Videos | Requires yt-dlp installed |
| Images, Videos | From user media tab or single tweet | |
| douban | Images | Poster / still image lists |
| pixiv | Images | Original-quality illustrations, multi-page |
| 1688 | Images, Videos | Downloads page-visible product media from item pages |
| xiaoyuzhou | Audio, Transcript | Downloads episode audio and transcript JSON/text with local credentials |
| zhihu | Articles (Markdown) | Exports with optional image download |
| weixin | Articles (Markdown) | WeChat Official Account articles |
For video downloads, install yt-dlp first: brew install yt-dlp
opencli xiaohongshu download "https://www.xiaohongshu.com/search_result/?xsec_token=..." --output ./xhs
opencli xiaohongshu download "https://xhslink.com/..." --output ./xhs
opencli rednote download "https://www.rednote.com/search_result/?xsec_token=..." --output ./rednote
opencli bilibili download BV1xxx --output ./bilibili
opencli twitter download elonmusk --limit 20 --output ./twitter
opencli 1688 download 841141931191 --output ./1688-downloads
opencli xiaoyuzhou download 69b3b675772ac2295bfc01d0 --output ./xiaoyuzhou
opencli xiaoyuzhou transcript 69dd0c98e2c8be31551f6a33 --output ./xiaoyuzhou-transcripts
opencli xiaoyuzhou download and transcript require local Xiaoyuzhou credentials in ~/.opencli/xiaoyuzhou.json.
Output Formats
All built-in commands support --format / -f with table (default), json, yaml, md, and csv.
opencli bilibili hot -f json
opencli hackernews top --format yaml
opencli reddit frontpage -f csv
Contributing
See CONTRIBUTING.md.
License
Similar Articles
@jakevin7: OpenCLI hit 20k stars, https://github.com/jackwener/OpenCLI… But now I don't feel much about the star count anymore. The happiest moments are still when communicating with users about usage & sharing insights. Will continue to improve later, making …
The OpenCLI project surpassed 20,000 stars on GitHub. This tool converts websites, browser sessions, and Electron applications into deterministic CLI interfaces, supporting AI Agents in directly operating browsers and automating workflows.
@jakevin7: OpenCLI now has a killer feature—turn any webpage into Markdown with one command: opencli web read --url <any-url>
OpenCLI adds a one-command feature that converts any webpage to Markdown via DOM heuristics and Turndown.
@jakevin7: OpenCLI One-Click Unlocks Agent's Twitter World! Tweet, reply, quote retweet, monitor notifications, keyword-triggered auto DM reply, grab entire thread for summary, batch organize lists, regularly back up someone's images and videos — any client operation you can think of, Agent…
OpenCLI is a tool that allows AI Agents to seamlessly operate Twitter by reusing the browser login state (no X API key needed), supporting 36 client commands like tweeting, replying, monitoring, fetching summaries, etc., turning Twitter into a full-fledged living environment for Agents.
@jakevin7: Today, OpenCLI's app was completely rebuilt with a component library, which took a lot of effort. I'm also planning to have the Agent fully refactor the UI of MakeAgent—this might be the final version of the UI. https://github.com/jackwener/maka-…
Maka is a local-first desktop AI workbench built with Electron, supporting multi-model connections, tool calls, permission control, and privacy protection, along with integrated bot access, local memory, and more. The author also mentioned rebuilding the OpenCLI app and planning to refactor MakeAgent's UI.
@jakevin7: OpenCLI can now read WeChat, Telegram, and Discord content!!! wx-cli, tg-cli, and discord-cli are all integrated. Group messages, chat logs, Moments, and favorites can all be directly accessed via CLI. Previously, Agen…
OpenCLI has been updated to support reading content from WeChat, Telegram, and Discord via wx-cli, tg-cli, and discord-cli. This tool allows AI agents and developers to automate browser operations through a command-line interface and aggregate personal message feeds.