@jakevin7: OpenCLI v1.8.0 finally released


Summary

OpenCLI v1.8.0 released, enhancing browser agent runtime with more reliable browser control via accessibility tree and CDP native input, adding support for WeRead, 12306, Xianyu, and other sites, improving stability and security.


Cached at: 05/19/26, 10:52 PM

OpenCLI v1.8.0 is finally out. Stayed up until 4 AM, the journey from 1.7 to 1.8 is now complete. Looking back, I’ve done way too much during this period!

Browser Agent Runtime

The browser capabilities have been fully fleshed out in one go. Essentially, this moves “agent controlling the browser” away from blind guessing with concatenated selector strings and onto an agent-native path of “accessibility tree + semantic targeting + CDP native input.” This release adds CDP input, AX snapshots, semantic targeting (--role / --name / --label / --testid), and hover / focus / dblclick / check / upload / drag / wait download / annotated screenshot. opencli browser <session> click 5 now points directly at an element by ref instead of relying on selector guessing. Custom dropdowns built with Radix, shadcn, or Material UI that previously couldn’t be clicked all work now.
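
To make the idea of semantic targeting concrete, here is a minimal Python sketch of resolving a role/name pair to a numeric ref in an accessibility-tree snapshot. The `AXNode` shape, the `find_ref` helper, and the sample tree are all hypothetical illustrations; this is not OpenCLI's implementation, only one way such a lookup could work.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class AXNode:
    """One node of a simplified accessibility-tree snapshot (hypothetical shape)."""
    ref: int          # numeric reference an agent could pass to a click command
    role: str         # ARIA role, e.g. "button", "combobox", "option"
    name: str = ""    # accessible name
    children: list["AXNode"] = field(default_factory=list)


def walk(node: AXNode) -> Iterator[AXNode]:
    """Yield the node and all of its descendants in document order."""
    yield node
    for child in node.children:
        yield from walk(child)


def find_ref(root: AXNode, role: Optional[str] = None, name: Optional[str] = None) -> int:
    """Resolve a semantic target (role and/or name) to exactly one ref."""
    matches = [
        n for n in walk(root)
        if (role is None or n.role == role) and (name is None or n.name == name)
    ]
    if len(matches) != 1:
        # Refuse to guess: ambiguity or absence is reported to the caller.
        raise LookupError(
            f"expected 1 match for role={role!r} name={name!r}, got {len(matches)}"
        )
    return matches[0].ref


# Hypothetical snapshot of a page with a custom dropdown and a submit button.
snapshot = AXNode(1, "main", children=[
    AXNode(2, "combobox", "Country"),
    AXNode(3, "listbox", children=[
        AXNode(4, "option", "Canada"),
        AXNode(5, "option", "Japan"),
    ]),
    AXNode(6, "button", "Submit"),
])
```

In this sketch an ambiguous query (for example, role "option" with no name) raises instead of picking the first match, which is the opposite of guessing with a selector string.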

New Sites / New Scenarios

  • weread-official — Access to WeRead Official Agent Gateway, pure HTTP + Bearer key, coexists with the original cookie-based weread
  • 12306 — Full reading: train number / price / my orders / frequent passengers
  • xianyu — Xianyu inbox / messages / reply, conversations are now reachable
  • suno — Music generation
  • linkedin — Sales Navigator + people-search + internal messages / safe-send / thread-snapshot fully integrated
  • linkedin-learning / rednote / booking / ctrip hotels + flights / DuckDuckGo / Brave / Yahoo search

Twitter Continued Polishing

  • list-create / device-follow / quoted_tweet / card binding_values / bio / UserMedia cursor pagination all completed
  • Write operations are now symmetric: unlike / retweet / unretweet / quote
  • bookmarks / bookmark-folder / list-tweets now all carry media

Reddit / Zhihu Deeper Reading

  • reddit: subscribed / whoami / home / subreddit-info / --expand-more to expand comment threads / listing exposes post_hint+url+preview+gallery
  • zhihu: answer-detail / answer-comments / answer pagination (very important)

Stability & Security

  • Download path traversal fix: remote fields (like video titles) can no longer use ../ to escape the output directory
  • Page.goto stale-identity self-healing: CDP -32000 automatically retries
  • undici 8.x broke Node 20 entirely, rolled back to 6.x to salvage all Node 20 users
  • YouTube subtitle cross-video pollution fixed (when the SPA switches to another video, the previous video’s subtitles are no longer fetched for it)
  • ChatGPT web image generation restored (now recognizes CSS background + canvas, not just <img>)
  • typed-error cleanup: Douyin / Jike / WeRead / Apple Podcasts / Reddit / Gitee / lesswrong / xhs / YouTube — all silent-sentinel and silent-empty replaced with EmptyResultError / AuthRequiredError. Agents no longer have to guess if an empty array means truly empty or a site redesign
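
As an illustration of the download path traversal fix in the list above, the following Python sketch shows one common way to keep an untrusted remote field inside an output directory. The function name `safe_output_path` and the "untitled" fallback are assumptions made for this example; the release notes do not describe how OpenCLI implements the fix.

```python
from pathlib import Path


def safe_output_path(output_dir: str, remote_name: str) -> Path:
    """Map an untrusted remote field (e.g. a video title) to a path inside output_dir."""
    base = Path(output_dir).resolve()
    # Keep only the final path component so "../" segments and separators are dropped.
    candidate = Path(remote_name.replace("\\", "/")).name
    if candidate in ("", ".", ".."):
        candidate = "untitled"  # hypothetical fallback name
    target = (base / candidate).resolve()
    # Defence in depth: reject anything that still resolves outside the base directory.
    if not target.is_relative_to(base):
        raise ValueError(f"refusing to write outside {base}: {remote_name!r}")
    return target
```

With this approach a title such as "../../etc/passwd" collapses to a file named "passwd" inside the output directory rather than escaping it.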

Trace & Observation

  • Adapter failures now come with a trace artifact, summary.md is the diagnostic entry point
  • browser console / browser network --failed / --follow — agents can finally see what’s happening inside the browser

README Cut Down by 20%

  • Chinese: 455 → 371, English: 410 → 326
  • Built-in Commands shrank to 11 commonly used sites, CLI Hub cut to name enumeration. No longer overwhelming to look at.

Goal unchanged: the read & action infrastructure for AI agents.


jackwener/OpenCLI

Source: https://github.com/jackwener/OpenCLI

OpenCLI

Convert any website into a CLI & run Browser Use on your logged-in Chrome. Turn websites, browser sessions, Electron apps, and local tools into deterministic interfaces for humans and AI agents. Or run Browser Use against any page — navigate, fill forms, click, extract, automate.

Chinese documentation (中文文档)

npm (https://www.npmjs.com/package/@jackwener/opencli) Node.js Version (https://nodejs.org) License

OpenCLI gives you one surface for three different kinds of automation:

  • Use built-in adapters for sites like Bilibili, Zhihu, Xiaohongshu, Reddit, HackerNews, Twitter/X, and many more.
  • Let AI Agents operate any website — install the opencli-browser skill in your AI agent (Claude Code, Cursor, etc.), and it can navigate, click, type/fill, extract, and inspect any page through your logged-in browser via opencli browser primitives.
  • Write new adapters end-to-end with opencli browser + the opencli-adapter-author skill, which guides from first recon through field decoding, code, and opencli browser verify.

It also works as a CLI hub for local tools such as gh, docker, longbridge, tg, discord, wx, ntn (Notion), and other binaries you register yourself, plus desktop app adapters for Electron apps like Cursor, Codex, Antigravity, and ChatGPT.

Quick Start

1. Install OpenCLI

OpenCLI requires Node.js >= 21.

node --version
npm install -g @jackwener/opencli

2. Install the Browser Bridge Extension

OpenCLI connects to Chrome/Chromium through a lightweight Browser Bridge extension plus a small local daemon. The daemon auto-starts when needed.

Option A — Chrome Web Store (recommended): Install OpenCLI from the Chrome Web Store (https://chromewebstore.google.com/detail/opencli/ildkmabpimmkaediidaifkhjpohdnifk).

Option B — Manual install:

  1. Download the latest opencli-extension-v{version}.zip from the GitHub Releases page (https://github.com/jackwener/opencli/releases).
  2. Unzip it, open chrome://extensions, and enable Developer mode.
  3. Click Load unpacked and select the unzipped folder.

3. Verify the setup

opencli doctor

4. Optional: name your Chrome profile

Each Chrome profile runs its own OpenCLI extension instance. If you use multiple Chrome profiles, list the connected profiles and assign local aliases:

opencli profile list
opencli profile rename work
opencli profile use work
opencli --profile work browser state

With only one connected profile, OpenCLI uses it automatically. With multiple connected profiles and no default, OpenCLI asks you to choose instead of guessing.
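
The selection rule above can be sketched as a small Python function. The name `choose_profile` and the exception types are hypothetical, and the behaviour for cases the text does not cover (such as a default that is not currently connected) is an assumption.

```python
from typing import Optional, Sequence


def choose_profile(connected: Sequence[str], default: Optional[str] = None) -> str:
    """Pick the profile to use, following the rule described in the text."""
    if not connected:
        raise RuntimeError("no Chrome profile is connected")
    if len(connected) == 1:
        return connected[0]  # single profile: use it automatically
    if default is not None and default in connected:
        return default       # a configured default settles the choice
    # Several profiles and no usable default: ask rather than guess.
    raise LookupError("multiple profiles connected; choose one: " + ", ".join(connected))
```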

5. Run your first commands

opencli list
opencli hackernews top --limit 5
opencli bilibili hot --limit 5

For Humans

Use OpenCLI directly when you want a reliable command instead of a live browser session:

  • opencli list shows every registered command.
  • opencli <command> runs a built-in or generated adapter.
  • opencli external register mycli exposes a local CLI through the same discovery surface.
  • opencli doctor helps diagnose browser connectivity.

Extending OpenCLI

If you want to add your own commands, start with the Extending OpenCLI guide. README keeps this short; the guide covers the directory layout, source-control model, and install commands.

Need | Recommended path
Keep personal website commands in your own Git repo | opencli plugin create + opencli plugin install file://...
Quickly draft a private local adapter | opencli browser init <site>/ in ~/.opencli/clis/
Modify an official adapter locally | opencli adapter eject <adapter> + opencli adapter reset <adapter>
Publish or install third-party commands | opencli plugin install github:user/repo
Wrap an existing local binary | opencli external register <binary>

For AI Agents

OpenCLI’s browser commands are designed to be used by AI Agents — not run manually. Install skills into your AI agent (Claude Code, Cursor, etc.), and the agent operates websites on your behalf using your logged-in Chrome session.

Install skills (also refreshes existing installs)

npx skills add jackwener/opencli

Or install only what you need:

npx skills add jackwener/opencli --skill opencli-adapter-author
npx skills add jackwener/opencli --skill opencli-autofix
npx skills add jackwener/opencli --skill opencli-browser
npx skills add jackwener/opencli --skill opencli-usage

Which skill to use

Skill | When to use | Example prompt to your AI agent
opencli-adapter-author | Write a reusable adapter for a new site or add a command to an existing site | “Write an adapter for douyin trending” / “Make a command that grabs the top posts from this page”
opencli-autofix | Repair a broken adapter when a built-in command fails | “opencli zhihu hot is returning empty — fix it”
opencli-browser | Drive a real Chrome page ad-hoc — navigate, fill forms, click, extract | “Help me check my Xiaohongshu notifications” / “Help me fill out this form” / “Use browser commands to scrape this page”
opencli-usage | Quick reference for all OpenCLI commands and sites | “What commands does OpenCLI have for Twitter?”

How it works

Once opencli-browser is installed, your AI agent can:

  1. Navigate to any URL using your logged-in browser
  2. Read page content via structured DOM snapshots (not screenshots)
  3. Interact — click buttons, fill forms, select options, press keys
  4. Extract data from the page or intercept network API responses
  5. Wait for elements, text, or page transitions

The agent handles all the opencli browser commands internally — you just describe what you want done in natural language.

Available browser commands include open, state, click, type, fill, select, keys, wait, get, find, extract, frames, screenshot, scroll, back, eval, network, tab list, tab new, tab select, tab close, init, verify, and close.

opencli browser commands require a <profile> positional immediately after browser. opencli browser work open <url> and opencli browser work tab new [url] both return a target ID. Use opencli browser work tab list to inspect target IDs, then pass --tab <id> to route a command to a specific tab. tab new creates a new tab without changing the default browser target; only tab select <id> promotes that tab to the default target for later untargeted commands in the same session.
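
The routing rules in the paragraph above are easy to misread, so here is a toy Python model of them. `BrowserSessionModel`, the "t1"-style target IDs, and the assumption that open navigates the current default target are all hypothetical; only the tab new / tab select / --tab semantics come from the text.

```python
from itertools import count
from typing import Optional


class BrowserSessionModel:
    """Toy model of default-target routing; not OpenCLI's implementation."""

    def __init__(self) -> None:
        self._ids = count(1)
        self.tabs: dict[str, str] = {}      # target ID -> URL
        self.default: Optional[str] = None  # target used by untargeted commands

    def _new_id(self) -> str:
        return f"t{next(self._ids)}"        # hypothetical ID format

    def open(self, url: str) -> str:
        """Like `open <url>`: assumed to navigate the default target, creating it if needed."""
        if self.default is None:
            self.default = self._new_id()
        self.tabs[self.default] = url
        return self.default

    def tab_new(self, url: str = "about:blank") -> str:
        """Like `tab new [url]`: create a tab without changing the default target."""
        target_id = self._new_id()
        self.tabs[target_id] = url
        return target_id

    def tab_select(self, target_id: str) -> None:
        """Like `tab select <id>`: promote a tab to the default target."""
        if target_id not in self.tabs:
            raise KeyError(target_id)
        self.default = target_id

    def resolve(self, tab: Optional[str] = None) -> str:
        """Return the target a command would run against (`--tab <id>` wins)."""
        target_id = tab if tab is not None else self.default
        if target_id is None or target_id not in self.tabs:
            raise KeyError(f"unknown target: {target_id!r}")
        return target_id
```

In this model a command issued right after tab_new still resolves to the original tab unless it passes the new ID explicitly or tab_select has been called first.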

Writing a new adapter

When the site you need is not yet covered, use the opencli-adapter-author skill end-to-end:

  1. Recon the site and pick a pattern (SPA / SSR / JSONP / Token / Streaming).
  2. Discover the right endpoint — network inspection, initial state, bundle search, token trace, or interceptor fallback.
  3. Pick auth: PUBLIC / COOKIE / INTERCEPT / UI / LOCAL.
  4. Decode response fields and design output columns.
  5. opencli browser recon analyze <url> → opencli browser recon init <site>/ → write adapter → opencli browser recon verify <site>/.
  6. Site knowledge persists to ~/.opencli/sites/<site>/ so the next adapter for the same site starts from context.

Prerequisites

  • Node.js: >= 21.0.0 (required for the standard npm install path)
  • Bun: >= 1.0 (optional alternative runtime)
  • Chrome or Chromium running and logged into the target site for browser-backed commands

Important: Browser-backed commands reuse your Chrome/Chromium login session. If you get empty data or permission-like failures, first confirm the site is already open and authenticated in Chrome/Chromium.

Configuration

Variable | Default | Description
OPENCLI_DAEMON_PORT | 19825 | HTTP port for the daemon-extension bridge
OPENCLI_PROFILE | - | Browser Bridge profile alias/contextId to use when multiple Chrome profiles are connected
OPENCLI_WINDOW | command default | Set to foreground or background to override Browser Bridge window placement. Browser-backed commands also accept --window <mode>.
OPENCLI_BROWSER_CONNECT_TIMEOUT | 30 | Seconds to wait for browser connection
OPENCLI_BROWSER_COMMAND_TIMEOUT | 60 | Seconds to wait for a single browser command
OPENCLI_CDP_ENDPOINT | - | Chrome DevTools Protocol endpoint for remote browser or Electron apps
OPENCLI_CDP_TARGET | - | Filter CDP targets by URL substring (e.g. detail.1688.com)
OPENCLI_VERBOSE | false | Enable verbose logging (-v flag also works)
DEBUG_SNAPSHOT | - | Set to 1 for DOM snapshot debug output
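
For scripts that wrap OpenCLI, it can help to resolve the same variables with the same defaults. The Python sketch below reads a subset of the variables from the table above; the `OpenCLIConfig` and `load_config` names are hypothetical, and the set of strings treated as "true" for OPENCLI_VERBOSE is an assumption, since the table only lists the default.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class OpenCLIConfig:
    daemon_port: int
    profile: Optional[str]
    connect_timeout_s: int
    command_timeout_s: int
    cdp_endpoint: Optional[str]
    verbose: bool


def load_config(env: Mapping[str, str] = os.environ) -> OpenCLIConfig:
    """Resolve settings from the environment, falling back to the documented defaults."""
    return OpenCLIConfig(
        daemon_port=int(env.get("OPENCLI_DAEMON_PORT", "19825")),
        profile=env.get("OPENCLI_PROFILE") or None,
        connect_timeout_s=int(env.get("OPENCLI_BROWSER_CONNECT_TIMEOUT", "30")),
        command_timeout_s=int(env.get("OPENCLI_BROWSER_COMMAND_TIMEOUT", "60")),
        cdp_endpoint=env.get("OPENCLI_CDP_ENDPOINT") or None,
        # Assumption: these spellings count as enabled; the source only gives the default.
        verbose=env.get("OPENCLI_VERBOSE", "false").strip().lower() in ("1", "true", "yes"),
    )
```

Passing a plain dict instead of os.environ keeps the function easy to test without touching the real environment.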

opencli browser * requires an explicit <profile> positional, uses a foreground browser window by default, and keeps that session’s tab lease until opencli browser close or idle cleanup. Browser-backed adapters use a background adapter window and release one-shot tab leases by default. Interactive adapters can declare siteSession: 'persistent' to keep a stable site tab for continuity; pass --site-session ephemeral for a one-shot tab.

Built-in Commands

Site | Commands
xiaohongshu | search note comments feed user download publish notifications creator-notes creator-notes-summary creator-note-detail creator-profile creator-stats
bilibili | hot search history feed ranking download comments dynamic favorite following me subtitle summary video user-videos
zhihu | hot search question download follow like favorite comment answer
hackernews | top new best ask show jobs search user
linkedin | connect inbox safe-send search sent-invitations thread-snapshot timeline salesnav-search salesnav-inbox salesnav-message salesnav-thread
reddit | hot frontpage popular search subreddit read user user-posts user-comments upvote upvoted save saved comment subscribe
twitter | trending search timeline tweets lists list-tweets list-add list-remove bookmarks post download profile article like likes notifications reply reply-dm thread follow unfollow followers following block unblock bookmark unbookmark delete hide-reply accept
claude | ask send new status read history detail
gemini | new ask image deep-research deep-research-result
notebooklm | status list open current get history summary note-list notes-get source-list source-get source-fulltext source-guide
amazon | bestsellers search product offer discussion movers-shakers new-releases rankings

Curated highlights → see all 100+ supported sites & commands (douyin / weibo / spotify / 1688 / quark / nowcoder / google-scholar / hupu / xianyu / weread / weread-official / xiaoyuzhou / and more).

CLI Hub

Unified passthrough for your existing command-line tools. Run opencli <command> ... for any of: gh · docker · vercel · wrangler · obsidian · longbridge · lark-cli · ntn (Notion) · dws (DingTalk Workspace) · wecom-cli (WeCom, 企业微信) · tg (tg-cli) · discord (discord-cli) · wx (wx-cli)

Register your own with opencli external register <binary>; list everything with opencli external list.

Desktop app adapters (Electron, via CDP): Cursor / Codex / Antigravity / ChatGPT App / ChatWise / Discord / Doubao — see docs/adapters/desktop/.

Download Support

OpenCLI supports downloading images, videos, and articles from supported platforms.

Platform | Content Types | Notes
xiaohongshu | Images, Videos | Downloads all media from a note
rednote | Images, Videos | Downloads all media from a signed rednote note URL
bilibili | Videos | Requires yt-dlp installed
twitter | Images, Videos | From user media tab or single tweet
douban | Images | Poster / still image lists
pixiv | Images | Original-quality illustrations, multi-page
1688 | Images, Videos | Downloads page-visible product media from item pages
xiaoyuzhou | Audio, Transcript | Downloads episode audio and transcript JSON/text with local credentials
zhihu | Articles (Markdown) | Exports with optional image download
weixin | Articles (Markdown) | WeChat Official Account articles

For video downloads, install yt-dlp first: brew install yt-dlp

opencli xiaohongshu download "https://www.xiaohongshu.com/search_result/?xsec_token=..." --output ./xhs
opencli xiaohongshu download "https://xhslink.com/..." --output ./xhs
opencli rednote download "https://www.rednote.com/search_result/?xsec_token=..." --output ./rednote
opencli bilibili download BV1xxx --output ./bilibili
opencli twitter download elonmusk --limit 20 --output ./twitter
opencli 1688 download 841141931191 --output ./1688-downloads
opencli xiaoyuzhou download 69b3b675772ac2295bfc01d0 --output ./xiaoyuzhou
opencli xiaoyuzhou transcript 69dd0c98e2c8be31551f6a33 --output ./xiaoyuzhou-transcripts

opencli xiaoyuzhou download and transcript require local Xiaoyuzhou credentials in ~/.opencli/xiaoyuzhou.json.

Output Formats

All built-in commands support --format / -f with table (default), json, yaml, md, and csv.

opencli bilibili hot -f json
opencli hackernews top --format yaml
opencli reddit frontpage -f csv
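
When a script rather than a person consumes the output, the json format is the natural choice. Below is a minimal Python sketch of building and running such a command; `build_command` and `run_json` are hypothetical helper names, and the sketch assumes that stdout contains only the JSON document when --format json is used.

```python
import json
import subprocess
from typing import Any

# Formats listed in the README for --format / -f.
FORMATS = ("table", "json", "yaml", "md", "csv")


def build_command(site: str, command: str, *args: str, fmt: str = "json") -> list[str]:
    """Build the argv for an OpenCLI adapter call with an explicit output format."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format {fmt!r}; expected one of {FORMATS}")
    return ["opencli", site, command, *args, "--format", fmt]


def run_json(site: str, command: str, *args: str) -> Any:
    """Run the command and parse its stdout as JSON (requires opencli on PATH)."""
    result = subprocess.run(
        build_command(site, command, *args, fmt="json"),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

For example, run_json("hackernews", "top", "--limit", "5") would execute the same call as the Quick Start example with JSON output; the shape of the parsed value depends on the adapter.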

Contributing

See CONTRIBUTING.md.

License

MIT
