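@dotey: The baoyu-design skill can generate animated videos locally and export them as MP4, like the video attached here. The Claude Design web app can create animated videos from a prompt, but you can only watch them on the web page and cannot download them. Some people have worked around this with third-party plugins and screen recorders,…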
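Summary
baoyu-design is an open-source project that packages Claude Design as a local Agent Skill. It can generate animated videos in environments such as Cursor and Claude Code and export them directly to MP4, rendering frame by frame in a headless browser for precise, high-quality output.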
Cached: 2026/06/17 01:45
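The baoyu-design skill can generate animated videos locally and export them as MP4, like the video attached here.
The Claude Design web app can create animated videos from a prompt, but you can only watch them on the web page and cannot download them. Some people have worked around this with third-party plugins and screen recorders, but it is a hassle and the results are not great.
On top of this Skill, I have implemented the ability to export an animation directly to an MP4 video.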
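Generating the animation and exporting the video work on much the same principle: every frame is computed from its position on the timeline, like a function f(t). Pass the animation engine any time t and it directly computes the position, opacity, and size of every element on screen at that instant; all visual state is fully determined by t. There is no need to play from the beginning up to that moment, and no need to remember what happened before.
The traditional way of writing animation is imperative: at a given moment, go change some element's position, tweak its opacity. State is scattered everywhere, and once the timing goes wrong the picture goes wrong. This engine does the reverse and takes a declarative approach: you don't drive the elements' motion, you only describe what each element should look like at second t.
As an analogy, traditional animation is like watching a film: you have to watch from the start to know what is on screen at minute 30. This engine is more like a special kind of book: turn to any page and the picture is complete and deterministic.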
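This design gives you three capabilities:
- Dragging the scrubber to any position works, because f(t) can be computed at any time;
- Debugging the same frame repeatedly works, because the same t always yields the same frame;
- Exporting the animation to video works too, and in a rather clever way.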
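The f(t) idea can be illustrated with a small sketch. This is not the engine's actual code: the `ElementState` type, the `frame_state` function, and the fade-and-slide timeline are all hypothetical, chosen only to show a frame being computed purely from t.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementState:
    """Complete visual state of one element at a single instant."""
    x: float
    opacity: float
    scale: float


def clamp01(value: float) -> float:
    """Clamp a progress value into the range [0, 1]."""
    return max(0.0, min(1.0, value))


def frame_state(t: float) -> ElementState:
    """Return the element's state at time t (seconds), using nothing but t.

    Hypothetical timeline: fade in over the first second while sliding
    200 px to the right over the first two seconds.
    """
    fade = clamp01(t / 1.0)
    slide = clamp01(t / 2.0)
    return ElementState(x=200.0 * slide, opacity=fade, scale=1.0)


if __name__ == "__main__":
    # Jumping straight to any time works; no earlier frames are needed.
    print(frame_state(0.5))
    print(frame_state(5.0))
```

Because `frame_state` keeps no state between calls, scrubbing, repeated debugging of one frame, and offline export all reduce to calling it with the t of interest.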
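So how is video export implemented?
Intuitively, to turn an animation in the browser into an MP4, why not just record the screen?
Screen recording is real-time. If the machine stutters even slightly, frames are dropped, and the resulting video is not reproducible. The "player chrome" (the scrubber, the black background, the rounded corners and shadows) also gets recorded along with it.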
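The approach baoyu-design uses is more like shooting stop-motion animation: launch a headless browser (Chromium with no UI), load the animation page, and control the timeline precisely through a control interface the engine exposes for this purpose. For each time point, it waits for the browser to finish rendering, takes a screenshot, and feeds it straight to ffmpeg for encoding through a pipe. A 95-second animation at 30 fps is 2,850 iterations of "set the time, take the shot". It is slow, but every frame is exact and none are ever dropped.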
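The loop described above can be sketched roughly as follows. This is an illustration under assumptions, not the project's exporter: `render_png` stands in for "seek the page to t and screenshot it", and the ffmpeg arguments (PNG frames on stdin via `image2pipe`, H.264 output) are one plausible choice rather than the flags the tool actually passes.

```python
import subprocess
from typing import Callable, Iterator


def frame_times(duration_s: float, fps: int) -> Iterator[float]:
    """Yield the timestamp of every frame: 0, 1/fps, 2/fps, ..."""
    total_frames = round(duration_s * fps)
    for index in range(total_frames):
        yield index / fps


def ffmpeg_command(fps: int, out_path: str) -> list[str]:
    """Build an ffmpeg invocation that reads PNG frames from stdin."""
    return [
        "ffmpeg", "-y",
        "-f", "image2pipe", "-framerate", str(fps), "-c:v", "png", "-i", "-",
        "-c:v", "libx264", "-pix_fmt", "yuv420p",
        out_path,
    ]


def export(render_png: Callable[[float], bytes],
           duration_s: float, fps: int, out_path: str) -> int:
    """Render each frame in order, pipe it to ffmpeg, return the frame count.

    render_png is a hypothetical callback that returns one PNG for time t.
    """
    proc = subprocess.Popen(ffmpeg_command(fps, out_path), stdin=subprocess.PIPE)
    assert proc.stdin is not None
    count = 0
    for t in frame_times(duration_s, fps):
        proc.stdin.write(render_png(t))
        count += 1
    proc.stdin.close()
    proc.wait()
    return count


if __name__ == "__main__":
    # 95 seconds at 30 fps gives the 2,850 frames mentioned in the text.
    print(len(list(frame_times(95, 30))))
```

Since timestamps come from the frame index rather than a wall clock, a slow machine only makes the export take longer; it cannot change which frames end up in the file.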
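There is an easily overlooked detail here: after setting the time, the tool waits for two requestAnimationFrame callbacks before taking the screenshot. Changing the time only updates React state, and the browser still needs one or two frames to actually paint the new picture to the screen. Wait too little, and the capture may show a leftover of the previous frame.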
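One way to express the two-frame wait from a Python driver is sketched below. It assumes a Playwright-style `page` object whose `evaluate(expression, arg)` calls a function expression with the argument and awaits the promise it returns. The hook name `window.__seek` is hypothetical; the source does not name the engine's real control interface.

```python
from typing import Any

# JavaScript run inside the page. `window.__seek` is a HYPOTHETICAL name for
# the engine's time-control hook. The promise resolves only after two
# animation frames, so the state update has had a chance to be painted.
SEEK_AND_SETTLE_JS = """
(t) => new Promise((resolve) => {
  window.__seek(t);
  requestAnimationFrame(() => requestAnimationFrame(() => resolve(t)));
})
""".strip()


def seek_and_settle(page: Any, t: float) -> Any:
    """Set the timeline to t and return once two animation frames have passed.

    `page` is assumed to behave like a Playwright sync-API Page.
    """
    return page.evaluate(SEEK_AND_SETTLE_JS, t)
```

A screenshot taken right after `seek_and_settle` returns should then show the frame for t rather than the previous one.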
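To make the picture sharper, screenshots are rendered at a 2x device pixel ratio, so frames actually come out at 3840×2160, and ffmpeg scales them back down to 1080p at the end. The principle is the same as high-resolution printing: draw finely on a larger canvas first, then downscale with a high-quality filter, and text edges and thin lines come out noticeably crisper.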
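The supersampling arithmetic is simple enough to write down. The helpers below are illustrative only. The `scale` filter and its `flags=lanczos` option are real ffmpeg features, but choosing Lanczos is an assumption here; the source only says ffmpeg scales the frames back to 1080p.

```python
def capture_size(css_width: int, css_height: int,
                 device_scale_factor: int) -> tuple[int, int]:
    """Pixel size of a screenshot taken at the given device pixel ratio."""
    return css_width * device_scale_factor, css_height * device_scale_factor


def downscale_filter(width: int, height: int) -> str:
    """ffmpeg -vf argument that resizes frames to the delivery size.

    Lanczos resampling is an assumed choice, not confirmed by the source.
    """
    return f"scale={width}:{height}:flags=lanczos"


if __name__ == "__main__":
    print(capture_size(1920, 1080, 2))
    print(downscale_filter(1920, 1080))
```

Each output pixel is then derived from a 2×2 block of rendered pixels, which is where the cleaner text edges come from.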
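baoyu-design is open source on GitHub (MIT license) and currently has 1.2K stars. If you are interested, take a look at its skills/baoyu-design directory, where the full implementation of the animation engine lives.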
https://github.com/jimliu/baoyu-design…
jimliu/baoyu-design
Source: https://github.com/jimliu/baoyu-design
baoyu-design
Run Claude Design on your own local agent — Cursor, Claude Code, Claude Desktop, or any file‑capable coding agent.
baoyu-design packages Claude Design — the design engine behind claude.ai/design — as a portable Agent Skill. Drop it into a local agent and you get most of what the website does, right inside your editor: polished UI mockups, interactive prototypes, wireframes, landing pages, dashboards, mobile apps, and slide decks — all produced as self‑contained HTML.
No website, no separate subscription, no upload step. The agent already on your machine does the work, and every artifact stays in your repo.
Screenshots
The same Reader Mac App prompt was used in Cursor, Codex, Claude, and Claude Design.
Prompt used for all screenshots
Build a Reader Mac app that helps me read and save articles better. All data is stored locally.
## Information collection
1. Manual adding
Support manually adding different types of information:
- URL: enter a URL and automatically fetch content and images
- Attachments: upload PDFs, videos, and images
- Markdown editing: like publishing a blog post, enter the title, body, and cover image
- Other
2. Automatic subscriptions
- RSS feeds
- Social media accounts: X, Weibo, YouTube
- Other
## Editing and organization
1. Tags
Every item can have tags.
2. Categories and folders
Create tree-structured folders and place content in different categories.
3. Favorites
Users can click to favorite an item.
4. Editing
Every item can be edited with a built-in Markdown editor.
## AI assistance
1. Automatic translation
Support translation across different languages.
2. Summaries and abstracts
Generate summaries for captured content.
3. Derivative creation
Create new work based on one or more pieces of content.
4. Integrated AI Chat
Use AI Chat to call AI Agents that help process content.
Why run it locally
- Free yourself from the website. You get the vast majority of claude.ai/design’s capabilities without ever leaving your editor: same methodology, same craft standards, same output format.
- Best with Opus 4.8. The skill is a long, demanding design brief; the stronger the model, the better the result. Pair it with Claude Opus 4.8 for the best output, and it still works well on other capable models.
- Iterate by pointing, not describing. Because the deliverable is plain HTML served on localhost, you can lean on your agent’s built‑in browser preview and element‑annotation tools (Cursor Browser / DevTools, Claude Preview, or Codex Browser). Point at a button in the live preview, say what you want changed, and the agent edits the underlying source. The result is a tight, visual second‑pass editing loop that’s hard to get on a website.
- Everything is yours. Output lands in designs/<project>/ as self‑contained HTML you can version, fork, export, or ship.
What it can make
The skill drives a full design process — clarifying questions → gathering design context → producing one or more HTML deliverables → previewing and verifying. It ships a deep bench of built‑in skills and a set of ready‑made component scaffolds.
| Area | Built‑in skills |
|---|---|
| Core design | Hi‑fi design · Interactive prototype · Wireframe · Frontend aesthetic direction |
| Decks | Make a deck · Speaker notes |
| Mobile & motion | Mobile prototype · Animated video · Sound effects |
| Design systems | Create design system · Use design system · Design system preview · Design Components (.dc.html) · Make tweakable |
| Import sources | Figma .fig (offline decode) · GitHub repo · Existing HTML/CSS |
| Export & handoff | Standalone HTML · PDF · PPTX (editable) · PPTX (screenshots) · Video (MP4) · Send to Figma · Send to Canva · Handoff to Claude Code |
| AI assets & integration | Gemini image generation · Call Claude from prototypes · Read PDF |
Starter components (in starter-components/) save the agent from hand‑rolling the basics: iOS / Android / macOS / browser frames, a pan‑zoom design canvas, a slide‑deck stage, a timeline animation engine, a tweaks panel, and a fillable image slot.
How it works
The skill is plain Markdown plus a few JSX/JS scaffolds — no build step, no runtime.
skills/baoyu-design/
├── SKILL.md # Entry point — orchestrates the whole flow
├── system-prompt.md # The design methodology & craft standards (source of truth)
├── references/
│ ├── claude.md # Tool map for Claude Code
│ ├── cursor.md # Tool map for Cursor
│ └── codex.md # Tool map for Codex Agent
├── built-in-skills/ # Specialized prompts (decks, mobile, import, export, …)
└── starter-components/ # Device frames, deck stage, canvas, animation engine, …
When you ask for a design, the agent reads SKILL.md, loads the core methodology from system-prompt.md, detects whether it’s running in Cursor, Claude Code, Codex Agent, or a generic file‑capable harness, and reads the matching reference doc when one exists. It then pulls in only the built‑in skill(s) the task needs. The split keeps craft rules harness‑independent while each environment resolves its own tools for asking questions, previewing, screenshotting, and verifying.
Quick start
Prerequisites
- A local agent — Cursor, Claude Code, Codex, or any of the 70+ agents the installer supports (Cline, Roo Code, GitHub Copilot…). Cursor, Claude Code, and Codex have first‑class tool references inside the skill.
- Claude Opus 4.8 selected as the model, for best results.
- Node.js (to run the npx installer below). Python 3 is also handy for the local preview server.
Install
Recommended — the skills CLI. npx skills (from Vercel Labs) reads this repo, finds skills/baoyu-design/, and drops it into the right folder for whatever agent it detects:
# Install into the current project (auto‑detects your agent)
npx skills add JimLiu/baoyu-design
# …or install globally, for every project
npx skills add JimLiu/baoyu-design -g
# Target a specific agent explicitly
npx skills add JimLiu/baoyu-design --agent claude-code
npx skills add JimLiu/baoyu-design --agent cursor
npx skills add JimLiu/baoyu-design --agent codex
# Just list what's in the repo first
npx skills add JimLiu/baoyu-design --list
It installs to .claude/skills/ for Claude Code and .agents/skills/ for Cursor/Codex-style agents (add -g for the ~/‑level user install).
Alternative — hand the repo URL to your agent. Don’t want to install anything? Paste the URL into chat and let the agent fetch the skill itself:
Read https://github.com/JimLiu/baoyu-design and follow its skills/baoyu-design/SKILL.md to design a settings screen for a meditation app.
The agent clones or fetches the repo, loads SKILL.md, and proceeds — perfect for a one‑off.
Use it
Once the skill is installed (or fetched), just describe a design task in plain language — it auto‑activates from its description:
Design 3 hi‑fi variations of a settings screen for a meditation app.
In Claude Code you can also trigger it explicitly with /baoyu-design; in Codex, mention $baoyu-design when skills are available. The agent asks a few clarifying questions, builds the HTML under designs/, and previews it over localhost. Point at any element in the live preview and say what to change — the agent edits the underlying source for a fast, visual second pass.
Preview server
Deliverables are previewed over HTTP (multi‑file prototypes won’t load from file://). The agent normally starts this for you; to run it by hand:
python3 -m http.server 4311 --directory designs
# then open http://localhost:4311/<project>/<file>.html
Design systems
Beyond one‑off mockups, the skill can hold a whole project to a design system — a versioned bundle of a brand’s tokens, fonts, components, and full UI kits. Systems live next to your projects under designs/: author one with the Create design system built‑in skill, drop in a pre‑built one, or import one from a Figma .fig file (see the next section). Once a system exists, two flows let any project consume it.
Import an existing design system
When you start a design, the agent asks where to save it and which design system(s) to use — it discovers every system under designs/ and lists them, so you can pick none (free design), one, or several. Name one up front and it skips the menu:
Design a settings screen using the Fluent 2 design system.
For each system you choose, the agent syncs a self‑contained, version‑pinned copy into your project at _ds/<slug>/, wires its CSS and component bundle into the page, and records the binding in the project’s _d_meta.json. That local copy is what keeps the project portable and reproducible — nothing reaches outside the folder, and re‑running the import is how you pull updates later. Choose several systems and one becomes primary — it owns the overall look and wins any token collision, while the others lend specific components.
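The "primary wins any token collision" rule can be pictured as a layered dictionary merge. This is only a sketch of that precedence rule, not the skill's implementation: the token names are made up, and the order in which secondary systems override each other (earlier beats later) is an assumption the source does not specify.

```python
def merge_tokens(primary: dict[str, str],
                 *secondary: dict[str, str]) -> dict[str, str]:
    """Combine design tokens so that the primary system wins every collision.

    Assumption: among secondary systems, earlier ones take precedence.
    """
    merged: dict[str, str] = {}
    for system in reversed(secondary):
        merged.update(system)
    merged.update(primary)  # applied last, so it overrides everything else
    return merged


if __name__ == "__main__":
    # Hypothetical token names, for illustration only.
    primary = {"color-accent": "#0f6cbd", "radius-md": "4px"}
    other = {"color-accent": "#ff0000", "shadow-card": "0 2px 4px #0003"}
    print(merge_tokens(primary, other))
```

Tokens that exist only in a secondary system survive the merge, which matches the idea of the other systems lending specific pieces while the primary one owns the overall look.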
Use an imported design system
Once a system is bound it acts as a binding visual contract, not a loose suggestion: every screen is built from the system’s real tokens, type, spacing, and components, and the agent won’t invent off‑system colors or styles. If the system ships starting points — ready‑made screens or components — you can seed a new design from one instead of starting blank.
The binding travels with the project. Reopen it later and the agent reads _d_meta.json, reloads the system, and keeps designing in‑style — no need to re‑pick. From there you can refresh a system to pull updates, add another, swap which one is primary, or remove one entirely.
Import design sources
Real context beats description. Three built‑in import skills turn material you already have into design ground truth the agent works from directly:
- Figma .fig files, decoded fully offline. Export any Figma file as .fig (or grab a community kit), point the agent at it, and a vendored decoder reads it right on your machine: no Figma account, API token, or MCP server. The agent inventories pages, components, and variables, confirms scope with you, then either cherry‑picks components as React code or emits the whole kit as a design system: components regrouped semantically, curated token CSS, real SVG/PNG assets extracted from the file (copied, never redrawn), guideline cards, and a brand‑guide README.
- GitHub repos. Hand it a repo URL as a design source. The agent browses the tree with gh api before cloning anything, sparse‑checkouts only the paths it needs into a scratch dir outside your project, and records the repo URL as provenance.
- Existing HTML/CSS. Saved pages or a local codebase work as a design reference: the agent reads the real stylesheets instead of squinting at screenshots, lifts exact values (colors, type, spacing, radii, shadows, interaction states) into your project’s own custom properties, and copies referenced assets out.
Anything imported as a design system compiles into a single self‑contained, interactive preview.html. This is the community Chakra UI Figma Kit .fig after one import conversation — 28 components in semantic groups, 400+ curated tokens, guideline cards, and a self‑authored showcase, all browsable in one file:
Decks & PPTX export
Ask for a slide deck and the skill builds it the same way it builds everything else — as a self‑contained HTML page, using the deck-stage component and the Make a deck built‑in skill. Because the deck is just a web page, the whole editing loop stays where you already are:
- Preview and tweak by pointing. Open the deck in your agent’s browser preview (Cursor Browser, Codex Browser, or Claude Preview), point at a headline or a chart and say what to change — or just ask in chat — and the agent edits that slide’s source.
- Present full‑screen. Press F (or click the fullscreen button in the deck toolbar) to present the deck full‑screen, with the thumbnail rail hidden. Cmd/Ctrl+F is left untouched, so the browser’s Find still works.
When the deck is ready, export it to PowerPoint by saying so in the same conversation — “export this to PPTX”, “export to PowerPoint”, or “做个 PPT” all route to the export flow (it only exports decks this skill builds, not arbitrary HTML). Two modes:
- Editable — native PowerPoint text, shapes, and images you can keep editing in PowerPoint or Keynote, laid out to closely match the web deck.
- Screenshots — one full‑bleed PNG per slide: pixel‑perfect, but flat.
Running locally: on claude.ai/design, PPTX export is a built‑in gen_pptx tool, and that tool isn’t there when you run the skill on your own agent. So the skill ships its own: a local CLI (agents/gen-pptx/) that drives headless Chromium via Playwright and writes the .pptx with PptxGenJS. In Claude Code it runs after a one‑time build (cd skills/baoyu-design/agents/gen-pptx && npm install && npx playwright install chromium && npm run build); from then on the agent serves the deck and invokes the CLI for you.
How gen_pptx works
The core idea: don’t parse HTML — render it in a real browser, then translate the result into PowerPoint.
The CLI launches a headless Chromium via Playwright and loads the deck as a live web page (which is why it needs an http:// URL, not file://). A capture bundle is injected into the page and exposed as window.__genpptx; the Node driver talks to it through page.evaluate() — everything that needs the real browser (layout, computed styles, font metrics, image decoding) runs in‑page and returns pure data to Node.
Before any slides are captured, setup() hides UI chrome, applies font substitutions (injecting @font-face rules or fetching from Google Fonts), undoes transform: scale() wrappers so measurements reflect authored dimensions, waits for document.fonts.ready, and collects speaker notes.
Then slides are processed one by one — showJs navigates to each slide, a delay lets transitions settle, and images are .decode()’d. From here the two modes diverge:
- Screenshot mode simply calls page.screenshot() at 2× device scale and drops each PNG as a full‑bleed slide image. Pixel‑perfect, but flat.
- Editable mode recursively walks the live DOM, serializing every element into a { tag, rect, style, children } JSON tree with pixel‑precise bounding boxes (text uses Range.getBoundingClientRect()). Back in Node, renderNodeToPptx translates each node into native PptxGenJS objects: backgrounds and borders become addShape, text becomes addText with the exact font/size/color from computed styles, images (including rasterized SVGs and canvas snapshots) become addImage. Coordinates convert at px ÷ 96 = inches, font sizes at px × 0.75 = points.
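The two unit conversions used in editable mode are easy to check by hand. The helpers below restate them in Python for illustration; the real CLI runs in Node, and the function names and the 1920×1080 example slide are assumptions, not names or sizes taken from it.

```python
CSS_PX_PER_INCH = 96    # CSS defines 1 in as 96 px
POINTS_PER_PX = 0.75    # 72 pt per inch / 96 px per inch


def px_to_inches(px: float) -> float:
    """Convert a CSS pixel length to inches for slide coordinates."""
    return px / CSS_PX_PER_INCH


def px_to_points(px: float) -> float:
    """Convert a CSS pixel font size to typographic points."""
    return px * POINTS_PER_PX


def rect_to_inches(x: float, y: float, w: float, h: float) -> dict[str, float]:
    """Convert a pixel bounding box into an inch-based position and size."""
    return {
        "x": px_to_inches(x),
        "y": px_to_inches(y),
        "w": px_to_inches(w),
        "h": px_to_inches(h),
    }


if __name__ == "__main__":
    # A 1920x1080 px slide maps to 20 x 11.25 inches; 16 px text is 12 pt.
    print(rect_to_inches(0, 0, 1920, 1080))
    print(px_to_points(16))
```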
Finally, a validation pass compares captures against the input — djb2‑hashing consecutive slides to flag navigation failures, checking slide dimensions, and verifying speaker‑note counts — and the result is printed as a single JSON line for the agent to read.
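The duplicate check can be sketched with the classic djb2 string hash (start at 5381, multiply by 33, add each byte). What exactly the CLI hashes per slide is not stated in the source, so treating each capture as a byte string, the 32‑bit truncation, and the `repeated_slides` helper are all assumptions.

```python
def djb2(data: bytes) -> int:
    """Classic djb2 hash, truncated to 32 bits."""
    value = 5381
    for byte in data:
        value = (value * 33 + byte) & 0xFFFFFFFF
    return value


def repeated_slides(captures: list[bytes]) -> list[int]:
    """Return indexes of slides whose capture hashes equal the previous one.

    Two identical consecutive captures suggest the navigation step did not
    actually advance to the next slide.
    """
    hashes = [djb2(capture) for capture in captures]
    return [i for i in range(1, len(hashes)) if hashes[i] == hashes[i - 1]]


if __name__ == "__main__":
    # Slide 1 repeats slide 0, so index 1 is flagged.
    print(repeated_slides([b"slide-a", b"slide-a", b"slide-b"]))
```

A cheap non‑cryptographic hash is enough for this purpose, since the goal is only to notice that two neighbouring captures are byte‑for‑byte the same.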
Example prompts
- “Design 3 hi‑fi variations of a pricing page using the brand in this screenshot.”
- “Prototype a working onboarding flow — real state, transitions, form validation.”
- “Make a 10‑slide deck from this PRD for an engineering all‑hands.”
- “Wireframe a few layout ideas for a mobile expense‑tracker home screen.”
- “Recreate the composer UI from this codebase, then export it as standalone HTML.”
- “Turn this UI kit .fig into a design system, then build a dashboard with it.”
- “Build a dashboard using our design system, starting from its analytics screen.”
For best results, give it design context — a screenshot, a UI kit, a Figma .fig export, or a codebase. Starting from real context is the single biggest lever on quality; the skill will ask for it if you don’t provide it.
Credits & license
This project repackages Claude Design, the design skill by Anthropic that powers claude.ai/design, so it can run on local agents. It is an independent, community effort and is not affiliated with or endorsed by Anthropic.
Repackaged and maintained by Jim Liu 宝玉. Released under the MIT License.
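宝玉 (@dotey): An update to the baoyu-design skill (the Skill that lets you run Claude Design locally). For slide decks generated locally, you can preview the deck in the built-in browser of Cursor or Codex, and use their annotation tools to adjust details of the slides directly. Press F to present full-screen.
It can also export to an editable PPTX.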



