@dotey: baoyu-design skill can generate animated videos locally and export them as MP4, just like the attached video. The Claude Design web version can create animated videos using prompts, but you need to be on the webpage to see them—no download. Some have tried using third-party screen recording plugins, …

X AI KOLs Timeline 06/17/26, 12:21 AM Tools

baoyu-design claude-design open-source animation-video mp4-export local-agent agent-skill

Summary

baoyu-design is an open-source project that packages Claude Design as a local Agent Skill, supporting generation of animated videos and direct export as MP4 in environments like Cursor and Claude Code. It achieves precise, high-quality exports by rendering frame-by-frame using a headless browser.

baoyu-design skill can generate animated videos locally and export them as MP4, just like the attached video. Claude Design web version can create animated videos using prompts, but you need to be on the webpage to see them; there's no download. Some have tried using third-party screen recording plugins, but it's cumbersome and the results aren't great. On top of this Skill, I've implemented the ability to directly export animations as MP4 videos.

The principle behind generating animations and exporting videos is actually similar: each frame is calculated based on its time coordinate, like a function f(t). You pass any time point t to the animation engine, and it directly calculates the position, opacity, and size of every element on screen at that instant; all visual states are entirely determined by t. There's no need to play from the beginning to that moment, nor to remember what happened before.

Traditional animation creation is imperative: at a certain moment, you change some element's position, adjust its opacity. The state is scattered, and if the timing goes wrong, the screen becomes a mess. This engine flips that, using a declarative approach: you don't drive element movements; you only describe what each element should look like at second t.

Think of it this way: traditional animation is like watching a movie: you have to watch from the beginning to know what the frame at minute 30 looks like. This engine is more like a special book where, no matter which page you open, the scene is complete and deterministic.

This design brings three capabilities:

1. You can drag the playhead to jump to any position, because f(t) can compute at any time;
2. You can repeatedly debug the same frame, because the same t always produces the same frame;
3. You can also export animations as video, and the method is quite clever.

So how is video export actually implemented? Intuitively, to turn browser animations into MP4, why not just screen record? But screen recording is real-time. If the machine stutters, you get dropped frames, and the resulting video isn't reproducible. Plus, things like the playbar, black background, rounded corners, and shadows (the 'player chrome') get recorded along with it.

The method baoyu-design uses is more like stop-motion photography: it launches a headless browser (Chromium without UI), loads the animation page, and uses a control interface reserved in the engine to precisely manipulate the timeline. Each time a time point is set, it waits for the browser to finish rendering, takes a screenshot, pipes it directly to ffmpeg for encoding. A 95-second animation at 30fps means 2850 cycles of 'set the time, take a picture.' It's slow, but every frame is exact, with no dropped frames.

There's an easily overlooked detail: after setting the time, the tool waits for two requestAnimationFrame calls before taking the screenshot. Because changing the time only updates React state, the browser needs one to two frames to actually paint the new scene on screen. If you wait too few, you might capture a residual image from the previous frame.

To make the image sharper, screenshots are rendered at 2x device pixel ratio, producing images at effectively 3840×2160, which are then scaled down to 1080p by ffmpeg. The principle is the same as high-resolution printing: first draw finely on a larger canvas, then downsample with high quality; text edges and fine lines become noticeably clearer.

baoyu-design is open-sourced on GitHub (MIT license), currently with 1.2K stars. If interested, check out its skills/baoyu-design directory; the complete animation engine implementation is there. https://github.com/jimliu/baoyu-design…

Original Article

Cached at: 06/17/26, 01:45 AM

The baoyu-design skill can generate animated videos locally and export them as MP4, just like the attached video. Claude Design’s web version can create animated videos from prompts, but you can only view them on the web page; you can’t download them. Some people have worked around this with third-party plugins or screen recording software, but that is cumbersome and the results aren’t great. Building on this Skill, I’ve implemented the ability to export animations directly as MP4 videos.

The principle behind generating animations and exporting videos is actually similar: each frame is calculated based on its time coordinate. It’s like a function f(t)—you pass any time point t to the animation engine, and it instantly calculates the position, opacity, and size of every element on the screen at that moment. All visual states are entirely determined by t. There’s no need to play from the beginning to reach that moment, nor to remember what happened before.
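The f(t) idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the engine's actual code: the `Keyframe` type, the `element_state` function, linear interpolation, and the sample `TRACK` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Keyframe:
    """Hypothetical keyframe: an element's property values at time t."""
    t: float        # seconds
    x: float        # horizontal position in pixels
    opacity: float  # 0.0 to 1.0
    scale: float    # 1.0 means natural size


def lerp(a: float, b: float, u: float) -> float:
    """Linear interpolation between a and b for u in [0, 1]."""
    return a + (b - a) * u


def element_state(track: list[Keyframe], t: float) -> dict[str, float]:
    """Return the element's visual state at time t.

    The result depends only on (track, t); no playback history is involved.
    Times outside the track are clamped to the first or last keyframe.
    """
    if t <= track[0].t:
        k = track[0]
        return {"x": k.x, "opacity": k.opacity, "scale": k.scale}
    if t >= track[-1].t:
        k = track[-1]
        return {"x": k.x, "opacity": k.opacity, "scale": k.scale}
    for a, b in zip(track, track[1:]):
        if a.t <= t < b.t:
            u = (t - a.t) / (b.t - a.t)
            return {
                "x": lerp(a.x, b.x, u),
                "opacity": lerp(a.opacity, b.opacity, u),
                "scale": lerp(a.scale, b.scale, u),
            }
    raise ValueError("track must be sorted by strictly increasing time")


# Sample track: slide 200 px to the right while fading in over two seconds.
TRACK = [Keyframe(0.0, 0.0, 0.0, 1.0), Keyframe(2.0, 200.0, 1.0, 1.0)]
```

Calling `element_state(TRACK, 1.0)` jumps straight to the midpoint without evaluating any earlier time, and calling it again with the same t returns the same state.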

The traditional way of writing animations is imperative: at a certain moment, change the position of an element, adjust its opacity. States are scattered everywhere, and if the timing gets messed up, the visuals fall apart. This engine takes the opposite approach—it uses a declarative mindset: instead of driving elements to move, you only describe what each element should look like at time t.

To put it in a metaphor: traditional animation is like watching a movie—you have to start from the beginning to know what the screen looks like at minute 30. This engine is more like a special book: flip to any page, and the image is complete and deterministic.

This design brings three capabilities:

1. You can drag the playhead to any position, because f(t) can be computed at any time.
2. You can repeatedly debug the same frame, because the same t always produces the same frame.
3. You can also export the animation as a video, and the method is quite clever.

So how is the video export implemented?

Intuitively, to turn a browser animation into an MP4, why not just record the screen? The problem is that screen recording happens in real time: a slight machine lag causes dropped frames, so the video is not reproducible. The “player shell” (play bar, black background, rounded corners, and shadows) would also be recorded along with the content.

The method used by baoyu-design is closer to shooting stop-motion animation: launch a headless browser (Chromium without a UI), load the animation page, and control the timeline precisely through a control interface that the engine reserves for this purpose. For each time point, the tool sets the time, waits for the browser to finish rendering, takes a screenshot, and pipes it directly to ffmpeg. A 95-second animation at 30fps means 2850 cycles of “set the time, take a picture.” It’s slow, but every frame is precise and none are dropped.
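The frame arithmetic and the ffmpeg side of that pipe can be sketched as follows. This is an assumption-laden illustration, not the tool's real exporter: `frame_times` and `ffmpeg_pipe_command` are hypothetical names, and the encoder flags are one common way to feed piped PNG frames to ffmpeg, not necessarily the ones baoyu-design uses.

```python
def frame_times(duration_s: float, fps: int) -> list[float]:
    """Timestamp in seconds of every frame to capture: t_i = i / fps."""
    total = round(duration_s * fps)
    return [i / fps for i in range(total)]


def ffmpeg_pipe_command(fps: int, output: str) -> list[str]:
    """Argument list for an ffmpeg process that reads PNG frames from stdin."""
    return [
        "ffmpeg", "-y",           # overwrite the output file if it exists
        "-f", "image2pipe",       # input is a stream of concatenated images
        "-framerate", str(fps),   # each piped image lasts 1/fps seconds
        "-i", "-",                # read the input from stdin
        "-c:v", "libx264",        # encode as H.264
        "-pix_fmt", "yuv420p",    # pixel format most players accept
        output,
    ]


# The capture loop would then look roughly like this. set_time, wait_for_paint
# and screenshot_png are placeholders for browser automation calls, not real APIs:
#
#   proc = subprocess.Popen(ffmpeg_pipe_command(30, "out.mp4"), stdin=subprocess.PIPE)
#   for t in frame_times(95, 30):
#       set_time(t)
#       wait_for_paint()
#       proc.stdin.write(screenshot_png())
#   proc.stdin.close()
#   proc.wait()
```

Because the timestamps come from the frame index rather than a wall clock, a slow machine only makes the export take longer; it cannot change which frames end up in the file.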

There’s an easily overlooked detail: after setting the time, the tool waits for two requestAnimationFrame cycles before taking the screenshot. This is because modifying the time only changes the React state, and the browser needs one or two frames to actually paint the new image on the screen. If you wait for too few frames, you might capture residual artifacts from the previous frame.

To make the image sharper, the screenshot is rendered at 2x device pixel ratio, producing output at 3840×2160, and then ffmpeg scales it down to 1080p. The principle is the same as high-resolution printing: first draw finely on a larger canvas, then downscale with high quality—text edges and thin lines become noticeably clearer.
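The supersampling numbers work out as below. The helper names are hypothetical, and the Lanczos scaler is an assumption: the post only says the downscale is high quality, not which ffmpeg scaling algorithm is used.

```python
def capture_size(width: int, height: int, device_pixel_ratio: int) -> tuple[int, int]:
    """Pixel dimensions of a screenshot taken at the given device pixel ratio."""
    return width * device_pixel_ratio, height * device_pixel_ratio


def downscale_filter(width: int, height: int) -> str:
    """ffmpeg video filter that scales frames to width x height.

    flags=lanczos selects the Lanczos scaler; this choice is an assumption.
    """
    return f"scale={width}:{height}:flags=lanczos"


# A 1080p viewport captured at 2x yields 3840x2160 screenshots.
CAPTURED = capture_size(1920, 1080, 2)

# Each output pixel is therefore derived from a 2x2 block of captured pixels.
SAMPLES_PER_OUTPUT_PIXEL = (CAPTURED[0] * CAPTURED[1]) // (1920 * 1080)
```

The filter string would be passed to ffmpeg with `-vf`, for example `-vf scale=1920:1080:flags=lanczos`.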

baoyu-design is open source on GitHub (MIT license), currently with 1.2K stars. If interested, check out the skills/baoyu-design directory, which contains the full implementation of the animation engine. https://github.com/jimliu/baoyu-design…

jimliu/baoyu-design

Source: https://github.com/jimliu/baoyu-design

baoyu-design

Run Claude Design on your own local agent — Cursor, Claude Code, Claude Desktop, or any file‑capable coding agent.

License: MIT

Best with Opus 4.8

baoyu-design packages Claude Design, the design engine behind claude.ai/design (https://claude.ai/design), as a portable Agent Skill. Drop it into a local agent and you get most of what the website does, right inside your editor: polished UI mockups, interactive prototypes, wireframes, landing pages, dashboards, mobile apps, and slide decks, all produced as self-contained HTML. No website, no separate subscription, no upload step. The agent already on your machine does the work, and every artifact stays in your repo.

Screenshots

The same Reader Mac App prompt was used in Cursor, Codex, Claude, and Claude Design.

Prompt used for all screenshots:

```markdown
Build a Reader Mac app that helps me read and save articles better. All data is stored locally.

Information collection

Manual adding
Support manually adding different types of information:
- URL: enter a URL and automatically fetch content and images
- Attachments: upload PDFs, videos, and images
- Markdown editing: like publishing a blog post, enter the title, body, and cover image
- Other

Automatic subscriptions
- RSS feeds
- Social media accounts: X, Weibo, YouTube
- Other

Editing and organization

Tags: Every item can have tags.
Categories and folders: Create tree-structured folders and place content in different categories.
Favorites: Users can click to favorite an item.
Editing: Every item can be edited with a built-in Markdown editor.

AI assistance

Automatic translation: Support translation across different languages.
Summaries and abstracts: Generate summaries for captured content.
Derivative creation: Create new work based on one or more pieces of content.
Integrated AI Chat: Use AI Chat to call AI Agents that help process content.
```

Why run it locally

Free yourself from the website. You get the vast majority of claude.ai/design’s capabilities without ever leaving your editor — same methodology, same craft standards, same output format.
Best with Opus 4.8. The skill is a long, demanding design brief; the stronger the model, the better the result. Pair it with Claude Opus 4.8 for the best output, and it still works well on other capable models.
Iterate by pointing, not describing. Because the deliverable is plain HTML served on localhost, you can lean on your agent’s built‑in browser preview and element‑annotation tools (Cursor Browser / DevTools, Claude Preview, or Codex Browser). Point at a button in the live preview, say what you want changed, and the agent edits the underlying source — a tight, visual second‑pass editing loop that’s hard to get on a website.
Everything is yours. Output lands in designs/<project>/ as self‑contained HTML you can version, fork, export, or ship.

What it can make

The skill drives a full design process — clarifying questions → gathering design context → producing one or more HTML deliverables → previewing and verifying. It ships a deep bench of built‑in skills and a set of ready‑made component scaffolds.

Area	Built‑in skills
Core design	Hi‑fi design · Interactive prototype · Wireframe · Frontend aesthetic direction
Decks	Make a deck · Speaker notes
Mobile & motion	Mobile prototype · Animated video · Sound effects
Design systems	Create design system · Use design system · Design system preview · Design Components (`.dc.html`) · Make tweakable
Import sources	Figma `.fig` (offline decode) · GitHub repo · Existing HTML/CSS
Export & handoff	Standalone HTML · PDF · PPTX (editable) · PPTX (screenshots) · Video (MP4) · Send to Figma · Send to Canva · Handoff to Claude Code
AI assets & integration	Gemini image generation · Call Claude from prototypes · Read PDF

Starter components (in starter-components/) save the agent from hand‑rolling the basics: iOS / Android / macOS / browser frames, a pan‑zoom design canvas, a slide‑deck stage, a timeline animation engine, a tweaks panel, and a fillable image slot.

How it works

The skill is plain Markdown plus a few JSX/JS scaffolds — no build step, no runtime.

skills/baoyu-design/
├── SKILL.md                       # Entry point — orchestrates the whole flow
├── system-prompt.md               # The design methodology & craft standards (source of truth)
├── references/
│   ├── claude.md                  # Tool map for Claude Code
│   ├── cursor.md                  # Tool map for Cursor
│   └── codex.md                   # Tool map for Codex Agent
├── built-in-skills/               # Specialized prompts (decks, mobile, import, export, ...)
└── starter-components/            # Device frames, deck stage, canvas, animation engine, ...

When you ask for a design, the agent reads SKILL.md, loads the core methodology from system-prompt.md, detects whether it’s running in Cursor, Claude Code, Codex Agent, or a generic file‑capable harness, and reads the matching reference doc when one exists. It then pulls in only the built‑in skill(s) the task needs. The split keeps craft rules harness‑independent while each environment resolves its own tools for asking questions, previewing, screenshotting, and verifying.
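The reading order described above can be modeled as a small lookup. The skill itself has no runtime, so this is only a sketch of what the agent does, not code shipped in the repo: `files_to_load`, the harness keys, and the example built-in skill file name are hypothetical, while the paths come from the directory listing.

```python
from pathlib import PurePosixPath

SKILL_ROOT = PurePosixPath("skills/baoyu-design")

# Harness name -> reference doc. The file names come from the directory
# listing; the dictionary keys are illustrative labels.
REFERENCE_DOCS = {
    "claude-code": "references/claude.md",
    "cursor": "references/cursor.md",
    "codex": "references/codex.md",
}


def files_to_load(harness: str, built_in_skills: list[str]) -> list[str]:
    """List the files an agent would read, in order, for one design task."""
    files = [SKILL_ROOT / "SKILL.md", SKILL_ROOT / "system-prompt.md"]
    reference = REFERENCE_DOCS.get(harness)
    if reference is not None:  # generic harnesses have no matching reference doc
        files.append(SKILL_ROOT / reference)
    # Only the built-in skills the task needs are pulled in.
    files += [SKILL_ROOT / "built-in-skills" / name for name in built_in_skills]
    return [str(path) for path in files]
```

For example, `files_to_load("cursor", ["animated-video.md"])` lists the two core files, the Cursor tool map, and one built-in skill; `animated-video.md` is a made-up file name used only for illustration.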

Quick start

Prerequisites

A local agent — Cursor (https://cursor.com), Claude Code (https://claude.com/claude-code), Codex (https://developers.openai.com/codex/), or any of the 70+ agents the installer supports (Cline, Roo Code, GitHub Copilot…). Cursor, Claude Code, and Codex have first‑class tool references inside the skill.
Claude Opus 4.8 selected as the model, for best results.
Node.js (to run the npx installer below). Python 3 is also handy for the local preview server.

Install

Recommended — the skills CLI.

npx skills (https://github.com/vercel-labs/skills) (from Vercel Labs) reads this repo, finds skills/baoyu-design/, and drops it into the right folder for whatever agent it detects:

# Install into the current project (auto‑detects your agent)
npx skills add JimLiu/baoyu-design

# ...or install globally, for every project
npx skills add JimLiu/baoyu-design -g

# Target a specific agent explicitly
npx skills add JimLiu/baoyu-design --agent claude-code
npx skills add JimLiu/baoyu-design --agent cursor
npx skills add JimLiu/baoyu-design --agent codex

# Just list what's in the repo first
npx skills add JimLiu/baoyu-design --list

It installs to .claude/skills/ for Claude Code and .agents/skills/ for Cursor/Codex-style agents (add -g to install at the user level under ~/ instead).

Alternative — hand the repo URL to your agent.

Don’t want to install anything? Paste the URL into chat and let the agent fetch the skill itself:

Read https://github.com/JimLiu/baoyu-design and follow its skills/baoyu-design/SKILL.md to design a settings screen for a meditation app.

The agent clones or fetches the repo, loads SKILL.md, and proceeds — perfect for a one‑off.

Use it

Once the skill is installed (or fetched), just describe a design task in plain language — it auto‑activates from its description:

Design 3 hi‑fi variations of a settings screen for a meditation app.

In Claude Code you can also trigger it explicitly with /baoyu-design; in Codex, mention $baoyu-design when skills are available.

The agent asks a few clarifying questions, builds the HTML under designs/, and previews it over localhost.

Point at any element in the live preview and say what to change — the agent edits the underlying source for a fast, visual second pass.

Preview server

Deliverables are previewed over HTTP (multi‑file prototypes won’t load from file://). The agent normally starts this for you; to run it by hand:

python3 -m http.server 4311 --directory designs
# then open http://localhost:4311/<project>/<file>.html
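If you would rather start the preview server from a script, the same thing can be done with the Python standard library. This is a sketch, not part of the skill: `start_preview_server` and `preview_url` are hypothetical helpers. Unlike the command above, which listens on all interfaces by default, this version binds to 127.0.0.1 only.

```python
import functools
import http.server
import threading


def start_preview_server(directory: str, port: int = 4311) -> http.server.ThreadingHTTPServer:
    """Serve `directory` over HTTP on localhost from a background thread."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def preview_url(project: str, filename: str, port: int = 4311) -> str:
    """URL of one deliverable under the served designs directory."""
    return f"http://localhost:{port}/{project}/{filename}"


# Usage:
#   server = start_preview_server("designs")
#   print(preview_url("my-project", "index.html"))
#   ...
#   server.shutdown()
```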

Design systems

Beyond one‑off mockups, the skill can hold a whole project to a design system — a versioned bundle of a brand’s tokens, fonts, components, and full UI kits. Systems live next to your projects under designs/: author one with the Create design system built‑in skill, drop in a pre‑built one, or import one from a Figma .fig file (see the next section). Once a system exists, two flows let any project consume it.

Import an existing design system

When you start a design, the agent asks where to save it and which design system(s) to use — it discovers every system under designs/ and lists them, so you can pick none (free design), one, or several. Name one up front and it skips the menu:

Design a settings screen using the Fluent 2 design system.

For each system you choose, the agent syncs a self‑contained, version‑pinned copy into your project at _ds/<name>/, wires its CSS and component bundle into the page, and records the binding in the project’s _d_meta.json. That local copy is what keeps the project portable and reproducible — nothing reaches outside the folder, and re‑running the import is how you pull updates later.

Choose several systems and one becomes primary — it owns the overall look and wins any token collision, while the others lend specific components.
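The "primary wins any token collision" rule can be pictured as a dictionary merge. This is a hypothetical sketch: the README does not show how tokens are stored or merged, `merge_tokens` and the token names are invented for illustration, and the rule that earlier secondary systems beat later ones is an assumption.

```python
def merge_tokens(
    primary: dict[str, str], others: list[dict[str, str]]
) -> dict[str, str]:
    """Combine design tokens so that the primary system wins every collision."""
    merged: dict[str, str] = {}
    for tokens in others:
        for name, value in tokens.items():
            # Assumption: among secondary systems, the first one listed wins.
            merged.setdefault(name, value)
    merged.update(primary)  # primary values overwrite any colliding token
    return merged


# Made-up token values: both systems define --color-accent.
PRIMARY = {"--color-accent": "#0f6cbd"}
SECONDARY = {"--color-accent": "#3182ce", "--radius-md": "6px"}
MERGED = merge_tokens(PRIMARY, [SECONDARY])
```

Here `MERGED` keeps the primary accent color and picks up `--radius-md` from the secondary system, which only fills in tokens the primary does not define.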

Use an imported design system

Once a system is bound it acts as a binding visual contract, not a loose suggestion: every screen is built from the system’s real tokens, type, spacing, and components, and the agent won’t invent off‑system colors or styles. If the system ships starting points — ready‑made screens or components — you can seed a new design from one instead of starting blank.

The binding travels with the project. Reopen it later and the agent reads _d_meta.json, reloads the system, and keeps designing in‑style — no need to re‑pick. From there you can refresh a system to pull updates, add another, swap which one is primary, or remove one entirely.

Import design sources

Real context beats description. Three built‑in import skills turn material you already have into design ground truth the agent works from directly:

Figma .fig files — decoded fully offline. Export any Figma file as .fig (or grab a community kit), point the agent at it, and a vendored decoder reads it right on your machine — no Figma account, API token, or MCP server. The agent inventories pages, components, and variables, confirms scope with you, then either cherry‑picks components as React code or emits the whole kit as a design system: components regrouped semantically, curated token CSS, real SVG/PNG assets extracted from the file (copied, never redrawn), guideline cards, and a brand‑guide README.
GitHub repos. Hand it a repo URL as a design source. The agent browses the tree with gh api before cloning anything, sparse‑checkouts only the paths it needs into a scratch dir outside your project, and records the repo URL as provenance.
Existing HTML/CSS. Saved pages or a local codebase work as a design reference: the agent reads the real stylesheets instead of squinting at screenshots, lifts exact values (colors, type, spacing, radii, shadows, interaction states) into your project’s own custom properties, and copies referenced assets out.

Anything imported as a design system compiles into a single self-contained, interactive preview.html. For example, the community Chakra UI Figma Kit .fig, after one import conversation, yields 28 components in semantic groups, 400+ curated tokens, guideline cards, and a self-authored showcase, all browsable in one file.

Decks & PPTX export

Ask for a slide deck and the skill builds it the same way it builds everything else — as a self‑contained HTML page, using the deck-stage component and the Make a deck built‑in skill. Because the deck is just a web page, the whole editing loop stays where you already are:

Preview and tweak by pointing. Open the deck in your agent’s browser preview (Cursor Browser, Codex Browser, or Claude Preview), point at a headline or a chart and say what to change — or just ask in chat — and the agent edits that slide’s source.
Present full‑screen. Press F (or click the fullscreen button in the deck toolbar) to present the deck full‑screen, with the thumbnail rail hidden. Cmd/Ctrl+F is left untouched, so the browser’s Find still works.

When the deck is ready, export it to PowerPoint by saying so in the same conversation — “export this to PPTX”, “export to PowerPoint”, or “做个 PPT” all route to the export flow (it only exports decks this skill builds, not arbitrary HTML). Two modes:

Editable — native PowerPoint text, shapes, and images you can keep editing in PowerPoint or Keynote, laid out to closely match the web deck.
Screenshots — one full‑bleed PNG per slide: pixel‑perfect, but flat.

Running locally: on claude.ai/design, PPTX export is a built‑in gen_pptx tool — and that tool isn’t there when you run the skill on your own agent. So the skill ships its own: a local CLI (agents/gen-pptx/) that drives headless Chromium via Playwright and writes the .pptx with PptxGenJS. In Claude Code it runs after a one‑time build (cd skills/baoyu-design/agents/gen-pptx && npm install && npx playwright install chromium && npm run build); from then on the agent serves the deck and invokes the CLI for you.

How gen_pptx works

The core idea: don’t parse HTML — render it in a real browser, then translate the result into PowerPoint.

The CLI launches a headless Chromium via Playwright and loads the deck as a live web page (which is why it needs an http:// URL, not file://). A capture bundle is injected into the page and exposed as window.__genpptx; the Node driver talks to it through page.evaluate() — everything that needs the real browser (layout, computed styles, font metrics, image decoding) runs in‑page and returns pure data to Node.

Before any slides are captured, setup() hides UI chrome, applies font substitutions (injecting @font-face rules or fetching from Google Fonts), undoes transform: scale() wrappers so measurements reflect authored dimensions, waits for document.fonts.ready, and collects speaker notes. Then slides are processed one by one — showJs navigates to each slide, a delay lets transitions settle, and images are .decode()’d. From here the two modes diverge:

Screenshot mode simply calls page.screenshot() at 2× device scale and drops each PNG as a full‑bleed slide image. Pixel‑perfect, but flat.