@aiwarts: https://x.com/aiwarts/status/2070137056144019720


Summary

The author open-sourced Humanize PPT, a skill for HTML slide decks built for live presentations. It restructures the outline with AST (Audience, State, Transfer) logic and hands rendering to downstream skills, targeting problems such as uneven information distribution across slides and empty content.

https://t.co/Lcva9UMHGZ

Cached at: 06/26/26, 08:08 AM

Open Source a PPT Skill Built for Speeches – No More Staring at the Audience in Awkward Silence

For the past three months, I’ve been giving offline presentations using HTML-based PPTs.

The workflow was painfully repetitive: I’d have Codex generate 30 slides in one go, then polish each one individually. Too much text looked bad, too little left the page empty. If I needed images, I’d have to generate them separately with GPT Image2 and then plug them in.

The bottom line: most HTML PPT skills simply aren’t built for live speeches.

And honestly, that’s not their fault.

I tried pairing different color schemes with different chapters. I summarized several common PPT structures — split left/right, top/bottom, three-column layouts, four-square grids. If it could be tweaked in HTML PPT, I probably tried it.

It felt like cooking the same potato twice with the same ingredients — you can’t get a second flavor.

The problem isn’t the number of templates. If you look closely, it’s the upstream agent’s uneven distribution of information across slides.

How does the upstream gather and organize information? How does it allocate content to each slide? Does it check quality mid-way (duplicate titles, overlapping fonts, tiny text that nobody reads just to make it look pretty)? Can the finished speech script be split across slides?

None of these steps are handled by any existing Skill.

That’s exactly why I built Humanize PPT v0.9!

I solved all of these problems in one go. Miss you always, Claude Fable5.

🔗 github.com/LearnPrompt/humanize-ppt

Starting from v0.7, I first drew a clear boundary for Humanize PPT: outsource the actual PPT rendering to downstream Skills.

Humanize PPT’s job is to organize the outline, per-slide intent, placeholder positions for videos and images, and the speech script into structured JSON and Markdown, then hand it over to downstream Skills for native rendering.

So theoretically, it can adapt to any HTML PPT Skill. To be rigorous, I tested two complete workflows:

  • For Chinese: connect it to guizang-ppt-skill.
  • For English: connect it to frontend-slides or beautiful-html-templates.
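
As a rough illustration of the handoff described above, here is a minimal Python sketch of a package that carries the outline, per-slide intent, asset placeholder positions, and speech script, and serializes them to JSON and Markdown. The class names, field names, and the Markdown layout are assumptions made for this example; the post does not show Humanize PPT's actual schema.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field


@dataclass
class AssetPlaceholder:
    kind: str       # "image" or "video"
    position: str   # where the asset sits on the slide (hypothetical labels)


@dataclass
class SlideSpec:
    number: int
    intent: str     # what this slide is supposed to achieve
    script: str     # the part of the speech delivered on this slide
    assets: list[AssetPlaceholder] = field(default_factory=list)


@dataclass
class DeckHandoff:
    title: str
    slides: list[SlideSpec]

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses and lists.
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)

    def to_markdown(self) -> str:
        lines = [f"# {self.title}", ""]
        for slide in self.slides:
            lines += [f"## Slide {slide.number}: {slide.intent}", "", slide.script, ""]
        return "\n".join(lines)
```

Because the package is plain data, any downstream rendering Skill could in principle read it without knowing how the outline was produced, which is the boundary the post describes.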

The sneakiest pitfall when using AI for PPTs is that the slide count spirals out of control.

A one-hour presentation often has 30+ slides, and each slide has to carry information: it should push the audience toward a judgment, explain a blocker, or make them suddenly realize, “Oh, I never thought of it that way.”

That’s why I made content structuring and per-slide information allocation the foundation of Humanize PPT — AST: Audience, State, Transfer.

A is Audience. Who’s listening? What do they already know? Why should they keep listening?

S is State. What state is the audience in before they see your PPT, and what state should they be in afterward?

T is Transfer. Each slide’s task isn’t to display information — it’s to push the audience from one state to the next.

The first thing Humanize PPT does before rendering is reorganize your material using the AST logic.

It outputs an outline where every slide is annotated with: the audience’s state when entering, the intent of that slide, and the key takeaway they should leave with.
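
The annotated outline can be pictured as a chain of audience states. The Python sketch below is one possible reading, not the project's implementation: it assumes each slide carries a short state label on entry and on exit, and it flags neighbouring slides where the exit state of one does not match the entry state of the next. The field names and the exact-match rule are assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class AstSlide:
    title: str
    state_in: str   # audience state when entering the slide
    intent: str     # what the slide is meant to do
    takeaway: str   # state the audience should leave with


def find_broken_transfers(slides: list[AstSlide]) -> list[tuple[int, int]]:
    """Return 1-based (n, n+1) slide pairs whose states do not chain.

    Assumption: states are short labels, compared case-insensitively.
    """
    broken = []
    for i in range(len(slides) - 1):
        leaving = slides[i].takeaway.strip().lower()
        entering = slides[i + 1].state_in.strip().lower()
        if leaving != entering:
            broken.append((i + 1, i + 2))
    return broken
```

A pair reported by `find_broken_transfers` marks a point where the audience is expected to arrive in a state that no earlier slide produced.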

I actually tested this: I fed the same topic material into two pipelines. On the left, I sent it directly to guizang-ppt-skill, which split it into Background, Pain Points, and Solution. On the right, I first ran it through Humanize’s AST pipeline, then sent it to the same guizang visual system.

Visually, the left side wasn’t ugly. It even passed guizang’s Swiss validator and static speech check.

But it still looked more like a neatly organized folder.

On the right, the slide labels changed from “Background”, “Pain Points”, “Solution” to something that tells a real story — because real speeches aren’t concept showcases. AST first decides how the audience will change.

The rendering engine didn’t change. The difference came from upstream.

Once the outline is finalized, Humanize PPT doesn’t immediately ask the downstream to write the entire deck in one shot.

Instead, it renders four real slides from the same content so you can preview the final look in advance.

Not a color palette. Not a few abstract style descriptions.

You see the actual HTML pages that the downstream Skill will produce — font layout, information density, whether the formatting and animations match your expectations. All visible at a glance.

After you pick one style, the Agent continues building the remaining 20–30 slides within that visual system.

This way, you don’t have to wait until the entire PPT is done to realize that the direction was wrong from the start.

The way images and videos are placed has also completely changed.

Before, when Humanize output an outline, it would say “this slide probably needs an image.”

With v0.9, it doesn’t just say “needs” anymore. The placement of the asset, the file name, and even the prompt to generate it — all are written into the outline.

baoyu-image-gen connects to GPT Image2 for images; remotion-video-production handles videos.
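
To make the asset entries concrete, here is a small Python sketch of what such an outline entry might hold and how entries could be grouped by the Skill that produces them. The two Skill names come from the text above; the `AssetSpec` fields and the grouping function are hypothetical, and how each Skill is actually invoked is not shown.

```python
from __future__ import annotations

from dataclasses import dataclass

# Skill names are taken from the post; this mapping itself is an assumption.
SKILL_FOR_KIND = {
    "image": "baoyu-image-gen",
    "video": "remotion-video-production",
}


@dataclass(frozen=True)
class AssetSpec:
    slide: int       # slide number the asset belongs to
    kind: str        # "image" or "video"
    position: str    # placement on the slide (hypothetical labels)
    file_name: str   # file the rendered deck will reference
    prompt: str      # generation prompt written into the outline


def group_by_skill(assets: list[AssetSpec]) -> dict[str, list[AssetSpec]]:
    """Group asset specs by the Skill expected to generate them."""
    jobs: dict[str, list[AssetSpec]] = {}
    for asset in assets:
        skill = SKILL_FOR_KIND[asset.kind]  # raises KeyError on unknown kinds
        jobs.setdefault(skill, []).append(asset)
    return jobs
```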

Later, I discovered that creating a flowchart in SVG first, then converting it to Remotion, yields even better results.

So I integrated that workflow into v0.9 as well.

I also tried another path before: forcing someone else’s style and layout into my own framework, then dragging in 30–40 templates.

But the end result was a mess: once those styles were pulled out of their original framework, I could only reproduce about 80% of the intended effect.

The Frankenstein problem wasn’t even the worst.

I also produced a bunch of slides that simply couldn’t be presented — low information density, needing 4–5 slides just to get one point across.

Let me show you some examples.

Take slide 3: just three words and a full-screen background image. Pretty, sure. But what do I say? The truth is that audiences often zone out during a speech. If someone looks away, looks back, and the slide still fails to communicate a clear main point, I consider that slide a failure.

Slide 5: a whole screen crammed with text. I’m standing there while the audience and I stare at each other in silence. If I have to read the text aloud, I might as well print it out and hand it to everyone.

Slide 8: the title says “Technical Architecture,” with an unfilled TODO placeholder underneath and no image or video assets.
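
The three failure modes above (too sparse, too dense, unfilled placeholder) lend themselves to a simple automated check. The Python sketch below is only an illustration: the word-count thresholds are invented for the example, and whitespace splitting only suits English text, so a Chinese deck would need a different measure.

```python
from __future__ import annotations

import re

# Hypothetical thresholds chosen for this example, not taken from the post.
MIN_WORDS = 6
MAX_WORDS = 60


def presentability_issues(slide_text: str) -> list[str]:
    """Return a list of reasons a slide may be hard to present."""
    issues = []
    if re.search(r"\bTODO\b", slide_text):
        issues.append("unfilled TODO placeholder")
    word_count = len(slide_text.split())
    if word_count < MIN_WORDS:
        issues.append("too sparse")
    elif word_count > MAX_WORDS:
        issues.append("too dense")
    return issues
```

A slide with an empty result is not guaranteed to be good; the check only catches the obvious cases before a human review.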

After rendering the PPT, the next step is quality control and assigning the speech script to each slide.
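
One way to picture the script assignment is a splitter that cuts a Markdown speech script into per-slide notes. The sketch below assumes the script marks each section with a heading of the form `## Slide N`; that heading convention is an assumption for this example, not a format documented in the post.

```python
from __future__ import annotations

import re

# Matches a heading such as "## Slide 3" or "## Slide 3: Why now".
HEADING = re.compile(r"^## Slide (\d+)\b.*$", re.MULTILINE)


def split_script(markdown: str) -> dict[int, str]:
    """Map slide numbers to the script text under each slide heading."""
    notes: dict[int, str] = {}
    matches = list(HEADING.finditer(markdown))
    for i, match in enumerate(matches):
        # A section runs until the next slide heading or the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(markdown)
        notes[int(match.group(1))] = markdown[match.end():end].strip()
    return notes
```

Text that appears before the first slide heading, such as a talk title, is ignored by this splitter.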

The deck output by Humanize PPT supports presentation mode.

Press the S key to switch — the speech script appears in a separate window with page numbers and notes.

No need to read from paper, no need to memorize. You look at the audience, and the notes follow along on another screen.


Press ESC to open the global index — all pages shrink into an overview. Forgot where you are? Need to jump to a slide on the fly? No need to flip back one by one — just find the page and click to switch directly.

During my presentations, I also noticed common rendering issues in HTML PPTs.

So before the speech, I added a quality check step.

For example, this deck was rendered purely by beautiful-html-templates in the Neo-Grid Bold style, 11 slides in total.

When I first ran a static scan on the HTML page, it came back all green: 0 fails, 0 warnings.

But when I manually reviewed the screenshots slide by slide…

I was left speechless.

The page number in the bottom-left corner ate half of the last line of the main text.

What the audience actually saw was “uires confirmation.” But the original sentence was “What requires confirmation.”

That’s just amateurish.

But after adding the Humanize PPT pipeline, “What requires confirmation.” was displayed in full, and not a single pixel of the visual system was changed.
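
A bug like the clipped last line only shows up in rendered geometry, which is why a static scan of the HTML can miss it. The Python sketch below shows the kind of check that would catch it, assuming the bounding boxes of the rendered elements have already been collected (for example from a headless browser, which is not shown here). The `Box` type and the sample coordinates are hypothetical, and this is not a description of how Humanize PPT actually performs its check.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    name: str
    left: float
    top: float
    width: float
    height: float

    @property
    def right(self) -> float:
        return self.left + self.width

    @property
    def bottom(self) -> float:
        return self.top + self.height


def overlaps(a: Box, b: Box) -> bool:
    """True when the two rectangles share any interior area."""
    return a.left < b.right and b.left < a.right and a.top < b.bottom and b.top < a.bottom


def find_overlaps(boxes: list[Box]) -> list[tuple[str, str]]:
    """Return the names of every pair of boxes that overlap."""
    hits = []
    for i, first in enumerate(boxes):
        for second in boxes[i + 1:]:
            if overlaps(first, second):
                hits.append((first.name, second.name))
    return hits
```

Boxes that merely touch along an edge are not reported, so tightly packed but readable layouts pass.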

I also compiled a comprehensive list of common HTML PPT bugs and fixed them all.

For a long time, Humanize PPT took the wrong path —

obsessed with making AI produce prettier slides.

Now, what I really want to do is:

make AI-generated PPTs that can actually be used for speeches.

Pretty is important, of course.

But in a speech, what determines whether a PPT is useful

is whether the audience moves forward with you when you turn the page.

That’s Humanize PPT.

A PPT Skill built for speeches.

Pushing PPTs from “looks premium”

to “gives you something to really talk about.”

That awkward feeling when the slides look amazing

but you have no idea what to say on stage.

That frustration when the code is fine

but the projection shows half a line missing.

That exhaustion when you have to screenshot slide after slide

and tell the Agent “look at page X, line Y.”

I want none of that.

Let the Skill handle it.
