@Moting284: https://x.com/Moting284/status/2065292889207558556

X AI KOLs Timeline 06/12/26, 04:39 AM Tools

Summary

Introduces a Codex skill called 小黑配图 (Xiaohei Illustration) that generates hand-drawn illustrations matched to an article's content, addressing the problem of AI illustrations that come out empty or too PowerPoint-like. The article covers installation, usage, and best practices, with emphasis on extracting cognitive anchors and inspecting quality after generation.

https://t.co/ZK8BT90vK3

Original Article

Cached at: 06/12/26, 08:59 AM

Can Write Articles but Always Stuck on Illustrations? This Skill Solves It Painlessly (Including Installation Guide)

You finish writing an article and are about to publish. You open the editor and realize there are still no illustrations.

You open an AI image generation tool and generate a dozen images. Each one looks like a poster, a PPT slide, or a decoration; none looks like it belongs in the body of an article. Generic AI illustrations easily become empty decoration: blue sky, a handshake, gears, a glowing brain. They are visually relevant but not memorable. The other failure mode is the PPT infographic: nodes, arrows, circles, title text, all too neat and too deliberate.

Then I realized the problem isn’t the tool itself, but that there’s no fixed process for creating illustrations. Every time you have to re-explain: don’t make it too cheesy, don’t pile on elements, don’t make it an infographic, don’t just be decoration. Saying these things once is fine, saying them ten times gets exhausting.

The real pain point isn’t “can’t generate images”, but having to re-explain aesthetics, structure, and constraints every single time.

What exactly is a skill? Not a longer prompt, but a reusable workflow manual.

Many people first encounter “skill” and understand it as “a longer prompt”. That’s not wrong, but it misses one layer: a prompt is a temporary request for the current conversation, while a skill is a reusable workflow. Every time you do a similar task in the future, you don’t need to re-explain aesthetics, steps, or quality standards. Just name the skill, and Codex knows to follow that workflow.

Installing a skill is also not complicated at its core: put a folder containing SKILL.md in a location Codex can scan. Think of it as giving Codex a small handbook. Codex usually only reads the title of the handbook, and only opens it to read carefully when it actually encounters a related task.

I suddenly realized: what a skill saves is not generation time, but the cost of repetitive communication.

For beginners, a skill is:

skill = a folder
core file = SKILL.md
purpose = let Codex follow a fixed workflow for specific tasks
trigger method = write "Use $skillname" in the prompt
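
As a rough illustration of the "skill = a folder with SKILL.md" idea, the shell sketch below lists every folder under a skills directory that contains a SKILL.md. The function name list_skills is hypothetical, and the default path only mirrors the ${CODEX_HOME:-$HOME/.codex}/skills location used in this article's installation commands; how Codex itself scans for skills may differ.

```shell
# Hypothetical helper: print the name of every folder under a skills
# directory that contains a SKILL.md file.
# Usage: list_skills [skills_dir]
list_skills() (
  # Assumed default location, taken from the install commands in this article.
  skills_dir="${1:-${CODEX_HOME:-$HOME/.codex}/skills}"
  for dir in "$skills_dir"/*/; do
    # Skips folders without SKILL.md, and the unexpanded glob when
    # the directory is empty or does not exist.
    [ -f "${dir}SKILL.md" ] || continue
    basename "$dir"
  done
)
```

Running list_skills with no arguments prints one skill name per line, or nothing if no skill folders are found.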

I highly recommend the “Xiaohei Illustration Skill” that frequently appears in both my short tweets and long articles now. Special thanks to @jinchenma_ai for sharing, and also to @ianneo_ai for creating it!

What can the Xiaohei Illustration Skill do, and what can’t it do?

Its goal is not to randomly generate a pretty picture for an article. Instead, it first reads the article’s judgments, processes, states, or metaphors, then turns one of those “cognitive actions” into a 16:9 white-background hand-drawn body illustration.

It defaults to using the visual IP “Xiaohei”: solid black body, white dot eyes, thin legs, expressionless but serious. Xiaohei must not just be background or an observer; it must perform the core action: push a door, break a wall, carry a stone, flip a book, take an elevator.

If Xiaohei is just standing to the side, the image is wrong. I only realized this after a few uses: Xiaohei isn’t decoration, but the driver of the structure.

For example, “The first million is hard, but compound interest accelerates after that” can be drawn as Xiaohei carrying a stone uphill vs. riding an elevator. “How a record chain becomes a system” can be drawn as Xiaohei gluing a pile of torn paper into an assembly line. “Drafts must be manually edited” can be drawn as Xiaohei crossing out text with a red pen.

But it’s not suitable for commercial-grade key visuals, brand illustrations, or traditional PPT infographics. It also can’t handle complex architecture diagrams or editable vector source files. If you need children’s cartoons, emoji sets, or multi-character complex scenes, it’s also not a good fit.

The strength of the Xiaohei Illustration Skill isn’t the quirky art style. It’s that it first helps you find the cognitive anchor points in the article.

Beginner Installation: The most critical part is copying the correct directory.

The most common trap for beginners is copying the wrong directory. After downloading the GitHub repository, what you actually need to install is not the entire root directory of the repo, but the subdirectory ian-xiaohei-illustrations/.

The complete installation command is:

git clone https://github.com/helloianneo/ian-xiaohei-illustrations.git
cd ian-xiaohei-illustrations
mkdir -p "${CODEX_HOME:-$HOME/.codex}/skills"
cp -R ./ian-xiaohei-illustrations "${CODEX_HOME:-$HOME/.codex}/skills/"

Explanation of each line:

Line 1: Download the GitHub repository to your local machine.

Line 2: Enter the repository directory.

Line 3: Create Codex’s skills folder (if it already exists, this line won’t error).

Line 4: Copy the actual skill folder into it.

If the skill doesn’t appear after installation, don’t immediately suspect installation failure. First, restart Codex. Codex can usually auto-detect new skills, but occasionally needs a restart to recognize them.

I also got stuck here the first time I installed: I had copied correctly, but Codex didn’t recognize it. After restarting, it worked.

After installation, test with this prompt:

Use $ian-xiaohei-illustrations Don't generate images yet. Please tell me how you would approach creating illustrations for a Chinese article.

If Codex responds with a description of this skill's capabilities, the installation was successful. If it does not, first check whether you copied the ian-xiaohei-illustrations/ subdirectory (not the entire repo root), whether you restarted Codex, and whether SKILL.md exists inside the installed skill folder under the skills directory.
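
The file-level part of these checks can be sketched in shell. The function name check_skill_install and its three result words are hypothetical; the "nested" case assumes the wrong-directory mistake described above, where the repo root (which has the same name as the skill subdirectory) was copied instead of the subdirectory. It cannot tell you whether Codex needs a restart.

```shell
# Hypothetical diagnostic for the "copied the wrong directory" trap.
# Prints one of: ok, nested, missing.
# Usage: check_skill_install <skill_name> [skills_dir]
check_skill_install() (
  name="$1"
  # Assumed default location, taken from the install commands in this article.
  skills_dir="${2:-${CODEX_HOME:-$HOME/.codex}/skills}"
  if [ -f "$skills_dir/$name/SKILL.md" ]; then
    echo "ok"        # SKILL.md sits directly inside the skill folder
  elif [ -f "$skills_dir/$name/$name/SKILL.md" ]; then
    echo "nested"    # the repo root was copied, not the subdirectory
  else
    echo "missing"   # no SKILL.md found; redo the copy step
  fi
)
```

For this skill the call would be check_skill_install ian-xiaohei-illustrations. A result of "nested" suggests removing the copied folder and repeating the cp step from inside the cloned repository.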

The safest workflow for beginners: install the correct directory, restart, test the invocation. For short articles, use fewer images; for long articles, find cognitive anchor points; always do a quality check after generation.

How to use it the first time: Test first, then illustrate.

Many people, after installing the skill, immediately say “Help me generate 8 images for this article.” That’s not impossible, but it tends to waste credits and often results in a set of images that look complete but don’t really match the article.

A more stable approach is to first have it do only illustration planning: after reading the article, have it tell you which paragraphs are worth illustrating, what each image should depict, and what action Xiaohei should take in each image. This step seems slow, but it actually saves the most time because it eliminates paragraphs that “don’t need to be drawn.”

The prompt can be written like this:

Use $ian-xiaohei-illustrations Don't generate images yet.
Please analyze the following article and identify where illustrations would be worthwhile. Output a shot list of around 5 images.
For each image, clearly state: which paragraph it follows, theme, core meaning, structural type, what Xiaohei is doing, and suggested Chinese annotation words.

<paste article>

It will output something like this:

Image 1: Placed after the paragraph “The first million is the hardest”
Theme: The anti-human uphill climb for the first million
Core meaning: Saving money early on is like carrying a rock uphill: slow and tiring
Structure type: Contrast (uphill climb vs. flat ground)
Xiaohei action: Xiaohei carries a huge rock, climbing uphill with difficulty
Suggested annotation words: 100万 (1 million), 爬坡 (uphill climb), 慢 (slow)

This step seems slow, but it is actually the fastest: instead of throwing the article at the AI and hoping for the best, you settle the editorial decisions first and only then move on to generation.

Once you have confirmed that the shot list looks right, you can go straight to generating illustrations for the entire article:

Use $ian-xiaohei-illustrations Generate 4 Xiaohei quirky body illustrations for the following article.
Requirements: 16:9 landscape, pure white background, black hand-drawn line art, minimal red/orange/blue Chinese handwritten annotations.
Each image should focus on only one core structure. Don't make PPT infographics.

<paste article>

Real Practice 1: Don’t force images for short texts.

For example, I once had a very short piece about “the first million.” I could have it generate 2 images: one about early savings being like carrying a rock uphill, and one about later compound interest being like riding an elevator.

Before using the skill, I would have thought, “I need at least 4 images to make it look complete,” and then forced in blue sky, gears, arrows, glowing brains. After using it, I realized: short texts need short illustrations; forcing more just dilutes the focus.

When I saw it only extracted 2 images, my first reaction was: finally, I don’t have to force 8 poster-like images.

After generation, the second image was judged the more stable of the two, while the first carried slightly more text, which made it better suited to the middle of the article than to the beginning. The lesson is that generated images are not used as-is; you still need to order and select them. In the end I combined the two images.

Real Practice 2: For long articles, first find cognitive anchor points.

For example, take an article titled “Non-tech people can also use AI to build a complete system.” The skill didn’t distribute images evenly by paragraph. Instead, it picked out the three points truly worth illustrating: how a record chain becomes a system, how features grow organically with use, and how to calibrate step by step through errors and verification.

Before using the skill, I mechanically thought “one image per 500 words.” After using it, I found: illustration isn’t about even distribution — it’s about only drawing cognitive turning points.

This decomposition method is more valuable than “help me generate three images” because it first identifies which parts of the article are worth seeing. This step is editorial work, not generation work.

Another article, “How to use Codex to turn an article into notes and tweets in ten minutes,” was broken down into “Don’t let AI summarize it,” “Three-layer processing method,” “Drafts must be manually edited,” and “Solidify into a skill.” Every anchor point was a cognitive turning point, not an average paragraph.

Many people think generation is the end.

Wrong.

After generation, don’t use the images directly. First, go through a quality checklist:

Is it 16:9 landscape on white background?
Is Xiaohei the subject of the action?
Does this image communicate only one core idea?
Are the Chinese annotations short, few, and readable?
Does it look like a PPT / flowchart / course page?
Are there any unnecessary titles, typos, or hallucinated labels?
Is it just replicating example images rather than thinking of new metaphors for the current article?

The easiest to get wrong are items 2 and 5: whether Xiaohei is the subject of the action, and whether the image looks like a PPT.

Xiaohei must perform the core action, not just be decoration. Each image should communicate only one core structure. Text should be short. If Chinese typos are severe, reduce annotations or regenerate. If it looks too much like a PPT, flowchart, or course page, change it.
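
Of the checklist items, the 16:9 requirement is the one that lends itself most readily to a mechanical check; the others call for human judgment. As a minimal sketch, the shell function below tests whether a width and height are close to 16:9. The name is_16_9 and the 2 percent default tolerance are assumptions for illustration, and reading the dimensions out of the image file is left to whatever tool you already use.

```shell
# Hypothetical check: succeed if width:height is within a tolerance of 16:9.
# Usage: is_16_9 <width> <height> [tolerance_percent]
is_16_9() (
  width="$1"; height="$2"; tol="${3:-2}"
  # Cross-multiply to stay in integer arithmetic: w/h = 16/9 <=> 9w = 16h.
  actual=$((width * 9))
  target=$((height * 16))
  diff=$((actual - target))
  # Take the absolute value of the deviation.
  if [ "$diff" -lt 0 ]; then diff=$((0 - diff)); fi
  # Accept when the deviation is at most tol percent of the target.
  [ $((diff * 100)) -le $((target * tol)) ]
)
```

For example, is_16_9 1920 1080 succeeds, while is_16_9 1536 1024 (a 3:2 image) fails. Passing 0 as the third argument demands an exact 16:9 ratio.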

What to do when things go wrong: Common problems and image fix prompts.

Xiaohei is only decoration, not the action subject. This kind of image looks complete but lacks a core action. Fix with:

Use $ian-xiaohei-illustrations This image direction is correct, but Xiaohei looks a bit like decoration.
Please keep the core meaning unchanged and regenerate: make Xiaohei the person truly driving the structure.
Make the image a bit more quirky, but still pure white, clean, and low text.

Image looks too much like a PPT. It has titles, borders, neat grids, and too many arrows. Fix with:

This image looks too much like a PPT. Please remove titles, borders, neat grids, and excessive arrows. Change it to a hand-drawn scene on plain white paper.

Another issue is too many Chinese annotations, or typos in them. Here it is faster to fix the text directly in an image editor than to ask the AI to regenerate.

Cover titles need to be accurate; post-generation layout is more reliable than image generation.

In practice, I found that covers often need a version with a stronger title. Instead of regenerating the entire image, keep the original, cover the original title area with white, and then lay out the main title and subtitle again on top.

Image generation models are good at metaphor and atmosphere, but getting them to reliably render correct Chinese titles is still less dependable than typesetting the title precisely on the image afterward.

If you need to combine images, you can ask Codex to concatenate two images vertically. But I recommend keeping each 16:9 image intact rather than forcefully combining them into one large image, because body illustrations are better suited for single-image reading.

Generation is just the first step. Cover titles, image splicing, typos, excessive text, and “PPT-ness” all require post-processing.

It doesn’t replace your aesthetics; it helps you avoid re-explaining.

Finally, who is this skill for, and who is it not for?

Suitable for:

People who write long-form Chinese articles, WeChat public account posts, blog posts, Notion documents, or AI workflow retrospectives
People who want to turn abstract judgments into concrete metaphors
People who don’t want to re-explain aesthetics and steps every time

Not suitable for:

Professional designers who need commercial-grade key visuals, brand illustrations, or traditional PPT infographics
People who need complex architecture diagrams or editable vector source files
People who need children’s cartoons, emoji sets, or multi-character complex scenes

A skill saves the cost of repetitive communication. In the end, whether to use an image in an article still requires human judgment. It can lower the barrier to adding illustrations, but it cannot replace the final aesthetic decision. You still need to know which part of the article is worth illustrating, which image is most stable, and which image needs fixing.

But for beginners, this is already enough. Illustrations go from “temporary luck” to “reusable workflow.” What’s truly worth consolidating is not a specific image, but the process of “install → invoke → generate → quality check → fix.”

The smallest first step: install once, test the invocation, create a shot list for a short article. See if it can help you turn illustrations from “relying on luck” into “editorial judgment.”

If it works, try generation and quality checking.

Reusable Prompt Templates

Below are prompts that have worked for me. You can copy and modify them directly.

Test after installation:

Use $ian-xiaohei-illustrations Don't generate images yet. Please tell me how you would approach creating illustrations for a Chinese article.

Only create illustration plan:

Use $ian-xiaohei-illustrations Don't generate images yet.
Please analyze the following article and identify where illustrations would be worthwhile. Output a shot list of around 5 images.
For each image, clearly state: which paragraph it follows, theme, core meaning, structural type, what Xiaohei is doing, and suggested Chinese annotation words.

<paste article>
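
If you reuse this planning template often, one way to avoid pasting the article by hand is to assemble the prompt from a file. The sketch below only prints the finished prompt text to standard output; the function name shot_list_prompt and its arguments are hypothetical, and how you hand the result to Codex is up to you.

```shell
# Hypothetical helper: fill the planning template with an article file.
# Usage: shot_list_prompt <article_file> [image_count]
shot_list_prompt() (
  article="$1"; count="${2:-5}"
  [ -r "$article" ] || { echo "cannot read: $article" >&2; exit 1; }
  # \$ keeps the skill trigger literal instead of expanding a shell variable.
  printf '%s\n' \
    "Use \$ian-xiaohei-illustrations Don't generate images yet." \
    "Please analyze the following article and identify where illustrations would be worthwhile. Output a shot list of around $count images." \
    "For each image, clearly state: which paragraph it follows, theme, core meaning, structural type, what Xiaohei is doing, and suggested Chinese annotation words." \
    ""
  # Append the article body where the template says <paste article>.
  cat "$article"
)
```

For example, shot_list_prompt article.md 3 > prompt.txt writes a prompt asking for around 3 images, with the contents of article.md appended after a blank line.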

Generate images for the entire article directly:

Use $ian-xiaohei-illustrations Generate 4 Xiaohei quirky body illustrations for the following article.
Requirements: 16:9 landscape, pure white background, black hand-drawn line art, minimal red/orange/blue Chinese handwritten annotations.
Each image should focus on only one core structure. Don't make PPT infographics or cute cartoons.

<paste article>

Common image fix prompts:

Use $ian-xiaohei-illustrations This image direction is correct, but Xiaohei looks a bit like decoration.
Please keep the core meaning unchanged and regenerate: make Xiaohei the person truly driving the structure.
Make the image a bit more quirky, but still pure white, clean, and low text.

This image looks too much like a PPT. Please remove titles, borders, neat grids, and excessive arrows. Change it to a hand-drawn scene on plain white paper.

Quality Checklist

Is it 16:9 landscape on white background?
Is Xiaohei the subject of the action?
Does this image communicate only one core idea?
Are the Chinese annotations short, few, and readable?
Does it look like a PPT / flowchart / course page?
Are there any unnecessary titles, typos, or hallucinated labels?
Is it just replicating example images rather than thinking of new metaphors for the current article?