**TL;DR** Ernie Image, a new open-source model, beats the previous champ Zage in text rendering, prompt fidelity and realism; here’s a ComfyUI setup guide to run it free and unlimited on your own GPU.
## Ernie Image: The New Open-Source King?
Ernie Image is a newly released diffusion model that tops public leaderboards. It handles dense prompts, legible text and a wide range of styles (comics, photo-real shots, posters, infographics, abstract art) without the plastic look early Flux versions had.
## Head-to-Head: Ernie vs Zage (Former #1)
The full prompts are shown on screen in the source video; the table below summarizes each round.
| Task | Winner | Why |
|---|---|---|
| 1998 retro photo of a painter photographing a screen (recursive selfie) | Ernie | better film grain, recursive concept |
| Kyoto desktop diorama with Kinkaku-ji, torii gates, kimono walkers | Ernie | gates line up, human scale consistent |
| Ballet studio + rabbit + elephant outside window | Ernie | reflections, props, text all accurate |
| Long diary paragraph | Ernie | 1 missing word, 1 typo; Zage hallucinates lines |
| Bakery-window multi-element poster | Ernie | text repeats but looks real; Zage plastic |
| Holiday cookie-swap poster | Zage | sponsor logo & cookie pile more complete |
| Dark-mode UI infographic | Ernie | every icon/label correct; Zage garbled |
| B&W comic page with panels | Ernie | panel order, speech bubbles, reading flow perfect |
| Taj Mahal half-photo half-sketch | tie | Ernie label readable, Zage framing better |
| Mirror pixel-art reflection | Zage | only reflection pixelated; Ernie blurs whole person |
| Manet impressionism | tie | both too sharp, weak brush feel |
| Minimal ink-wash tiger | tie | both catch negative space |
| Stippled flat design | tie | both build image with dot size |
| Anatomy stress-test (yoga + explosion) | Zage | Ernie twists limbs, Zage hits pigeon pose |
| Palm + sole shot | tie | both deliver; Ernie tub pose floaty |
| 11:15 clock + full wine glass | all fail | even closed-source giants can’t count |
Scoreboard: Ernie wins 7, Zage wins 3, 5 ties, and 1 round that every model fails.
## Official Benchmarks
On the open-source chart, Ernie Image now sits first overall, ahead of Zage, Qwen and Flux2-Klein, and within striking distance of the closed-source leader Nano-Banana-2.
- **Ernie-Image-Base**: highest quality but slower, needing 3–5× as many sampling steps
- **Ernie-Image-Turbo**: almost identical look, real-time speed; recommended for daily use
(“PE” in the table means the built-in prompt enhancer is on.)
## Local Install: Free, Unlimited, Offline
### Hardware
A single model is ≈ 16 GB; with the text encoder and VAE you need ~20 GB of VRAM. Quantized versions are on the way and should lower the bar to 8 GB.
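As a back-of-envelope check on these figures, the sketch below scales the weight footprint by bit width. It assumes the 16 GB figure refers to 16-bit weights and that only the main model gets quantized while the text encoder and VAE stay as they are; neither assumption is confirmed by the guide, and activation memory is ignored entirely.

```python
MODEL_GB = 16.0                   # from the guide: single model is about 16 GB
TOTAL_GB = 20.0                   # from the guide: model + text encoder + VAE
EXTRAS_GB = TOTAL_GB - MODEL_GB   # implied share of text encoder + VAE


def estimate_total_gb(bits: int) -> float:
    """Rough weight footprint in GB if the main model is stored at `bits` per weight.

    Assumption (not from the guide): MODEL_GB is a 16-bit checkpoint, so the
    footprint scales linearly with bit width. The text encoder and VAE are
    left unquantized. Activations and framework overhead are not counted.
    """
    if bits <= 0 or bits > 16:
        raise ValueError("bits must be between 1 and 16")
    return MODEL_GB * bits / 16 + EXTRAS_GB


if __name__ == "__main__":
    for bits in (16, 8, 4):
        print(f"{bits:>2}-bit: ~{estimate_total_gb(bits):.0f} GB of weights")
```

Under these assumptions the 8-bit and 4-bit estimates come out to roughly 12 GB and 8 GB of weights. Real usage depends on resolution, offloading and how the quantized releases are actually packaged.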
### One-Shot Setup (Windows / macOS / Linux)
1. Install ComfyUI. The Ernie nodes already ship with the latest build.
2. Download:
   - [Ernie-Image-Turbo.safetensors](https://huggingface.co/ErnieImage/turbo)
   - the text encoder and VAE from the same repo
3. Drop everything into `ComfyUI/models/Ernie/`.
4. Launch ComfyUI, pick the “Ernie Turbo” node, type a prompt in Chinese or English, and hit Generate.
No internet connection is required after the download. A 1024×1024 image uses 6–8 GB VRAM; an RTX 3060 12 GB handles it comfortably.
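Before launching, the folder layout from step 3 can be sanity-checked with a short script. This is a minimal sketch: the checkpoint name is taken from the download link, while the function name `check_ernie_folder` is made up for illustration, and the idea that the text encoder and VAE arrive as two separate files in the same folder is an assumption, since the guide does not give their file names.

```python
from pathlib import Path

# File name taken from the download link in the setup steps.
CHECKPOINT = "Ernie-Image-Turbo.safetensors"


def check_ernie_folder(comfy_root: str) -> list[str]:
    """Return a list of problems found; an empty list means the layout looks right.

    `comfy_root` is the ComfyUI install directory (the folder containing `models/`).
    """
    model_dir = Path(comfy_root) / "models" / "Ernie"
    if not model_dir.is_dir():
        return [f"missing folder: {model_dir}"]

    problems = []
    if not (model_dir / CHECKPOINT).is_file():
        problems.append(f"missing checkpoint: {CHECKPOINT}")

    # Assumption: the text encoder and VAE are two further files in this folder.
    others = [p for p in model_dir.iterdir() if p.is_file() and p.name != CHECKPOINT]
    if len(others) < 2:
        problems.append(
            f"expected text encoder and VAE next to the checkpoint, "
            f"found {len(others)} other file(s)"
        )
    return problems
```

Calling `check_ernie_folder("ComfyUI")` from the directory that contains your install returns an empty list when everything is in place, or a list of messages saying what is missing.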
## Roadmap
- Ernie Image Editor (inpaint / outpaint) landing soon
- 8-bit and 4-bit quants in testing; 8 GB cards supported within weeks
Source: [https://www.youtube.com/watch?v=A_nAU8h9YOY](https://www.youtube.com/watch?v=A_nAU8h9YOY)