@FinanceYF5: "Gossip Goblin" is arguably the world's top AI movie creator. His new film "THE PATCHWRIGHT" is a masterpiece (with over 10 million views). But no one has ever known how these works were actually made. Step-by-step workflow


Summary

AI filmmaker Gossip Goblin's new film "THE PATCHWRIGHT" has surpassed 10 million views; the thread below breaks down, step by step, how it was made.

"Gossip Goblin" is arguably the world's top AI movie creator. His new film "THE PATCHWRIGHT" is a masterpiece (with over 10 million views). But no one has ever known how these works were actually made. Step-by-step workflow 🧵👇 https://t.co/c2wNUZV9Qp
Original Article
View Cached Full Text

Cached at: 05/16/26, 07:24 AM

1/ "Gossip Goblin" is arguably the world's top AI filmmaker.

His new film THE PATCHWRIGHT is a masterpiece (over 10 million views).

But no one has ever known how these works are actually made.


Step-by-step workflow

2/ This 20-minute film took him a full 4 months.

And The Patchwright wasn’t built from scratch.

Over the past 11 months, Zach (the creator behind Gossip Goblin) has been continuously developing content within this universe, accumulating tens of thousands of Midjourney images, fully fleshed-out characters, and even a constructed language.

Many focus on the tools.

But what’s truly remarkable is the long-term worldbuilding and narrative system he’s built.

3/ The visual style follows the story arc.

The film descends from a penthouse down to the streets, so they split scenes by height: penthouse, airships, sky city, streets, market.

Before using Kling for animation, each scene first undergoes a round of Midjourney style exploration.

Then, using Nano Banana, they insert consistent characters into those “empty shots.”
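
For anyone who wants to try this step themselves: "Nano Banana" is the community nickname for Google's Gemini image model, so a minimal sketch of the character-insertion step might look like the following, assuming the google-genai Python SDK and the gemini-2.5-flash-image model. File names and the prompt are illustrative, not taken from Zach's actual pipeline.

```python
# pip install google-genai pillow
# Minimal sketch: insert a consistent character asset into an "empty shot,"
# assuming Google's Gemini image model (aka "Nano Banana").
from io import BytesIO

from google import genai
from PIL import Image

client = genai.Client()  # reads GEMINI_API_KEY from the environment

empty_shot = Image.open("sky_city_street.png")  # Midjourney environment plate (placeholder)
character = Image.open("protagonist_ref.png")   # consistent character asset (placeholder)

response = client.models.generate_content(
    model="gemini-2.5-flash-image",
    contents=[
        "Place the character from the second image into the street scene "
        "from the first image, mid-stride, matching the scene's lighting "
        "and color palette.",
        empty_shot,
        character,
    ],
)

# The model returns the composited frame as inline image data.
for part in response.candidates[0].content.parts:
    if part.inline_data is not None:
        Image.open(BytesIO(part.inline_data.data)).save("street_with_character.png")
```

The same call pattern covers the trick in the next point: pass an existing shot plus a reference image (say, Brutalist seating) and describe the blend in the prompt.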

4/ Don’t settle for AI’s default output.

The interrogation room could have been a generic “gray walls + table” template.

But they turned it into a “ring-shaped surveillance space” with an overlooking top structure.

Then they used Nano Banana to blend in reference images of Brutalist seating to add detail.

Because the default output AI gives you is also what the next thousand people will get.

5/ For this kind of work, Midjourney v7 is still stronger than v8.1.

Zach believes v8.1 is "over-normalized," while v7 gets closer to Midjourney's true ceiling.

But the real key isn’t prompt engineering.

It’s the long-term refinement of a personal profile code that matches your aesthetic, combined with mood boards and style references.

Truly distinctive creators almost all work this way.
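
As a concrete illustration: --p (personalization profile), --sref (style reference), and --sw (style weight) are real Midjourney parameters, but the profile code and URL below are placeholders, not Zach's.

```text
/imagine prompt: rain-slick night market beneath a sky city, humid fog,
anamorphic 35mm film still --v 7 --p a1b2c3d --sref https://example.com/moodboard.png --sw 300
```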

6/ Midjourney handles visual exploration; Nano Banana is more like Photoshop.

Zach’s words: “No image starts in Nano Banana—it’s just our Photoshop.”

Midjourney can achieve superior color, texture, and unique tactile quality.

If you start directly in Nano Banana, you often just get its default visual style—faster, but more generic.

7/ Lock in the “hero shots” first, then fill in the in-between frames.

For each scene, they first create a few hero shots that define the lighting and atmosphere.

Once those are set, they use Nano Banana to generate the remaining shots.

For example, the lighting change from the penthouse to the street wasn’t designed shot-by-shot—it naturally extended from a few key frames.

8/ The color and atmosphere are essentially finalized before animation begins.

They designed the wet market as a hybrid of “Hong Kong × Bangkok × Kowloon night market”: humid, chaotic, foggy.

Then they built a dedicated color and lighting reference around that concept.

Ultimately, nearly 90% of the color grading was already locked in the static images.

9/ The real masterstroke is that they are “building a world.”

They deliberately avoided common cyberpunk tropes like Japanese kanji.

The team even designed a brand-new alphabet system, with auxiliary symbols inspired by Burmese script.

The film’s opening title morphs from alien script into their self-created language.

10/ Don’t leave any detail to “default generation.”

The teapot, experimental apparatus, formaldehyde-preserved biological specimens, and even a fleeting studio texture in The Patchwright were all custom-designed.

Though most of these details never made it into the final cut, it's precisely these invisible investments that make the world feel "real" instead of randomly AI-generated.

11/ The character work is almost entirely frame-by-frame.

Nowhere in The Patchwright do they use automatic multi-shot or omni-ref features.

Because Zach wants complete control over everything in the frame.

This is the hardest path, and it may have cost them extra months.

But it’s also why this film has a truly unique style.

12/ Character consistency doesn’t rely on a single “reference sheet.”

They separately generate close-ups, full-body shots, and back views of each character.

Then they recombine them shot-by-shot using Nano Banana.

For instance, “a character walks through the market from behind” is actually: a back-view asset + market environment + other characters + prompt assembled live in Nano Banana.

Each shot is rebuilt from scratch, not templated.
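
A hedged sketch of that per-shot assembly, reusing the google-genai client from the earlier snippet; the asset file names are placeholders:

```python
# Rebuild one shot from separate assets: back-view character + environment
# plate + a secondary character, assembled by prompt.
parts = [
    "Compose one cinematic shot: the character in image 1, seen from behind, "
    "walks through the market in image 2; the vendor in image 3 stands at "
    "frame left. Keep the market's humid, foggy lighting.",
    Image.open("protagonist_back_view.png"),
    Image.open("wet_market_plate.png"),
    Image.open("vendor_ref.png"),
]
shot = client.models.generate_content(model="gemini-2.5-flash-image", contents=parts)
# Extract the inline image data exactly as in the earlier snippet.
```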

13/ Before Nano Banana Pro existed, they even did “manual retouching.”

During that 9-minute interrogation scene, the team would draw arrows, lines, and motion notes directly on images.

Then they’d feed those edited images back into Nano Banana to generate the next frame.

Hundreds of shots, almost all done one by one, the hard way.
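
That markup loop is easy to reproduce with Pillow before the frame goes back into the model; the coordinates and the note below are invented for illustration:

```python
from PIL import Image, ImageDraw

frame = Image.open("interrogation_042.png").convert("RGB")
draw = ImageDraw.Draw(frame)

# Hypothetical director's markup: an arrow for the head turn,
# plus a plain-text motion note the image model can read.
draw.line([(820, 410), (980, 380)], fill=(255, 40, 40), width=6)
draw.polygon([(980, 380), (952, 366), (956, 396)], fill=(255, 40, 40))
draw.text((820, 430), "turn head toward the window", fill=(255, 40, 40))

frame.save("interrogation_042_marked.png")
# ...then send the marked frame back into the image model to generate the next frame.
```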

14/ Every clearly audible line of dialogue in The Patchwright is voice-acted by real humans.

The protagonist is voiced by a former American opera singer and DJ—the same person from the original short.

The female roles were cast through a UK casting website; many minor characters were voiced by friends as guest performers.

ElevenLabs is convenient.

But when the budget allows, the emotion and texture of a real human performance are still something AI voices can't replace.
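
For scratch tracks and temp dialogue, though, a few lines go a long way. A minimal sketch with the elevenlabs Python SDK; the voice_id is a placeholder from their docs, not a voice used in the film:

```python
# pip install elevenlabs
from elevenlabs import save
from elevenlabs.client import ElevenLabs

client = ElevenLabs(api_key="YOUR_API_KEY")

# Generate a temp line to edit against until the real actor records it.
audio = client.text_to_speech.convert(
    voice_id="JBFqnCBsd6RMkjVDRZzb",  # placeholder voice from the ElevenLabs docs
    model_id="eleven_multilingual_v2",
    text="You shouldn't have come back here.",
)
save(audio, "scratch_line_012.mp3")
```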

15/ Animation in The Patchwright is almost entirely done with Kling.

During this 4-month intensive production, no other model matched Kling’s level of detail, stability, and control.

Some speaking shots were further enhanced with Sync for lip-sync.

But Sync Labs is actually quite cumbersome—you often have to manually edit the video and audio to align the waveform as closely as possible, then let it fill in a few mouth movements.
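
One common workaround for that alignment chore (not necessarily their exact method) is to nudge the dialogue track with ffmpeg before handing the clip to the lip-sync tool; the 0.35-second offset and file names are illustrative:

```python
import subprocess

def mux_with_offset(video: str, audio: str, offset_s: float, out: str) -> None:
    """Delay the dialogue track by offset_s seconds and mux it onto the clip,
    so the waveform roughly lines up before lip-sync processing."""
    subprocess.run(
        [
            "ffmpeg", "-y",
            "-i", video,
            "-itsoffset", str(offset_s), "-i", audio,  # shift only the audio input
            "-map", "0:v", "-map", "1:a",
            "-c:v", "copy", "-shortest",
            out,
        ],
        check=True,
    )

mux_with_offset("shot_012.mp4", "line_012.wav", 0.35, "shot_012_aligned.mp4")
```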

16/ “AI is like wet clay.”

Zach says AI doesn’t run exactly according to your intentions.

You have to understand it, guide it—not absolutely control it.

True creation is more about feeding the model the right inputs and embracing the “happy accidents.”

17/ The most underrated role in AI filmmaking is the editor.

The director handles the script, storyboard, and shots.

But the real sense of unity in an AI film is often achieved during editing.

An editor who understands AI’s flaws is incredibly valuable.

18/ The core of color grading is “90% done in pre-production, the last 10% in post.”

Most of the color tone is already decided at the static image stage.

Post-production is mostly about unifying differences between shots.

And Zach emphasizes one thing: never use super-resolution upscaling.

Because every current upscaler destroys texture and the original aesthetic.
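
One standard way to do that last 10% programmatically is Reinhard-style statistics transfer: shift each shot's Lab-channel mean and standard deviation toward a hero still. A sketch with OpenCV and NumPy; this is a common technique, not Zach's stated tooling:

```python
import cv2
import numpy as np

def match_to_hero(shot_bgr: np.ndarray, hero_bgr: np.ndarray) -> np.ndarray:
    """Reinhard-style color transfer: match Lab mean/std of a shot to a hero frame."""
    src = cv2.cvtColor(shot_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    ref = cv2.cvtColor(hero_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    for c in range(3):
        s_mean, s_std = src[..., c].mean(), src[..., c].std() + 1e-6
        r_mean, r_std = ref[..., c].mean(), ref[..., c].std()
        src[..., c] = (src[..., c] - s_mean) * (r_std / s_std) + r_mean
    return cv2.cvtColor(np.clip(src, 0, 255).astype(np.uint8), cv2.COLOR_LAB2BGR)

# File names are placeholders.
graded = match_to_hero(cv2.imread("shot_07.png"), cv2.imread("hero_penthouse.png"))
cv2.imwrite("shot_07_matched.png", graded)
```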

19/ Add grain.

In the past, grain was used to mask AI artifacts.

Now it's there because AI images are too clean, too waxy.

Grain brings back a rough, tactile feel, making the image look more like the real world than “rendered.”
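
A minimal grain pass with NumPy and Pillow; monochrome noise shared across channels reads more like film than per-channel color noise. The strength value is a taste parameter:

```python
import numpy as np
from PIL import Image

def add_grain(path, strength=12.0, seed=None):
    """Overlay Gaussian, film-like monochrome grain on an RGB frame."""
    rng = np.random.default_rng(seed)
    img = np.asarray(Image.open(path).convert("RGB")).astype(np.float32)
    noise = rng.normal(0.0, strength, img.shape[:2])[..., None]  # same grain on R, G, B
    return Image.fromarray(np.clip(img + noise, 0, 255).astype(np.uint8))

add_grain("shot_07_matched.png").save("shot_07_grained.png")
```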

20/ Sound is what truly binds everything together.

Zach hires real musicians to compose the score for key sections.

Then layers in ambient sounds, mechanical noise, market chatter, and footsteps.

AI music is good enough.

But what truly turns an AI clip into a “film” is the combination of human-composed music + AI music + custom sound effects.
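
A sketch of that layering with pydub (which needs ffmpeg on the PATH); file names, levels, and timings are placeholders:

```python
# pip install pydub
from pydub import AudioSegment

score = AudioSegment.from_file("score.wav")
ambience = AudioSegment.from_file("market_chatter.wav") - 12  # duck 12 dB under the score
footsteps = AudioSegment.from_file("footsteps.wav") - 6

mix = score.overlay(ambience)                 # ambience under the whole cue
mix = mix.overlay(footsteps, position=4_500)  # footsteps enter at 4.5 s
mix.export("scene_mix.wav", format="wav")
```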

21/ The final point.

Today, almost everyone uses the same set of AI tools.

What truly sets people apart is never the model.

It’s whether you’re willing to carefully polish the details that “no one will notice.”

A sign, a cup, a biological specimen in the corner.

99% of people skip them.

But because someone took the time to do them right, it feels like a real world.


Original thread by @PJaccetturo.
