Inside image generation’s Renaissance moment — the OpenAI Podcast Ep. 19

Summary

OpenAI researcher Kenji Hata and product lead Adele Li detail the major upgrades to ImageGen 2.0 on the podcast, including emergent capabilities such as text rendering, multilingual support, photorealistic quality, and 360-degree panoramas. Usage grew over 50% within two weeks of launch, with over 1.5 billion images generated weekly on ChatGPT.

Original Article

Cached at: 05/14/26, 06:40 PM

**TL;DR:** OpenAI researcher Kenji Hata and product lead Adele Li detail ImageGen 2.0 in a podcast: a comprehensive leap in text rendering, multilingual support, and photorealism, plus how users are creating everything from panoramic walkthroughs to viral "MS Paint" style content.

## From DALL-E to ImageGen 2.0: A Renaissance Leap

If DALL-E was the Stone Age, ImageGen 2.0 is the Renaissance. It’s not just artistically superior; it fuses science, art, architecture, and more into a single image. After internal review, the team confirmed it’s significantly better than ImageGen 1. Two weeks post-launch, usage grew over 50%, with more than 1.5 billion images generated weekly on ChatGPT.

## Product & Research Background

### Adele Li: From Investment to Product

Adele joined OpenAI over two years ago, having previously worked in private equity and at Redpoint Ventures for three years, investing in AI and software companies. She started on data and compute infrastructure before moving to product, focusing on ImageGen for the past six months. She sees product management as doing what needs to be done, and ImageGen let her collaborate with researchers to identify market gaps and opportunities. The market today looks completely different from when ImageGen 1.0 launched a year ago: multiple image generation tools exist, and ChatGPT itself has evolved.

### Kenji Hata: From Audio to Image

Kenji joined OpenAI about two years ago, initially working on audio projects. He gradually contributed to ImageGen 1.0 pre-launch work and eventually went full-time into image generation. He notes that during internal evaluations, early checkpoint samples compared to ImageGen 1 showed an enormous leap in photorealism, shifting from the glossy, idealized magazine-cover style to images that truly look like great photographs.

## Step-Change Improvements in Model Capabilities

### Text Rendering & Multilingual Support

ImageGen 2.0 improves across multiple dimensions:

- **Text Rendering**: Fidelity of on-screen text is dramatically better; words are meaningful and correctly spelled.
- **Multilingual Support**: Dedicated effort to support many languages, with strong reception from Asian and European users.
- **Photorealism**: Addressing feedback that previous models didn’t look real enough or altered faces/bodies, the goal was to make images feel more like the users themselves.

These capabilities come from the model absorbing world knowledge and being able to reflect it back visually to users.

### Variable Binding & Object Counting

From DALL-E 3 to GPT Image 1, the number of random objects in a grid jumped from ~5-8 to ~16; Image 1.5 consistently hit 25-36; ImageGen 2.0 can easily surpass 100. An internal standard test: ask GPT to list 100 random objects and pass them to the image generator, which gets nearly all of them right.

### Emergent Capabilities: 360° Panoramas

The model can render images at any aspect ratio, leading people to create extremely long, stunning panoramic views and slender bookmarks. With 360° style rendering, users can explore these images in a 360° world. This feature is integrated into ChatGPT web and mobile versions.

## User Use Cases & Viral Trends

### Productivity & Creativity Side by Side

Image generation was once seen as purely entertainment or non-productive, but now real productivity gains are visible: infographics, greatly improved text quality, and more productive use cases. People use the model to make fun memes, images for five-year-olds, professional consulting presentations, and to turn popular photos into rough MS Paint versions. Creating imperfect things actually requires high intelligence; users value authenticity, imperfection, and nostalgia.

### New Forms of Self-Expression

Self-expression through AI is an area the team is very excited about.
The model’s understanding of aesthetic beauty shines across outputs, greatly expanding the range of possible outputs; many use cases exceeded the team’s expectations.

## Model Efficiency & Post-Training

### Speed & Token Efficiency

From the DALL-E era (“tell us what you want and check back in an hour”) to real-time generation in ChatGPT, the team has learned with each release how to produce great images with fewer tokens. The post-training process considers not only world knowledge, scientific concepts, and math, but also what kind of taste resonates with users and how to make outputs beautiful and realistic.

### Kenji’s Personal Benchmark

Kenji often uses the “grid test”: generate a grid of 100 random objects, almost all of which come out correct. He also recalls asking early models (Ada, Babbage, Curie) to list 100 sci-fi books; some started repeating at book 22, helping measure capability limits.

### Adele’s Personal Evaluation

Adele has her own “me-me-me” evaluation: 100 photos of herself, friends, and family, placing each person in a funny pose. She makes cards or birthday images for nearly everyone. She finds this a great test because she knows faces best, and it also checks whether ChatGPT understands context: does it remember the user has siblings, parents, their preferences, and personalize the image accordingly?

---

**Source:** https://www.youtube.com/watch?v=bH2nP-aCFjk
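
The two informal checks described under Kenji’s Personal Benchmark (the 100-object grid test and the "list 100 sci-fi books" repetition check) could be scored along the following lines. This is only a minimal sketch, not OpenAI’s internal harness: every function name here is hypothetical, the prompt wording is invented for illustration, and the `recognized` labels are assumed to come from a human reviewer or some separate recognizer, which the podcast summary does not specify.

```python
import math
from typing import Iterable, Optional


def grid_shape(n_items: int) -> tuple[int, int]:
    """Return (rows, cols) of the smallest near-square grid that holds n_items."""
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    cols = math.isqrt(n_items - 1) + 1  # integer ceil(sqrt(n_items))
    rows = -(-n_items // cols)          # ceiling division
    return rows, cols


def build_grid_prompt(objects: list[str]) -> str:
    """Build an illustrative text prompt asking for one object per grid cell."""
    rows, cols = grid_shape(len(objects))
    numbered = "; ".join(f"{i}. {name}" for i, name in enumerate(objects, start=1))
    return (
        f"Draw a {rows}x{cols} grid with one object per cell, "
        f"in reading order: {numbered}"
    )


def score_grid(expected: list[str], recognized: list[Optional[str]]) -> float:
    """Fraction of cells whose recognized label matches the expected object.

    `recognized` holds one label per cell (None if the cell was unreadable),
    assumed to be supplied by a reviewer or a separate recognizer.
    """
    if not expected or len(expected) != len(recognized):
        raise ValueError("expected and recognized must be non-empty and equal length")
    hits = sum(
        1
        for want, got in zip(expected, recognized)
        if got is not None and want.strip().lower() == got.strip().lower()
    )
    return hits / len(expected)


def first_repeat_position(items: Iterable[str]) -> Optional[int]:
    """1-based position of the first item that repeats an earlier one, else None."""
    seen: set[str] = set()
    for position, item in enumerate(items, start=1):
        key = item.strip().lower()
        if key in seen:
            return position
        seen.add(key)
    return None
```

For a 100-item list, `grid_shape(100)` gives a 10 by 10 layout, and `first_repeat_position` applied to a model’s book list reports the position at which it first starts repeating itself, which is the kind of limit the anecdote about early models describes.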
