Point-E: A system for generating 3D point clouds from complex prompts

OpenAI Blog 12/16/22, 08:00 AM Models

Summary

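OpenAI introduces Point-E, a system that generates 3D point clouds from text prompts in 1-2 minutes on a single GPU by chaining a text-to-image diffusion model with an image-to-3D diffusion model. The method is one to two orders of magnitude faster to sample from than state-of-the-art approaches, though it falls short of them in sample quality. OpenAI also releases its pre-trained point cloud diffusion models, along with evaluation code and models.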


Original Article


Cached at: 04/20/26, 02:46 PM

# Point-E: A system for generating 3D point clouds from complex prompts

Source: [https://openai.com/index/point-e/](https://openai.com/index/point-e/)

While recent work on text-conditional 3D object generation has shown promising results, the state-of-the-art methods typically require multiple GPU-hours to produce a single sample. This is in stark contrast to state-of-the-art generative image models, which produce samples in a number of seconds or minutes. In this paper, we explore an alternative method for 3D object generation which produces 3D models in only 1-2 minutes on a single GPU. Our method first generates a single synthetic view using a text-to-image diffusion model, and then produces a 3D point cloud using a second diffusion model which conditions on the generated image. While our method still falls short of the state-of-the-art in terms of sample quality, it is one to two orders of magnitude faster to sample from, offering a practical trade-off for some use cases. We release our pre-trained point cloud diffusion models, as well as evaluation code and models, at [this https URL](https://github.com/openai/point-e).
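The two-stage flow described in the abstract (text prompt, then a single synthetic view, then a point cloud conditioned on that view) can be illustrated with a minimal structural sketch. Everything here is an assumption made for illustration: `text_to_image`, `image_to_point_cloud`, `generate_point_cloud`, and `SyntheticView` are hypothetical names, not part of the released code, and both stages are toy stand-ins rather than diffusion models. The second stage only mimics the general shape of iterative refinement from Gaussian noise by pulling points toward a sphere whose radius depends on the image.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass
class SyntheticView:
    """Stand-in for the single synthetic view produced by stage 1."""
    prompt: str
    pixels: List[float]  # flattened toy grayscale image, values in [0, 1)


def text_to_image(prompt: str, size: int = 16, seed: int = 0) -> SyntheticView:
    """Stage 1 placeholder (hypothetical). A real system would sample a
    text-to-image diffusion model; this stub only derives deterministic
    pseudo-random pixels from the prompt."""
    rng = random.Random(f"{seed}:{prompt}")
    return SyntheticView(prompt, [rng.random() for _ in range(size * size)])


def image_to_point_cloud(view: SyntheticView, num_points: int = 1024,
                         steps: int = 50, step_size: float = 0.2,
                         seed: int = 0) -> List[Point]:
    """Stage 2 placeholder (hypothetical). Starts from Gaussian noise and
    refines it over several steps, conditioned on the image. The toy
    'conditioning' is just a sphere radius taken from the mean pixel
    value; a real model would learn this mapping."""
    rng = random.Random(seed)
    radius = 0.5 + sum(view.pixels) / len(view.pixels)
    points = [(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1))
              for _ in range(num_points)]
    for _ in range(steps):
        refined = []
        for x, y, z in points:
            norm = math.sqrt(x * x + y * y + z * z) or 1.0
            # Move the point a fraction of the way toward the sphere surface.
            blend = 1.0 + step_size * (radius / norm - 1.0)
            refined.append((x * blend, y * blend, z * blend))
        points = refined
    return points


def generate_point_cloud(prompt: str, num_points: int = 1024) -> List[Point]:
    """Chain the two stages: text -> single view -> 3D point cloud."""
    view = text_to_image(prompt)
    return image_to_point_cloud(view, num_points=num_points)


if __name__ == "__main__":
    cloud = generate_point_cloud("a red chair", num_points=256)
    print(len(cloud), "points; first point:", cloud[0])
```

The point of the sketch is the interface between stages: the second stage never sees the text, only the generated image, which matches the conditioning described above. Swapping either stub for a real model would leave `generate_point_cloud` unchanged.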


Similar Articles

@EHuanglu: AI video has reached pixar quality you can now generate 1 min 3d animation with one prompt

Breaking the Transformer Dead-End: A Local-First 3D Point-Cloud Cognition Engine running on consumer hardware

DALL·E 3 is now available in ChatGPT Plus and Enterprise

EVA01: Unified Native 3D Understanding and Generation via Mixture-of-Transformers

@itsPaulAi: Woow Nvidia has just released a 2.6B open-source world model You can turn a single image, text prompt and trajectory in…
