DALL·E 3 is being retired now that OpenAI has integrated image generation directly into ChatGPT, allowing users to create visuals without leaving the chat interface.
The company announced the update on Tuesday, saying the move aligns with its broader goal of making AI tools more accessible and versatile across different media and reinforces its presence in the AI art space.
The new update will build on DALL·E 3's image generation model. Since its launch in 2023, however, DALL·E 3 has struggled to maintain popularity among AI enthusiasts, who favored more advanced alternatives like Flux, Midjourney v6, SD 3.5, Redraft, and Reve.
Previously, OpenAI kept image and text generation separate, with GPT handling text-based tasks while DALL·E 3 focused on images. But with the new GPT-4o, everything is consolidated into a single system, effectively retiring DALL·E 3.
A Smarter and More Capable Model
"GPT‑4o image generation excels at rendering text accurately, following prompts precisely, and utilizing its built-in knowledge and chat context—including transforming or drawing inspiration from uploaded images," OpenAI stated in a blog post.
This marks another step toward OpenAI's vision of GPT-4o becoming an "omni" model, capable of handling multiple modalities (text, images, and audio) within a unified framework. According to the company, GPT-4o is significantly more capable, accurate, and intelligent than its predecessors.
During the reveal, OpenAI CEO Sam Altman showcased GPT-4o's new abilities, saying:
"We know you've been waiting, but we think it's worth it. It's such a huge step forward that the best way to explain it is just to show it."
In the demonstration, OpenAI highlighted several use cases, including manga pages explaining the theory of relativity with inputs in English and Mandarin; custom trading cards generated from personal and real photos; commemorative coins merging multiple images with transparent backgrounds; and highly detailed illustrations created from extraordinarily long prompts.
During the reveal, Altman was also transparent about some of the new model's shortcomings, one of which is the speed at which it generates images. Altman said that GPT-4o appears slower because it prioritizes the quality of the image over how quickly the image is generated.
Nascent Stage of Development
What we are seeing now is just the first phase of the release, with new features set to roll out progressively.
Comparing DALL·E 3 side by side with the new GPT-4o model also reveals stark differences: while DALL·E 3 images pop up fully formed after a long loading screen, GPT-4o renders images progressively from top to bottom in real time.
But the OpenAI team stresses that it's about more than just pretty images. The most advanced aspect of the new GPT-4o is its ability to draw on what it knows and translate that information into images.
This capability would be especially useful in education, for example in scientific diagrams or informational posters with accurately rendered text, and in image editing that keeps the subject consistent.
Built-In Safeguards and Future Expansion
Alongside the new capabilities, OpenAI has implemented guardrails to prevent misuse such as deepfakes and illegal content.
While generated images won’t feature visible watermarks, they will contain C2PA metadata to indicate their AI origin. OpenAI is also developing tools to track image provenance.
The company plans to extend the feature to its API, allowing developers to integrate image generation into their own applications. Additionally, OpenAI’s Terms of Use confirm that users will retain ownership of their generated images, subject to the platform’s policies.