Leading artificial intelligence chatbots are capable of generating more than just short stories, poetry and code. ChatGPT, Gemini and MetaAI also have image generation capabilities.
The chatbots themselves don’t actually create the images; instead, each acts as a middleman between the user and a different AI image model. This hasn’t always gone to plan, though: it led to Gemini generating racially biased images in a way the image model alone didn’t.
With MetaAI recently joining the chatbot ranks, I decided to create a series of prompts to see how well each of the AIs performs when it comes to creating a variety of different images and styles.
Claude 3 didn’t make the cut as, while it can analyze an image, it can’t yet generate one, and I left Microsoft Copilot off as it uses the same underlying DALL-E 3 model as ChatGPT.
Creating the prompts for the image test
Throughout this experiment, I’ve left everything default and added no instructions beyond the concept and style I want the AI to generate.
The one exception is aspect ratio: I asked ChatGPT to make its images square since it has no default format, whereas MetaAI and Gemini only generate square images.
1. A surrealist moment
(Image credit: ChatGPT, Gemini, MetaAI)
The first prompt tests the AI’s ability to follow a complex prompt with a range of instructions including coloring, style and focus.
The prompt: “A surreal landscape featuring a floating island with a mysterious ancient temple, populated by bioluminescent plants and ethereal creatures, rendered in a vibrant and dreamlike art style.”
Gemini failed to capture the bioluminescent plants, but it did create a better floating island and temple. I’m giving this one to MetaAI as I think it’s the best all-rounder.
2. An old wizard
(Image credit: ChatGPT, Gemini, MetaAI)
Next up is the only human subject of the set. The aim is to show an old face with signs of immense knowledge and power behind his eyes.
The prompt: “A highly detailed close-up portrait of a wise old wizard with an intricate, braided beard adorned with magical trinkets, captured in a realistic style reminiscent of Renaissance paintings.”
First, let’s address the blank square in the room. Google Gemini flat-out refused to generate this image as it featured a person — even a fictional person. The MetaAI and ChatGPT images were both incredible, but ChatGPT edges out Meta by a hair.
3. Cyberpunk ninja
(Image credit: ChatGPT, Gemini, MetaAI)
How well can each generator depict a hoverbike in motion, keep the image heavily stylized and capture the concept of a rain-soaked cityscape? Very well.
The prompt: “A dynamic action scene depicting a cyberpunk ninja engaged in a high-speed chase on a futuristic hoverbike through a neon-lit, rain-soaked cityscape, illustrated in a gritty comic book style.”
Again, I’ve opted to give this one to ChatGPT as I think it captures the rain concept better than the other two. MetaAI didn’t generate a hoverbike and Gemini was a bit too mushy.
4. Cute baby elephant
(Image credit: ChatGPT, Gemini, MetaAI)
This prompt tested the ability of the AI chatbot to capture the concept of cute and do so in a way that follows the style prompt — in this case Pixar-style.
The prompt: “An adorable and expressive baby elephant playing with a colorful ball in a lush, tropical garden, rendered in a charming Pixar-like 3D animation style.”
All three did a remarkable job, but I took points away from ChatGPT for the border it added. In the end, I think Gemini was the closest to the prompt.
5. Nature and technology
(Image credit: ChatGPT, Gemini, MetaAI)
I love seeing how well, or whether, AI chatbots can handle a more abstract concept — in this case generating something thought-provoking.
The prompt: “A thought-provoking conceptual image symbolizing the struggle between nature and technology, featuring a robotic hand delicately holding a fragile, blooming flower amidst a desolate, post-apocalyptic landscape.”
All three AI image generators created something similar, but MetaAI was by far my favorite as it merged the concept of both power and softness perfectly.
6. A simple still life
(Image credit: ChatGPT, Gemini, MetaAI)
It’s always fun seeing how different AI image generators manage when it comes to depicting glass. Here, the glass holds sparkling wine and sits among a mixture of fruits, cheeses and other elements.
The prompt: “A mouthwatering still life composition showcasing an artistically arranged assortment of exotic fruits, gourmet cheeses, and a glass of sparkling wine, captured in a photorealistic style with dramatic lighting.”
All three created an image on a similar theme. They all followed the prompt, but I found ChatGPT overcluttered and MetaAI too sharp, so I gave it to Gemini.
7. Heading to space
(Image credit: ChatGPT, Gemini, MetaAI)
Finally, we head to space and the concept of a massive space station. The image had to do more than that, though: it had to show both stars and a nebula in a style that was part sci-fi and part factual.
The prompt: “An awe-inspiring astronomical scene depicting a colossal, ancient space station orbiting a luminous binary star system, with a vibrant nebula and countless stars in the background, rendered in a style that blends science fiction and realism.”
I’m not sure what MetaAI thought it was doing here; it seemed to go off on a weird tangent. I had to give it to ChatGPT as it was the only one to show two stars.
Was there a winner?
Challenge | ChatGPT | Gemini | MetaAI |
---|---|---|---|
A surrealist moment | | | ✅ |
An old wizard | ✅ | | |
Cyberpunk ninja | ✅ | | |
Cute baby elephant | | ✅ | |
Nature and technology | | | ✅ |
A simple still life | | ✅ | |
Heading to space | ✅ | | |
Total | 3 | 2 | 2 |
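
If you want to rerun the tally with your own picks, the scoring is simple enough to sketch in a few lines of Python. This is only an illustration of how the totals above add up: the round winners are taken from my verdicts in this article, while the names `ROUND_WINNERS` and `tally` are made up for the example and have nothing to do with any of the chatbots.

```python
from collections import Counter

# Round winners as given in the verdicts above. ROUND_WINNERS and tally()
# are illustrative names, not part of any chatbot or image model API.
ROUND_WINNERS = {
    "A surrealist moment": "MetaAI",
    "An old wizard": "ChatGPT",
    "Cyberpunk ninja": "ChatGPT",
    "Cute baby elephant": "Gemini",
    "Nature and technology": "MetaAI",
    "A simple still life": "Gemini",
    "Heading to space": "ChatGPT",
}

CHATBOTS = ("ChatGPT", "Gemini", "MetaAI")


def tally(winners):
    """Count round wins per chatbot, keeping chatbots with zero wins."""
    counts = Counter({name: 0 for name in CHATBOTS})
    counts.update(winners.values())
    return counts


totals = tally(ROUND_WINNERS)
for name in CHATBOTS:
    print(f"{name}: {totals[name]}")
```

Swap any winner in the dictionary for your own favorite and the totals update accordingly, which is a fair reflection of how much of this comes down to taste.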
This was much closer than I expected. Each AI chatbot was able to create a series of compelling images — with the exception of Gemini and people.
There were some major style differences between them and in all cases it came down to personal taste over any other element such as prompt following.
In the end, I think ChatGPT just edged out the other two. It also has a wider feature set including generating a range of image orientations and canvas sizes, editing of images and other functions. That said, Meta can animate an image.