You’ve no doubt noticed the plethora of AI art generators that have sprung up in the last year or so: super-smart engines that can produce pictures that look just like real photographs, or like artwork created by real humans. As time goes by, they’re getting increasingly powerful and adding more and more features; you can even find an AI art tool in Microsoft Paint now.
The DALL-E AI image model, available to ChatGPT Plus subscribers paying $20 a month, has a new trick: the ability to edit parts of an image, much as you might in Photoshop. You no longer need to regenerate an entirely new picture just because you want to change one element of it; instead, you can show DALL-E the part of the image you want to adjust, give it new instructions, and leave everything else alone.
This overcomes one of the big limitations of AI art: Each image (and video) is something completely unique, even when you use identical prompts, which makes it hard to achieve consistency across images or to fine-tune an idea. However, these AI art tools, which are based on what are known as diffusion models, still have plenty of limitations to overcome, as we’ll show you here.
Editing images in ChatGPT
If you’re a ChatGPT Plus subscriber, you can load up the app on the web or mobile and ask for a picture of anything you like: a cartoon dog detective solving a case in a cyberpunk setting, a rolling landscape of hills with a lonely figure in the mid-distance and storm clouds gathering overhead, or whatever it is. After a few seconds, you get your picture.
To edit the picture, you can now click on the generated image, and then on the Select button in the top right corner (it looks like a pen scribbling a line). You then adjust the size of your selection tool using the slider in the top left corner, and draw over the part of the image you’d like to change.
The editing interface in ChatGPT
Credit: Lifehacker
This is where the new tool is a significant step forward: You can leave most of the image untouched and regenerate just the selection. Previously, if you sent a follow-up prompt asking for one particular part of a picture to be altered, the entire image would be regenerated, and would quite probably look significantly different from the original.
When you’ve made your selection, you’ll be prompted to enter your new instructions, just for the highlighted section of the picture. As usual with these AI art tools, the more specific you are, the better: You might ask for a person to look happier (or less happy), or for a building to be colored differently. Your requested changes are then applied.
Success! ChatGPT and DALL-E change out one dog for another.
Credit: Lifehacker / DALL-E
Based on my experiments, ChatGPT and DALL-E seem to deploy the same kind of AI trickery we’ve seen in tools like Google’s Magic Eraser: Intelligently filling in backgrounds based on the existing information in a scene, while trying to leave everything outside the selection untouched.
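If you want to experiment with this kind of mask-based editing outside of ChatGPT, OpenAI’s developer API exposes a similar capability for DALL-E, where the transparent area of a mask image marks the region to be regenerated. Here’s a minimal Python sketch using the official openai package; the file names and prompt are placeholders, and the ChatGPT feature itself may work differently behind the scenes.

```python
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

# scene.png is the original image; mask.png is the same size, with the
# area you want regenerated made fully transparent. Both names are placeholders.
result = client.images.edit(
    model="dall-e-2",  # the image edits endpoint currently targets DALL-E 2
    image=open("scene.png", "rb"),
    mask=open("mask.png", "rb"),
    prompt="Replace the golden retriever with a dalmatian, same pose",
    n=1,
    size="1024x1024",
)

# The API returns a URL to the edited image rather than raw pixel data.
print(result.data[0].url)
```

This is the public API rather than the ChatGPT interface, but the underlying idea is the same: only the masked region gets repainted, and everything else is left alone.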
It’s not the most advanced selection tool, and I did notice inconsistencies around borders and object edges, which is perhaps to be expected, considering the selection is drawn freehand. A lot of the time the editing feature worked well enough, though it’s by no means reliable every time, which is no doubt something OpenAI will be looking to improve in the future.
Where AI art hits its limits
I tried the new editing tool on a variety of tasks. It did well at changing the color and position of a dog in a meadow, but less well at reducing the size of a giant man standing on the ramparts of a castle: The man just disappeared into a blur of rampart pieces, suggesting the AI was trying to paint around him without much success.
In a cyberpunk setting, I asked for a car to be dropped in, and no car appeared. In another castle scene, I requested that a flying dragon be turned around so it was facing the other way, have its color changed from green to red, and have flames added coming out of its mouth. After a few moments of processing, ChatGPT removed the dragon altogether.
Fail! ChatGPT and DALL-E erased the dragon instead of altering it.
Credit: Lifehacker / DALL-E
This feature is still brand new, and OpenAI isn’t claiming it can replace human image editing just yet, because it clearly can’t. It will improve, but these mistakes help to show where the challenges lie for this kind of AI-generated art.
What DALL-E and models like it are very good at is knowing how to arrange pixels to give a good approximation of a castle (for example), based on the vast number of castle images they’ve been trained on. However, the AI doesn’t know what a castle is: It doesn’t understand geometry or physical space, which is why my castles have turrets poking up out of nowhere. You’ll notice this in a lot of AI-generated art: buildings, furniture, and other objects that aren’t quite rendered properly.
It’s pretty white, but it’s far from “plain”.
Credit: Lifehacker / DALL-E
At their core, these models are probability machines that don’t (yet) understand what they’re actually showing: It’s why people vanish into nowhere in a lot of OpenAI Sora videos, because the AI is very cleverly arranging pixels, not tracking people. You might also have read about AI struggling to create pictures of couples of different races, because same-race couples are far more common in the image training data.
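If you’re curious what “arranging pixels” looks like in practice, here’s a deliberately oversimplified Python sketch of the denoising loop at the heart of a diffusion model. The predict_noise function stands in for the enormous trained neural network, and the step sizes are arbitrary; this is an illustration of the idea, not how DALL-E or Sora actually sample.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_noise(noisy_image, step):
    # Stand-in for the trained network: the real model predicts the noise
    # in the image at this step, based on statistical patterns learned from
    # enormous numbers of training images. This placeholder just guesses.
    return noisy_image * 0.1

# Start from pure random noise, here a 64x64 grayscale "image."
image = rng.standard_normal((64, 64))

# Step by step, subtract a little of the predicted noise. Each pass nudges
# the pixels toward something statistically plausible. At no point does the
# model reason about objects, geometry, or physical space.
for step in range(50):
    image = image - 0.1 * predict_noise(image, step)
```

The real thing uses billions of learned parameters and is steered by your text prompt rather than a one-line placeholder, but the shape of the process is the same, which is why the results look plausible rather than being physically reasoned out.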
Another recently noticed quirk is the inability of these AI art generators to create plain white backgrounds. These are incredibly smart tools in many ways, but they’re not “thinking” the way you or I would, and they don’t understand what they’re doing the way a human artist would. It’s important to bear that in mind as you use them.