As artificial intelligence (AI) improves, artists are finding themselves in unprecedented territory. Realistic images are being made in seconds; millions of them are created each day; and the images are being entered into and winning art competitions. But none of them are being made by humans.
We spoke to Ahmed Elgammal, a professor of computer science at Rutgers University, to find out about the rise of AI art and what it means for human creativity in the digital era.
How do AI image generators work?
Five years ago, there was an advancement in AI known as Generative Adversarial Networks (GANs). It took images and tried to generate similar results. Give it images of cats and it would return completely new versions to match. This was revolutionary and many artists started using it.
Then came a newer generation that used text to generate images, giving more control over what was being generated. This worked by training the model on lots of images and their accompanying text captions so it could learn how the words relate to the images.
So in an image of a bird on a tree, the AI guesses where the tree and bird are and the network tells it if it’s correct. By doing that for billions of images, the AI figures out what words relate to what images.
Why do these AI models struggle with complicated shapes?
These AIs do struggle with small details. They’ll have a hard time generating something small because they’re trained to optimise a loss function, which is a criterion that encourages the model to optimise the whole image, trying to get the majority of it correct.
The AI neglects the small details that we, as humans, are tuned to catch – a hand with four fingers or a three-legged person, for example. To the AI, there’s no difference between this and any other small details in the background.
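To make the point about whole-image optimisation concrete, here is a minimal sketch in Python. It assumes a plain mean squared error over every pixel, which is only a stand-in for whatever loss a real image generator uses. The 64 x 64 image, the patch positions and the helper names (`mean_squared_error`, `corrupt_patch`) are hypothetical and chosen purely for illustration.

```python
def mean_squared_error(predicted, target):
    """Average squared per-pixel difference over the whole image."""
    total = 0.0
    count = 0
    for pred_row, target_row in zip(predicted, target):
        for p, t in zip(pred_row, target_row):
            total += (p - t) ** 2
            count += 1
    return total / count


def corrupt_patch(image, top, left, size):
    """Return a copy of the image with a size x size patch inverted."""
    copy = [row[:] for row in image]
    for r in range(top, top + size):
        for c in range(left, left + size):
            copy[r][c] = 1.0 - copy[r][c]
    return copy


SIZE = 64
# Hypothetical target: an all-black image with pixel values in [0, 1].
target = [[0.0] * SIZE for _ in range(SIZE)]

# Assumption: the 4 x 4 patch near the centre stands in for a "hand",
# and the 4 x 4 patch in the corner stands in for background detail.
hand_wrong = corrupt_patch(target, top=30, left=30, size=4)
background_wrong = corrupt_patch(target, top=0, left=0, size=4)

loss_hand = mean_squared_error(hand_wrong, target)
loss_background = mean_squared_error(background_wrong, target)

print(loss_hand, loss_background)
```

Under these assumptions, getting the "hand" completely wrong costs exactly as much as getting an equally sized patch of background wrong, and both change only 16 of the 4,096 pixels. A loss averaged over the whole image gives the model little reason to treat the hand as special.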
How much energy does it take to make these sorts of images?
The training process for these models requires a lot of energy. You need to run them on graphics processing units (GPUs) for weeks while they process billions of images. And after that, you then need to rewrite them many times over in order to optimise the process.
But even after training the models, they need to be running constantly on a GPU with millions of processors and these are very energy-consuming devices. Keeping them running for the lengths of time required means significant energy consumption and a significant environmental impact.
These models learn from the internet and suffer from the biases and misinformation they learn there. Should there be a designated list of information, or should the AIs have access to everything?
How can we control the data given to an AI? There are different opinions on everything from politics to religion, lifestyle and everything in between. We can’t censor the data it’s given to support certain voices.
The AI naturally must reflect all opinions and viewpoints in the world. This will come with a lot of misinformation but that is the world we live in. It’s the same way we look at feeds on social media: you filter out or guess that this is false information, or this is true information.
Right now, AI has no way to tell fact from fiction. Everything is just words, and once we start talking about facts, that’s a big problem. What are the correct facts and opinions?
There aren’t many laws in this field. Does this need to change, especially around copyright?
The copyright problem comes with the current generation of imagery tools that are mainly trained on billions of images. However, this wasn’t an issue a couple of years ago, when artists used to have to use AI through certain models that were trained using the artist’s own images.
The copyright issue comes with the use of millions of images taken from the internet without the consent of the artists. The problem is, while it is unethical, it isn’t necessarily breaking any copyright laws. It’s making transformative versions of the image, not a direct derivative, so under any copyright law this would not be a problem.
We’re going to have to remind everybody that this is not the way it’s supposed to be. You can use AI with your own images, without stealing other people’s work.
Can AI learn the creativity and emotion required to make great art?
The current generation of AI is limited to copying the work of humans. It must be controlled largely by people to create something useful. It’s a great tool but not something that can be creative itself.
We must be conscious about what’s happening in the world and have an opinion to create real art. The AIs simply don’t have this.
A couple of years ago, we used AI in a project to generate Beethoven’s 10th Symphony. We trained these AIs on lots of classical music and then they looked at the sketches that Beethoven left for the symphony to generate compilations of these notes.
That was a great example of AI as a tool but it uses no creativity. This is what is happening in the world right now with AI. It’s a creative process that’s mainly human and AI is following the rules to generate content for them.
Is AI just the latest art movement, similar to impressionism or modernism?
In the last five years we have seen this movement, but I think it has ended really. The early artists had specific aesthetics in their work that were uncanny and unhuman-like. It had a specific look and style, but I think now it’s all becoming more photorealistic.
It’s very good for realistic images but it’s lost this ability to be surprising and uncanny and have the surreal effects.
I think that era has gone. It’s more of a tool for everybody to generate a photo-realistic image, a graphic design and logo, not a unique piece of art.
About our expert, Ahmed Elgammal
Ahmed Elgammal is a professor in the Department of Computer Science at Rutgers University, where he leads the Art and Artificial Intelligence Laboratory. His research investigates AI art generation and computational art history. Elgammal has published over 180 peer-reviewed papers in journals including Pattern Recognition and Computer Vision and Image Understanding.