The robot apocalypse draws closer! In 2015, a new milestone was achieved: an artificial intelligence creating its own art. Weird, nightmarish, surreal art full of eyeballs and psychedelic patterns. Google’s Deep Dream software was originally invented to visualize the inner workings of a Convolutional Neural Network, and scientists soon discovered that by tweaking a few equations they could make the algorithm create and modify images instead of just classifying them.
The code for Deep Dream is now public, and several web apps exist that let you try it out on your own pictures. Two popular sites are Dreamscope and Deep Dream Generator. Unfortunately, accessing the tools on either platform requires you to create a free account, but once logged in you can upload photos, choose your filter, and marvel at the results!
Remember: don’t upload photos of yourself to the internet. Use pictures of pets, landscapes, or images found on Google.
THE MATH BEHIND THE MAGIC
How can you tell that the music on the radio is a rock song or a piece of classical music? The type of instruments, the speed of the song, and the patterns in the melody are all clues that your brain uses to differentiate between musical genres. Likewise, if you’re trying to recognize a piece of fruit, you’ll pay attention to things like colour, size, shape, and weight.
In a sense, neural networks work the same way as your brain. Using a series of linked statistical equations, neural networks take input data, crunch the numbers, and decide if the spherical orange object weighing 100g is more likely a peach or an orange. To train a neural network, you show it the data from thousands of oranges and peaches, until it gets an intuitive sense of ‘orange-ness’ and ‘peach-ness’.
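To make the idea of "linked equations" a bit more concrete, here is a minimal sketch of the smallest possible network: a single artificial neuron that learns to tell oranges from peaches. Everything in it is an assumption made for illustration. The six fruit measurements are invented, the two inputs (weight in hundreds of grams and a "redness" score from 0 to 1) were picked arbitrarily, and the names FRUITS, predict, and train are hypothetical. Real neural networks chain many of these neurons together.

```python
import math

# Invented training data: (weight in hundreds of grams, redness 0..1) -> label.
# Label 1 means "orange", label 0 means "peach".
FRUITS = [
    ((1.4, 0.20), 1),
    ((1.5, 0.10), 1),
    ((1.3, 0.25), 1),
    ((1.0, 0.80), 0),
    ((1.1, 0.70), 0),
    ((0.9, 0.90), 0),
]


def sigmoid(z):
    """Squash any number into the range 0..1 so it reads like a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Return the neuron's estimate that the fruit is an orange."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(samples, epochs=2000, learning_rate=0.5):
    """Nudge the weights a little after every example (gradient descent)."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for features, label in samples:
            error = predict(weights, bias, features) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


weights, bias = train(FRUITS)

# A mystery fruit: 100 g and not very red. The answer depends on the invented data.
print("Chance it is an orange:", predict(weights, bias, (1.0, 0.2)))
```

After training, the neuron gives a number close to 1 for fruit that looks like the oranges it has seen and a number close to 0 for fruit that looks like the peaches. That number is its "sense" of orange-ness.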
Identifying an object inside a picture is extra hard because it’s not always clear what data you need to inspect. If you’re looking at a photo of a dog, the dog might fill the whole picture or it might be tucked away in a corner. You might see its full body or only its face, and the photo might be taken from the side or from above. So in order to classify the dog photo, we need a special variation of the neural network algorithm called a Convolutional Neural Network (CNN).
INSIDE THE CNN
You train a CNN by showing it hundreds of pictures of dogs until it gets an intuitive sense of ‘dog-ness’. While the algorithm still uses chains of statistical equations to make its decision, the process is a little less straightforward.
First, the CNN slices its input image up into squares. It then applies a series of different convolutions to each square in order to extract features.
You can imagine a “convolution” as a type of filter. Only instead of making photos prettier we’re trying to make them simpler for computers to process. When applying a convolution, you adjust a pixel’s value based on the value of its neighbours. A blur is a simple type of convolution where colours and edges are blended together, which is useful because it obscures details and forces us to only consider essential shapes. There are many other types of convolutions, including sharpening and edge detection.
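Here is a minimal sketch of that "adjust a pixel based on its neighbours" idea, written in plain Python so nothing needs to be installed. The tiny 5x5 picture, the function name convolve, and the two kernels BLUR and EDGE are all made up for this example. A real CNN learns its own kernel values during training instead of using hand-written ones.

```python
def convolve(image, kernel):
    """Apply a 3x3 kernel to every interior pixel of a grayscale image.

    Each new pixel is a weighted sum of the old pixel and its 8 neighbours.
    Border pixels are copied unchanged to keep the example short.
    """
    height, width = len(image), len(image[0])
    output = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    total += image[y + dy][x + dx] * kernel[dy + 1][dx + 1]
            output[y][x] = total
    return output


# Blur: every neighbour counts equally, so sharp details get smeared out.
BLUR = [[1 / 9] * 3 for _ in range(3)]

# Edge detection: a pixel scores high only if it differs from its neighbours.
EDGE = [
    [-1, -1, -1],
    [-1,  8, -1],
    [-1, -1, -1],
]

# A made-up picture: one bright dot (9) on a dark background (0).
IMAGE = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 9, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

blurred = convolve(IMAGE, BLUR)
edges = convolve(IMAGE, EDGE)

for row in blurred:
    print([round(value, 2) for value in row])
```

The blur spreads the bright dot thinly across its neighbours, while the edge kernel makes the dot stand out even more against the background around it.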
Informally, “features” are “interesting parts” of an image. Things like lines, dots, and corners help us recognize that we’re looking at a drawing of a cat instead of a drawing of a dog.
In this case, the shape of the snout, the ears, and the tail are the features that differentiate between the two animals.
By cutting up an image and applying convolutions, a CNN transforms a complex photograph into a simpler series of features. Of course, the neural network has many layers, and each one breaks up the image into smaller squares and makes the features more abstract. In the final layers, it’s not obvious to humans what shapes the CNN is studying.
The training process helps the CNN “learn” what types of features are common in the photos it’s trying to identify. This also means that each CNN is only trained to recognize a specific thing, or maybe differentiate between a handful of objects. A CNN looking for dogs will classify an apple as a “not-dog” because the network has no concept of apples, peaches, humans, music, grass, or anything other than dogs and “not-dogs”.
DEEP DREAM
Deep Dream is essentially a type of CNN that uses its input picture as a springboard into bizarre dreamscapes.
As the algorithm progresses through its layers of convolutions it encounters feedback loops. At this point, the CNN returns to an earlier step and strengthens the features that it considered “interesting”. If the algorithm has been trained to detect dogs, it’ll fixate on the “dog-like” features of the picture and make them progressively more and more “dog-like” until dogs actually start to appear in clouds and trees!
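One common way to carry out this "strengthening" is called gradient ascent: measure how strongly a feature shows up, then nudge the input so the feature shows up even more, and repeat. The sketch below is a toy version under heavy assumptions. It uses a one-dimensional row of numbers instead of a picture and a single hand-written feature detector instead of a trained CNN, and the names PATTERN, SIGNAL, dream_step, and dream are invented for the example.

```python
def activations(signal, pattern):
    """Slide the pattern along the signal and record how strongly it matches."""
    span = len(pattern)
    return [
        sum(pattern[k] * signal[i + k] for k in range(span))
        for i in range(len(signal) - span + 1)
    ]


def score(signal, pattern):
    """Total feature strength: the sum of the squared activations."""
    return sum(a * a for a in activations(signal, pattern))


def dream_step(signal, pattern, step_size=0.05):
    """Take one gradient-ascent step on 0.5 * score with respect to the signal.

    Positions where the feature already responds get pushed further
    in the direction that makes the response even bigger.
    """
    gradient = [0.0] * len(signal)
    for i, act in enumerate(activations(signal, pattern)):
        for k, p in enumerate(pattern):
            gradient[i + k] += act * p
    return [s + step_size * g for s, g in zip(signal, gradient)]


def dream(signal, pattern, steps=10):
    """Repeat the strengthening step several times."""
    for _ in range(steps):
        signal = dream_step(signal, pattern)
    return signal


# A hypothetical "bump detector" and an input with one very faint bump.
PATTERN = [-1.0, 2.0, -1.0]
SIGNAL = [0.0, 0.0, 0.1, 0.0, 0.0, 0.0, 0.0]

dreamed = dream(SIGNAL, PATTERN)
print("Feature strength before:", score(SIGNAL, PATTERN))
print("Feature strength after: ", score(dreamed, PATTERN))
```

The faint bump in the input is barely there, but because the detector reacts to it, every step exaggerates it a little more. Swap the row of numbers for a photo and the bump detector for a network's "dog-like" features, and you get the general flavour of how dogs start to emerge from clouds.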
A CNN trained on eyeballs will make eyeballs sprout from walls and faces. Another neural network trained on impressionist art will transform its inputs into glorious impressionist tributes.
So is the algorithm really “creating”? While the results look alien, the process followed by Deep Dream isn’t all that mysterious: the algorithm is just identifying and strengthening features in a mathematical, formulaic way. There are no judgment calls, no spark of inspiration. The neural network has no concept of “art” or anything beyond the shapes it’s trained to recognize. That doesn’t mean CNNs aren’t amazing, or that the creations of Deep Dream aren’t interesting and fun. But it does mean that perhaps the robot apocalypse isn’t so close after all.
Happy dreaming!
LEARN MORE
Deep Dream Generator
https://deepdreamgenerator.com/
https://deepdreamgenerator.com/about
Different Kinds of Convolutional Filters
https://www.saama.com/blog/different-kinds-convolutional-filters/
What is Artificial Intelligence?
https://kidscodecs.com/what-is-artificial-intelligence/
Computerphile video about Google Deep Dream
https://www.youtube.com/watch?v=BsSmBPmPeYQ