Harry Potter and the Predictive Keyboard

Almost ten years after the release of the seventh Harry Potter book, the magical story of a boy who discovered he was a wizard has finally been updated with a new chapter. This text, however, wasn’t written by J. K. Rowling, but by an artificial intelligence. It starts like this:

“The castle grounds snarled with a wave of magically magnified wind.” Not bad, right? Then it gets weird: “Ron was standing there and doing a kind of frenzied tap dance. He saw Harry and immediately began to eat Hermione’s family.”

The great minds behind “Harry Potter and the Portrait of What Looked Like a Large Pile of Ash” are from Botnik, an online community of writers and developers who combine art and tech to “create strange new things.” To write this text, the authors used a predictive keyboard trained on all seven Harry Potter novels. You can read the full masterpiece here.

The magic of statistics

When texting on a smartphone, you’ve probably seen “word suggestions” pop up above the tiny digital keyboard. If you clicked on the first suggestion, and kept clicking until you had a full sentence, you’d be following the same process as the writers at Botnik. Except while phones analyze a person’s text messages, Botnik’s algorithm studied a specific author.

Predictive keyboards are built using a mathematical system called Hidden Markov Models (HMMs). The term may be fancy, but the logic boils down to this: every word in a sentence depends upon the previous word. Or, more accurately, upon several previous words.

Let’s say you’re reading a book and the last word on the page is “cold”. What do you think comes next? “Feet”, “wind”, “snow”, and “night” are good candidates, whereas “sun”, “fire”, “cookies”, and “stegosaurus” are less likely, although not impossible.

Now, what if the last two words on the page are “a cold” versus “several cold”? You’d probably tweak your guess by suggesting “foot” in the first case and “feet” in the second.

In short, the more words we analyze, the more accurate our guess can be. But how many words is the right number? Too many words make long, slow programs, and too few words create algorithms that get stuck: “I want a pony because I want a pony because I want a pony…” Some very smart scientists tackled this problem and decided that the magic number was 3.
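The “stuck” behaviour is easy to reproduce. The Python sketch below is an illustration, not Botnik's actual code: the training text and the function names build_model and always_pick_first are made up. It looks at only one previous word and always accepts the top suggestion, the way you might keep tapping the first word on a phone.

```python
from collections import Counter, defaultdict


def build_model(text, context_size):
    """Count which word follows each run of `context_size` words."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for i in range(len(words) - context_size):
        context = tuple(words[i:i + context_size])
        followers[context][words[i + context_size]] += 1
    return followers


def always_pick_first(followers, start, length):
    """Keep accepting the top suggestion until `length` words are written."""
    output = list(start)
    context_size = len(start)
    while len(output) < length:
        context = tuple(output[-context_size:])
        if context not in followers:
            break  # the model has never seen this context
        best_word, _count = followers[context].most_common(1)[0]
        output.append(best_word)
    return " ".join(output)


# Made-up training text, purely for illustration.
training_text = (
    "i want a pony because i want a friend "
    "and i want a pony because i like ponies"
)

one_word_model = build_model(training_text, 1)
looping_sentence = always_pick_first(one_word_model, ("i",), 12)
print(looping_sentence)
# i want a pony because i want a pony because i want
```

With a single word of context, “because” always leads back to “i”, so the sentence circles forever.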

Next problem: while both “wind” and “snow” are great candidates, how does the predictive keyboard know which is the absolute best?

The first step in creating your HMM is to chop up the training text into chunks. The sentence “unicorns have magical healing powers” becomes “unicorns have magical”, “have magical healing”, and “magical healing powers”. Next, you group identical segments together and count which words commonly follow them. So if the segment “unicorns have magical” appears in your training text 2,000 times, you might get results like this:

And: 320 / 2000 (16%)
Friends: 100 / 2000 (5%)
Horns: 1000 / 2000 (50%)
Healing: 580 / 2000 (29%)

So when this predictive keyboard sees the phrase “unicorns have magical”, it’ll choose “horns” as the next word in the sentence. Simple statistics!
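The chop, count, and choose steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are invented, and the training sentences are made up so that the counts match the example numbers above (the words after “healing”, “and”, and “friends” are pure invention).

```python
from collections import Counter, defaultdict


def chop_into_chunks(sentence, size=3):
    """Slide a window of `size` words across the sentence."""
    words = sentence.lower().split()
    return [tuple(words[i:i + size]) for i in range(len(words) - size + 1)]


def count_followers(sentences, size=3):
    """For every chunk, count the words that come right after it."""
    followers = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i in range(len(words) - size):
            followers[tuple(words[i:i + size])][words[i + size]] += 1
    return followers


def suggest(followers, chunk):
    """Return the most common next word and its share of all sightings."""
    counts = followers[chunk]
    word, count = counts.most_common(1)[0]
    return word, count / sum(counts.values())


chunks = chop_into_chunks("unicorns have magical healing powers")

# Made-up training text: 2,000 sentences that reproduce the example counts.
training = (
    ["unicorns have magical horns"] * 1000
    + ["unicorns have magical healing powers"] * 580
    + ["unicorns have magical and shiny manes"] * 320
    + ["unicorns have magical friends everywhere"] * 100
)
model = count_followers(training)
best_word, share = suggest(model, ("unicorns", "have", "magical"))

print(chunks)
print(best_word, share)  # horns 0.5
```

A real keyboard would show the top few candidates rather than only the winner, which is how the Botnik writers got to choose between them.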

Of course, real texts have more varied answers. Plus, the Harry Potter books contain over 1 million words — that’s a lot of segments to analyze!

Machine learning to the rescue!

Artificial Neural Networks (ANNs) are all the hype in computer science. They can recognize complex patterns like stop signs, individual human faces, or signs of cancer in a patient. They can also be used to create approximations for complex systems like Hidden Markov Models, often much faster than calculating the models directly.

Imagine that you’re trying to explain the concept of an elephant to someone who’s never seen one before, like an alien, or a toddler. You could try listing characteristics: elephants are big and grey, they have floppy ears, four legs, and they live in Africa and Asia. This would be the “traditional” approach to pattern recognition, and it often runs into trouble. For starters, baby elephants aren’t big. Plastic toy elephants can be pink and blue and green, and many elephants live in zoos all over North America and Europe. So now our definition looks like this:

Elephants are big (but sometimes small) and grey (but not necessarily), they have floppy ears, four legs (most of the time), and they live in Africa or Asia (but not always).

Sure to confuse any alien!

ANNs approach the problem using examples. They’d show picture after picture of elephants in different scenarios until the concept is crystal clear: elephants eating food, elephants sleeping, old elephants and baby elephants, elephants at zoos, cartoon elephants, elephants with riders. The bigger and more varied this “training set” of pictures, the more likely that the alien can recognize an elephant under any conditions. Fun fact: this is similar to the way humans learn!

What makes ANNs powerful is that they work well with “unstructured” data like pictures, videos, or text, and they’re often faster and more accurate than algorithms that tackle the difficult math head-on. Neural networks can also cope with inputs they’ve never seen before. If the phrase “rocket spaceship cookies” doesn’t appear in the training data, but pops up in a real-life scenario, our program won’t fall to pieces!
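To see why unseen inputs matter, consider a plain table of counts: it simply has no entry for a phrase it never read. The Python sketch below is not a neural network. It swaps in a much simpler trick, usually called backoff, which drops the oldest word of the phrase until something familiar turns up. The training sentence and the function names are invented for illustration.

```python
from collections import Counter, defaultdict


def count_followers(text, max_context=3):
    """Count next words for every context of 0 to `max_context` words."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for size in range(max_context + 1):
        for i in range(len(words) - size):
            followers[tuple(words[i:i + size])][words[i + size]] += 1
    return followers


def suggest_with_backoff(followers, context):
    """Shorten the context from the left until the model recognizes it."""
    context = tuple(word.lower() for word in context)
    while context and context not in followers:
        context = context[1:]
    if context not in followers:
        return None  # only happens if the training text was empty
    return followers[context].most_common(1)[0][0]


# Made-up training text, purely for illustration.
model = count_followers(
    "fresh cookies are delicious and warm cookies are tasty "
    "but burnt cookies taste awful"
)

seen_before = ("rocket", "spaceship", "cookies") in model
suggestion = suggest_with_backoff(model, ("rocket", "spaceship", "cookies"))

print(seen_before)  # False
print(suggestion)   # are
```

The full phrase is unknown, but the last word, “cookies”, is not, so the program still has something sensible to offer. A neural network reaches a similar result in a different way, by learning patterns instead of looking up exact phrases.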

An imperfect model

After studying 10,000 sentences, the predictive keyboard knows that “several” should be followed by “puppies” and not “puppy”. But does it understand, deep down, the concept of plurals? Does it know what a puppy is? Of course not! HMMs and ANNs are simply clever tools to help computers imitate the ability to recognize objects or write sentences.

“Harry Potter and the Portrait of What Looked Like a Large Pile of Ash” wasn’t 100% computer-generated. The plot may be nonsensical, but the grammar is perfect; words were ultimately chosen by humans. Still, using an AI to guide their writing helped the creators at Botnik think outside the box. Way, way outside the box.

Current work in machine learning aims to improve text prediction by adding grammar to the model. For example, since words like “dog” and “cat” have similar roles, the sentence “dogs are cuddly” could be used to train a model to create “cats are cuddly”, as well as “dogs are cute” and “dogs were cuddly”.
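One way to picture this idea is as a swap of one word for another that plays the same role. The Python sketch below is a toy under stated assumptions: the word groups are hand-written and hypothetical, whereas a real system would have to learn them or be given a proper grammar.

```python
# Hypothetical word groups; a real system would learn these or be given them.
WORD_GROUPS = [
    {"dogs", "cats"},
    {"are", "were"},
    {"cuddly", "cute"},
]


def single_swap_variants(sentence, groups=WORD_GROUPS):
    """Return every sentence made by swapping exactly one word for a group-mate."""
    words = sentence.split()
    variants = set()
    for position, word in enumerate(words):
        for group in groups:
            if word in group:
                for replacement in group - {word}:
                    swapped = words[:position] + [replacement] + words[position + 1:]
                    variants.add(" ".join(swapped))
    return variants


variants = single_swap_variants("dogs are cuddly")
print(sorted(variants))
# ['cats are cuddly', 'dogs are cute', 'dogs were cuddly']
```

One training sentence becomes four, which gives a model more to learn from without anyone writing more text.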

One of the biggest challenges in Natural Language Processing (NLP), the branch of AI that tackles the transformation of human language into computer language, is the fact that grammar doesn’t follow clear, mathematical rules. Even the smartest algorithms make rookie mistakes that a four-year-old could fix.

How many years until robot writers can pass for humans? Some scientists estimate that AIs will produce best-selling novels as early as 2040. Until then, we can enjoy their less refined work: complete with Ron turning into spiders and Hermione being dipped in hot sauce.

Learn More

Botnik Studios

Article about Predictive Keyboards on Phones

TEDx Talk about combining art and neural networks

Neural Networks in depth

When will robots write like humans?

XKCD comics about predictive keyboards