How to Build a Programming Language
Image by Quinn Dombrowski on Flickr
Software languages don't magically appear. They're created by design. First in a series.
Computers are amazing pieces of engineering, but they’re useless if we can’t feasibly instruct them on what to do. It’s no surprise, then, that for almost as long as we’ve had computers, we’ve had special languages to write in that make programming possible.
There are dozens of popular programming languages today, and countless more that people have created for fun, for research, or as experiments. If you’re interested in how to create your own programming language, this series of articles will help you gather ideas and get started. Designing and implementing languages isn’t as hard as you might think, but it is a big topic with a number of steps to work through.
In this first article, we’ll talk about what goes into making a language and get you started thinking about what your language will look like.
Design versus Implementation
First, I want to point out that there’s a difference between designing and implementing a programming language. An implementation of a programming language is a program that turns the text of programs into actions on the computer. There are two kinds of implementations: interpreters and compilers.
Both compilers and interpreters start the same way. They read the program that you want to run from a file and then turn it into a representation of the program as data.
What happens from there is different, though. An interpreter takes this representation of the program and executes it immediately. A compiler, on the other hand, reads the code and transforms it into another language. Often, a compiler turns a program into low-level instructions for the processor itself: either the actual machine code or the human-readable step just above that, called assembly.
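To make the difference concrete, here is a minimal sketch rather than a real compiler. It assumes a made-up tuple format for a tiny arithmetic program and a made-up stack-machine instruction set (PUSH, ADD, MUL). The names interpret and compile_to_stack are invented for this illustration.

```python
# A tiny program as data: (2 + 3) * 4, in a hypothetical tuple format.
program = ("mul", ("add", ("num", 2), ("num", 3)), ("num", 4))


def interpret(node):
    """Interpreter: walk the data and compute the result right away."""
    kind = node[0]
    if kind == "num":
        return node[1]
    if kind not in ("add", "mul"):
        raise ValueError(f"unknown node kind: {kind}")
    left, right = interpret(node[1]), interpret(node[2])
    return left + right if kind == "add" else left * right


def compile_to_stack(node):
    """Compiler: translate the data into code for an imaginary stack machine."""
    kind = node[0]
    if kind == "num":
        return [f"PUSH {node[1]}"]
    if kind not in ("add", "mul"):
        raise ValueError(f"unknown node kind: {kind}")
    # Emit code for both operands, then the instruction that combines them.
    return compile_to_stack(node[1]) + compile_to_stack(node[2]) + [kind.upper()]


print(interpret(program))         # 20
print(compile_to_stack(program))  # ['PUSH 2', 'PUSH 3', 'ADD', 'PUSH 4', 'MUL']
```

Both functions start from the same data. The interpreter hands back an answer, while the compiler hands back another program that still has to be run by something else.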
We need to define a few terms now. The act of reading the program from the file is called parsing and the bit of code that does this is called the parser. The representation of code as data is the abstract syntax tree (AST).
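If you want to see a real parser and a real AST, Python ships with both in its standard ast module. This short sketch parses one line of code and pokes at the resulting tree. The one-line program and the variable names are just examples I picked.

```python
import ast

source = "total = 1 + 2"

# ast.parse is Python's own parser: program text in, tree of node objects out.
tree = ast.parse(source)

statement = tree.body[0]                     # the first (and only) statement
print(type(statement).__name__)              # Assign
print(statement.targets[0].id)               # total
print(type(statement.value).__name__)        # BinOp
print(type(statement.value.op).__name__)     # Add

# ast.dump shows the whole tree as text, which is handy for exploring.
print(ast.dump(statement))
```

Notice that nothing has been run yet. The tree only describes the program: an assignment whose right-hand side is an addition.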
For example, imagine we’re writing an interpreter for Python, in Python. That might sound silly, but it happens more often than you’d think. Our program might look like
for i in range(0,10): print(i)
but from the perspective of the interpreter it might represent this program as something like
forLoop = For(Var("i"), Function("range",Int(0),Int(10)), BuiltIn("print",Var("i")))
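where For, Var, Function, Int, and BuiltIn are classes that the interpreter writer had to create. You can picture the structure of the program as a tree: the For node sits at the top, and the loop variable, the call to range, and the call to print branch off below it.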
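Here is one way those five classes could be written. This is only a sketch: the class names come from the example above, but the attribute names (value, name, args, var, iterable, body) are my own guesses at a reasonable layout.

```python
class Int:
    """A whole-number literal such as 0 or 10."""
    def __init__(self, value):
        self.value = value


class Var:
    """A variable, identified by its name."""
    def __init__(self, name):
        self.name = name


class Function:
    """A call to a named function with any number of arguments."""
    def __init__(self, name, *args):
        self.name = name
        self.args = args


class BuiltIn:
    """A call to an operation the interpreter provides itself, like print."""
    def __init__(self, name, *args):
        self.name = name
        self.args = args


class For:
    """A loop: a variable, the thing to loop over, and the body to repeat."""
    def __init__(self, var, iterable, body):
        self.var = var
        self.iterable = iterable
        self.body = body


forLoop = For(Var("i"), Function("range", Int(0), Int(10)), BuiltIn("print", Var("i")))

print(forLoop.var.name)                               # i
print(forLoop.iterable.name)                          # range
print([arg.value for arg in forLoop.iterable.args])   # [0, 10]
print(forLoop.body.name)                              # print
```

Each class does nothing except hold its pieces. That is all an AST needs to do: the behavior lives in the code that walks the tree.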

Then, once the interpreter has created this AST, it will execute the code. This means taking apart the objects that represent the program and executing corresponding actions. A for-loop gets turned into something that runs repeatedly. Variables become set-aside data that can be retrieved later. Built-in operations like print write output to the console. Now, in this case that’ll be a little trivial because it’s easy to translate Python concepts into Python code. But if you were writing a Python interpreter in something like Haskell, a Python loop would be executed as a Haskell function.
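A minimal sketch of that execution step might look like the code below. It assumes the five class names from the example, with attribute names I made up, and it handles only what this one program needs: range, print, and a for-loop. The evaluate and execute functions, the env dictionary for variables, and the output list are all illustrative choices, not the only way to do it.

```python
class Int:
    def __init__(self, value):
        self.value = value

class Var:
    def __init__(self, name):
        self.name = name

class Function:
    def __init__(self, name, *args):
        self.name, self.args = name, args

class BuiltIn:
    def __init__(self, name, *args):
        self.name, self.args = name, args

class For:
    def __init__(self, var, iterable, body):
        self.var, self.iterable, self.body = var, iterable, body


def evaluate(node, env):
    """Turn an expression node into a Python value."""
    if isinstance(node, Int):
        return node.value
    if isinstance(node, Var):
        return env[node.name]              # look up the set-aside data
    if isinstance(node, Function) and node.name == "range":
        return range(*(evaluate(arg, env) for arg in node.args))
    raise NotImplementedError(type(node).__name__)


def execute(node, env, output):
    """Carry out the action a statement node stands for."""
    if isinstance(node, BuiltIn) and node.name == "print":
        line = " ".join(str(evaluate(arg, env)) for arg in node.args)
        output.append(line)                # remember what was written
        print(line)                        # and write it to the console
    elif isinstance(node, For):
        for value in evaluate(node.iterable, env):
            env[node.var.name] = value     # set the loop variable
            execute(node.body, env, output)
    else:
        raise NotImplementedError(type(node).__name__)


forLoop = For(Var("i"), Function("range", Int(0), Int(10)), BuiltIn("print", Var("i")))
env, output = {}, []
execute(forLoop, env, output)              # prints 0 through 9, one per line
```

The Python for statement inside execute is the "something that runs repeatedly" and the env dictionary is where variables live. A fuller interpreter would add more node types and more branches in the same style.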
Now, to actually make a language I recommend starting with an interpreter. It’s generally less complicated than a compiler, because in an interpreter you only need to execute the code instead of figuring out how to translate it into code in a different language that still does what you want.
Here are the basic steps, in the order I recommend, for writing a programming language:
- Designing your language
- Creating the AST for the language
- Writing the code to execute the AST
- Choosing what the language should look like
- Writing the parser
We’ll preview two of these in this article, designing your language and choosing what it should look like, and the rest will be in future installments.
Designing your language
Actually coming up with a language is the place to let your creativity shine. Start thinking about what would be cool or weird or interesting to see in a programming language.
You’ll need to think, though, about some basic questions for how the language will work.
- How do you iterate in your language? That is, how do you execute the same steps multiple times?
- How do you make choices in your language? You’ll need some way of making decisions about when something should happen. Most languages do this with some form of if-statement.
- What kind of data will you have: numbers, strings, lists, etc.?
- How will functions work?
- Do you want to be able to create concurrent threads?
- Are there any languages you’re inspired by?
- Are there any languages that you almost love but wish you could fix?
There are even more design decisions to think about, but those are good starters.
Choosing what your language should look like
You may have already had a picture in your head of how your language will look. The actual way the language looks when written out is called its syntax. I recommend writing code by hand to figure out what syntax you’d like to use. If you don’t like the way your language looks, it’ll be a lot harder to actually use it. I recommend picking the syntax of a language you like and building off of that. After all, so many languages stole the syntax of C and tweaked it a little bit, so you can borrow from languages you like too.
It’s a good idea to “fake” some programs in your new language with pen & paper and then make notes about what they should do.
If you spend some time thinking about these two topics and planning things out, you’ll be ready to start writing code next time. Until then, check the further reading below for some other tutorials on writing interpreters.
Learn More
Tutorial on Creating a Simple Interpreter in Python (old)
http://www.norvig.com/lispy.html
More gritty details, less about design
https://ruslanspivak.com/lsbasi-part1/
Haskell and writing a simple interpreter
https://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours
How To Create a Programming Language
https://www.kidscodecs.com/build-a-programming-language-1/
https://www.kidscodecs.com/designing-programming-language-part-ii/