ASCII

If you write an email on your computer screen and you want to send it to someone else to view on their computer screen, how do your computers know how to display every character, word, sentence, and paragraph in the email?

Your computers need a map to identify all possible letters, numbers, and other characters. The capital letter M, for example, would have a unique identifier different from a unique identifier for lower case m. This makes it easy to send an email to a friend and know they’ll be able to read it on their computer.

In its simplest form, ASCII is a map computers use to identify letters, numbers, and other characters.
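To see this map in action, here is a minimal sketch in Python, a language chosen here only for illustration. It relies on two built-in functions: ord, which looks up the number assigned to a character, and chr, which goes the other way. The variable names are made up for this example.

```python
# ord() returns the number assigned to a character.
upper_m = ord("M")
lower_m = ord("m")

print(upper_m)             # 77
print(lower_m)             # 109
print(upper_m != lower_m)  # True: capital M and lower case m have different identifiers

# chr() turns a number back into its character.
print(chr(77))             # M
```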

Of course, the first maps computers used had to evolve to identify characters from many other languages. The word ASCII is also used loosely to mean plain text, as in the files text editors create, to distinguish it from binary files, which are a different topic.

How does your computer tell the difference between the letter M you type in a word processing program and a lower case letter m? Or, for that matter, the number 5 from the number 2?

Solving this problem requires that a standard set of numbers be assigned to letters, numbers, punctuation, and other common elements like space, delete, and null. The basic standard is called ASCII, short for American Standard Code for Information Interchange, or US-ASCII because it maps codes to US English language letters. ASCII is pronounced “ask-ee.” To make things more fun, the ASCII standard was created by the American Standards Association (ASA), the organization that later became ANSI, short for American National Standards Institute.

Also fun and interesting, the ASCII table still includes characters we find odd today, for example, BEL for bell, as in ring a bell, and DC4 for Device Control 4. These control characters were used often with early computers and teleprinters but are rarely used in modern computing devices.
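A short Python sketch can show where a few of these elements sit in the ASCII table. The dictionary name and its layout are hypothetical, picked only for this illustration; the code numbers themselves come from the ASCII table.

```python
# A few ASCII control characters and their code numbers.
CONTROL_CODES = {
    "NUL": 0,    # null
    "BEL": 7,    # bell
    "DC4": 20,   # device control 4
    "DEL": 127,  # delete
}

for name, code in CONTROL_CODES.items():
    # repr (the !r below) prints an escape sequence instead of an invisible character.
    print(f"{name}: decimal {code}, binary {code:07b}, Python literal {chr(code)!r}")

# The space is an ordinary printable character with its own code number.
SPACE = ord(" ")
print(SPACE)  # 32
```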

In the 1940s and 1950s, computers expanded from calculations to storing data. To make it easier for computers to store data, letters, numbers, and other characters were assigned two-digit numbers. This allowed computer data to be stored in multiple computers. These two-digit numbers were later translated into 7-digit (or 7-bit) combinations of 1 and 0, the two binary digits computers treat as on and off. A 7-bit binary number has 128 possible values, more than the 100 possible with two decimal digits, which allowed a bigger set of characters for computers to use to store data.

While 7 digits (or bits) were defined in the original ASCII tables and standards, ASCII characters were commonly stored and sent as 8 digits, for example, 01000000. The extra eighth digit, called a parity bit, was often used to detect transmission errors: it was set to 0 or 1 so that the count of 1s in the group was always even (or always odd), and a group with the wrong count signaled an error. Eight digits also split evenly into four two-digit pairs.
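One common way the eighth digit detected errors was as a parity bit. The sketch below is an illustration only: it assumes even parity (the total count of 1s must be even) and places the parity digit on the left, and the function names are hypothetical. Real equipment varied in both choices.

```python
def add_even_parity(ch: str) -> str:
    """Return 8 digits: a parity digit followed by the 7-digit ASCII code."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not a 7-bit ASCII character")
    ones = bin(code).count("1")
    parity = ones % 2  # 1 only when the 7 digits hold an odd number of 1s
    return f"{parity}{code:07b}"


def parity_ok(bits: str) -> bool:
    """Check that a received group of 8 digits has an even number of 1s."""
    return bits.count("1") % 2 == 0


sent = add_even_parity("M")
print(sent)                   # 01001101
print(parity_ok(sent))        # True

# Flip one digit to imitate a transmission error.
damaged = "01001111"
print(parity_ok(damaged))     # False
```

A single flipped digit changes the count of 1s from even to odd, so the receiver knows something went wrong, although it cannot tell which digit changed.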

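If you don’t know, binary numbers are a way to write down and track numbers. Most modern civilizations use the decimal system, where numbers are written with ten digits, 0 through 9, and each position is worth ten times the position to its right. A binary number is simply a number written with two digits, 0 and 1, where each position is worth two times the position to its right.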

Binary numbers in groups of eight, and more recently sixteen or more, are still written with only those two digits. It helps to separate the number 2 from the character 2. As a plain number, decimal 2 is written 10 in binary, or 00000010 when padded to eight digits. As a character you type or read, 2 is assigned the ASCII code 50 in decimal, which is 00110010 in binary. Because computers work in binary, they store 00110010 to represent the character 2.
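The difference between the number 2 and the character 2 is easy to mix up, so here is a small Python sketch that works through the arithmetic. The variable names are only for illustration.

```python
number_two = 2        # a quantity you can do math with
character_two = "2"   # a symbol you type or read

# The number 2 written in binary, padded to eight digits.
print(format(number_two, "08b"))      # 00000010

# The character "2" has its own ASCII code, which is a different number.
code = ord(character_two)
print(code)                           # 50
print(format(code, "08b"))            # 00110010

# Reading the binary digits back gives the code, then the character.
print(int("00110010", 2))             # 50
print(chr(int("00110010", 2)))        # 2

# The digit characters sit in order in the table, so subtracting
# the code for "0" recovers the numeric value.
print(ord(character_two) - ord("0"))  # 2
```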

The original 7-bit ASCII tables also handled only 128 possible letters, numbers, and other characters. When you count up A-Z, a-z, 0-9, and common punctuation, the total is 94 of the 128 possible characters. The space makes 95, and the remaining 33 are control characters.
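The count can be checked with a few lines of Python. This sketch leans on the standard string module, whose constants list the ASCII letters, digits, and punctuation, and on the assumption that Python's isprintable method is a fair stand-in for "not a control character" within the first 128 codes.

```python
import string

letters_and_digits = string.ascii_uppercase + string.ascii_lowercase + string.digits
punctuation = string.punctuation

visible = len(letters_and_digits) + len(punctuation)

# Walk all 128 codes and split them into printable and control characters.
printable = [chr(i) for i in range(128) if chr(i).isprintable()]
control = [i for i in range(128) if not chr(i).isprintable()]

print(len(letters_and_digits))  # 62
print(len(punctuation))         # 32
print(visible)                  # 94
print(len(printable))           # 95, the extra one is the space
print(len(control))             # 33
```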

The first 7-bit ASCII table of binary numbers was published in 1963. Earlier character codes existed in the 1950s to ensure text could be shared and printed by different computers and printers. However, ASCII reached the form used today only later in the 1960s, after a major revision in 1967 that added lower case letters.

Over time, the number of characters and languages that needed to be encoded as bits expanded beyond the 128 possible characters in the original ASCII standard. Extended character sets built on ASCII used the eighth digit to represent data rather than to detect transmission errors, which doubled the number of possible characters to 256. Even that was not enough. In the 1990s, for example, computers handled data in Chinese and Hindi in addition to English and European languages.
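As a rough illustration of the eighth digit carrying data, the Python sketch below decodes a single byte whose leftmost digit is 1. It uses Latin-1 (ISO 8859-1) as one example of an 8-bit character set; that choice is an assumption made for this example and is not named in the text above.

```python
# One byte with the eighth (leftmost) digit set to 1. Its value is 233.
raw = bytes([0b11101001])

# Strict 7-bit ASCII has no character for values above 127.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# An 8-bit character set such as Latin-1 assigns this value a letter.
latin1_char = raw.decode("latin-1")

print(ascii_ok)     # False
print(latin1_char)  # é
```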

Eventually character encoding for computers expanded to 16 digits or bits of binary 1 or 0 numbers in a standard known as Unicode. Sixteen bits allow for tens of thousands of possible characters from many different languages, and Unicode has since grown beyond 16 bits to make room for more than a million. ASCII was the most common character encoding on the web until December 2007, when it was overtaken by UTF-8, a Unicode encoding that includes ASCII. UTF is short for Unicode Transformation Format. The first 128 characters defined in Unicode are the original ASCII character encoding assignments.
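The way UTF-8 includes ASCII can be checked with a short Python sketch. The two non-ASCII characters below are arbitrary picks for illustration, written as escape codes so the example does not depend on how this page itself is encoded.

```python
# Every one of the 128 ASCII byte values reads the same under UTF-8.
ascii_bytes = bytes(range(128))
same_text = ascii_bytes.decode("ascii") == ascii_bytes.decode("utf-8")
print(same_text)  # True

# An ASCII character is stored as one byte, exactly as before.
m_bytes = "M".encode("utf-8")
print(m_bytes)       # b'M'
print(len(m_bytes))  # 1

# Characters outside ASCII take more than one byte in UTF-8.
e_acute = "\u00e9"  # the letter e with an acute accent
zhong = "\u4e2d"    # a Chinese character meaning "middle"
print(len(e_acute.encode("utf-8")))  # 2
print(len(zhong.encode("utf-8")))    # 3
```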

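And, if you’re wondering, the Unicode standard is defined by the Unicode Consortium, which works with the ISO, or International Organization for Standardization, to keep Unicode in step with the matching international standard, ISO/IEC 10646. The ISO is a group that works with industry and others to define common standards that allow computers and other devices to work efficiently.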

While the original 128 ASCII characters have been expanded to many thousands, they are still a critical part of how computers handle data transmitted and stored in web pages, email, and software applications. For example, your computer stores upper case M as the binary number 01001101. You see the letter M in your word processing software. Your computer sees 01001101. And when you email your file to someone, their computer sees 01001101 while their screen displays M.
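The round trip from letter to binary digits and back can be sketched in Python. The function names to_bits and from_bits are hypothetical, written for this example, and the sketch assumes the text contains only ASCII characters.

```python
def to_bits(text: str) -> str:
    """Turn ASCII text into groups of eight binary digits, one group per character."""
    return " ".join(f"{byte:08b}" for byte in text.encode("ascii"))


def from_bits(bits: str) -> str:
    """Turn space-separated groups of eight binary digits back into text."""
    return bytes(int(group, 2) for group in bits.split()).decode("ascii")


# What the sending computer stores for the letter M.
stored = to_bits("M")
print(stored)             # 01001101

# What the receiving computer displays after reading the same digits.
print(from_bits(stored))  # M

# The same idea works for whole words.
print(to_bits("Hi"))      # 01001000 01101001
```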

Learn More

ASCII

https://en.wikipedia.org/wiki/ASCII
http://www.neurophys.wisc.edu/comp/docs/ascii/

ANSI

http://www.ansi.org/

ISO

http://www.iso.org

Unicode

http://www.unicode.org/
http://www.unicode.org/charts/

Character Encoding

https://en.wikipedia.org/wiki/Character_encoding

UTF-8

https://en.wikipedia.org/wiki/UTF-8

Control Character

https://en.wikipedia.org/wiki/Control_character

Binary Code

https://en.wikipedia.org/wiki/Binary_code

Numeral Notation Systems

https://en.wikipedia.org/wiki/Numeral_system
https://en.wikipedia.org/wiki/Octal
https://en.wikipedia.org/wiki/Decimal
https://en.wikipedia.org/wiki/Hexadecimal
https://en.wikipedia.org/wiki/Binary_number