On Sorting

Dennis Jarvis on Flickr

Would you leave the books on the shelf and just move them one a time? Would you first put them in stacks of A’s, B’s, etc. and then sort the stacks? Maybe you’d start by picking a book and first dividing the shelf into all the books that would come before it and all the books that would come after it.

Alphabetizing is an example of a type of sorting. Sorting is when you take a number of things and put them in order. You could be sorting by date and time, by name, by author name, by size, or any notion of ordering that says “this should come before that”.

Sorting is a deeply explored topic in computer science. Computer scientists care about sorting for the same reason a bookstore cares about alphabetizing its shelves: it makes finding things much easier.

One of the simplest and most intuitive ways of sorting is called the bubble sort. You may have discovered this just by trying to put things in order!

It works like this: start from the left hand side of the list of items, until you reach the end for each element, compare it to the element to its right. If the element on the left is bigger, swap them. If the element to the right is bigger, leave them alone. Repeat with all the elements until the list is in order.

Here is a bubble sort shown as a Python algorithm that sorts a list of numbers:

def bubble(lst,verbose=False):
    n = len(lst) - 1
    isSorted = False
    while not isSorted:
	isSorted = True
	for i in range(0,n):
	    if lst[i] > lst[i+1]:
		if verbose:
		    print(“Swapping: “ + str(lst[i]) + “ and “ + str(lst[i+1]))
		tmp = lst[i+1]
		lst[i+1] = lst[i]
		lst[i] = tmp
		isSorted = False
    return lst

Here’s an example of how a bubble sort works. Start with the list [5,2,3,9,9,4].

  1. We start on the left and see that 5 is greater than 2, so we swap them and the list becomes [2,5,3,9,9,4].
  2. In the next round we can see that 5 is greater than 3, so we swap again and get [2,3,5,9,9,4].
  3. The third time through, 5 isn’t greater than 9, so the list stays the same [2,3,5,9,9,4].
  4. Next pass through: since 9 isn’t greater than 9, the list stays the same again [2,3,5,9,9,4].
  5. After that since 9 is greater than 4 we do another swap and reach the end of the list [2,3,5,9,4,9].

We’re not sorted yet, because of that 4, so we start the process over again. The next few steps leave the list unchanged until we compare 9 and 4 where we swap again

[2,3,5,4,9,9].

Since 9 isn’t greater than 9 we again reach the end of the list and start back at the beginning. In the next round nothing changes until we swap the 5 and 4, and nothing changes again after that, leaving us with [2,3,4,5,9,9].

Once we reach the end of the list in this third round of swapping, we can make one last pass through and discover that it’s in order!

You can try running the Python code above and it’ll give you the same result. If you pass in True as a second argument to the code you’ll see what swaps are being made in the sort.

This isn’t a very good sorting algorithm, though, because sometimes you have to make many passes through the list to move things into the right position.

If you had a 0 at the end of the list instead of a 4 in our example you would have to have run through the entire list two more times to get the single 0 in the right spot. This means you’d have to make about 30 comparisons of elements just to sort a list of six elements. If you had a list of 1000 numbers and the smallest number was to the far right of the list, you’d end up having to do almost a million comparisons to sort the list. For a list 100,000 numbers long, you could end up doing about 10 trillion comparisons to sort the list. Even on a fast computer that means you’d be waiting at least 15 minutes just for a list to be sorted. Imagine trying to sort a “bookshelf” as big as Amazon’s database of books, which could take at least 10000 times longer still. That’s about 100 days in the worst case!

Thankfully, computer scientists have come up with many other ways of sorting: quick sort, merge sort, bucket or radix sort. These methods divide the problem of sorting into smaller pieces than just “let’s compare every number”.

A bucket or radix sort splits the list up by something less precise first, like the starting letter for alphabetizing books, or by the digit in the 100s place if all your numbers are less than 1000. Once you’ve created these coarse-grained buckets, you can use whatever kind of sorting you want inside the small buckets. Even a bubble sort!

In a merge sort you split the list into very tiny pieces and then combine them, or merge them, so that they’re in order.
Finally, in quick sort you pick an item, then split the list into two parts: the things bigger than the item you picked and the things smaller. You then sort the two halves by doing the quick sort on each of those pieces.

If you’re interested in learning more about sorting algorithms, look at the back cover of this magazine for links to code and explanations.

Learn More

Visual Algorithm Sorting

https://visualgo.net/en/sorting

Algorithm Visualizations: Comparison Sorting Algorithms

https://www.cs.usfca.edu/~galles/visualization/ComparisonSort.html

RosettaCode Sorting Algorithms

https://rosettacode.org/wiki/Category:Sorting_Algorithms

Geeks for Geeks: Sorting Algorithms

http://www.geeksforgeeks.org/sorting-algorithms/

Sorting Out the Basics Behind Sorting Algorithms

https://medium.com/basecs/sorting-out-the-basics-behind-sorting-algorithms-b0a032873add