A kind reader pointed out that I frequently mention algorithms, but that I haven’t defined them in the basics posts. To me, they’re so fundamental to the stuff I do for a living that I completely forgot that they’re an interesting idea.

It’s really pretty simple: an algorithm is description of a mechanical procedure for performing a mathematical task. But that’s actually a fairly sophisticated idea – the general concept of *things* that describe a procedure, and which can be discussed, described, and reasoned about as entities in their own right is something quite different from nearly anything that came before it.

The term algorithm has an interesting and humorous origin. Most people believe that it’s named after a Persian mathematician named “Al-Khwarizmi”. That’s actually not quite true. While the term *is* derived from Al-Khwarizmi’s name, it’s actually an accident. In fact, what happened was that Al-Khwarizmi wrote a book about how to perform calculations using the Hindi numbering system (which is what we know as arabic numerals). Several hundred years later, the text was translated into Latin, and in the process of transliterating the name “Al-Khwarizmi” into Latin, the title of the book became “Algorithmi on Indian Numbers”. “Algorithmi” *looks* like a latin plural noun – and so many European mathematicians interpreted the title of the book not as “Al-Khwarizmi talking about mathematical procedures using Indian numbers”, but “Mathematical Procedures using Indian Numbers”. And the term stuck.

So – an algorithm is a description mathematical procedure – and it’s possible to reason about an algorithm as an entity itself. Let’s look at a simple example, just to get the sense of it.

Suppose I have a list of N numbers in random order. I want to get the list into

increasing order – smallest first, largest last, ever number smaller than the number that comes after it.

One of the simplest algorithms for this is called *bubble sort*:

Given: a list L_{1}...L_{N}of N numbers in random order.Produce: a list of those N numbers in increasing orderDo:repeat the following until the list no longer changes: look at each pair of adjacent numbers L_{i}and L_{i+1}: if L_{i}> L_{i+1}then swap L_{i}and L_{i+1}

This *will* produce a sorted list. Let’s run through a quick example: the list [3,5,2,1,4].

- First pass through the list:
- Compare 3 and 5. 3<5 so move on to the next pair.
- Compare 5 and 2. 5>2, so swap. List becomes [3,2,5,1,4]
- Compare 5 and 1. 5>1, so swap. List becomes [3,2,1,5,4]
- Compare 5 and 4. 5>4, so swap. List becomes [3,2,1,4,5]

- The list changed, so we take another pass. Second pass:
- Compare 3 and 2. 3>2, so swap. List becomes [2,3,1,4,5]
- Compare 3 and 1.3>1, so swap. List becomes [2,1,3,4,5]
- Compare 3 and 4. 3<4, so move on.
- Compare 4 and 5. 4<5, so move on.

- The list changed, so we do another pass. Third pass:
- Compare 2 and 1. 2>1, so swap. List is now: [1,2,3,4,5].
- Next four comparisons are (2,3), (3,4), (4,5), none of which cause swaps.

- The list changed, so another pass. Fourth pass, everything is in order – each pair compares without causing changes. So the result is [1,2,3,4,5].

What can we say about this algorithm? Well, for one thing, it’s slow. Given a list of N elements, we might need to do up to N passes over the list, each of which will do N-1 comparisons, each of which might cause a swap. So in the worst case, this will require N^{2}-N comparisons and swaps to get the list into order. Our example wasn’t even a worse case, and it ended up needed to do 16 comparisons (4 passes, 4 comparisons per pass), and 6 swaps. In the best case – the list is already in order – this algorithm will take one pass – N-1 comparisons, 0 swaps.

There are even some tricks using Kolmogorov-Chaitin information theory where we can show that on average, Bubble-sort will tend more towards the worse end of its possible performance than the best – but that’s beyond the scope of a basics post, except to point out that we can apply interesting mathematical analyses to *the algorithm itself* to figure out what it’s average performance should be, even without running it.

WalkerNow that you have done a post on both algorithms and Turing Machines, you should do a post on Church’s Thesis.

FoxyWhat’s your favourite algorithm?

ZenithI enjoy your method of teaching Mark, I’ve been reading your blog for 3 months now (or so), always a nice read, especially the plonky ID drama! 🙂

This just adds to a growing list of re-readable material! 🙂

John ArmstrongIf you do the Church-Turing thesis next you might also tack on this “Elementary Proof that P =/= NP”.

PseudonymSince the question was asked, my favourite algorithm is the Hopcroft state minimisation algorithm. It’s one of the simplest algorithms which has the following properties:

1. What it does is easy to understand.

2. Writing a simple algorithm to solve the problem is easy, and it’s easy to understand.

3. Taking one of the simplest algorithms, and making a very simple (a one line change in a typical programming language) modification to it, improves the running time from O(n^2) to O(n log n).

4. This modification is precisely the opposite of what you’d expect.

dileffanteGood work Mark, as usual. And this short post of J. Shallit over at Recursivity makes a nice complement, I think.

Torbjörn LarssonWhile I’m sure yesterdays human calculators were interested in how much time (how many operation) they used, I don’t think they were much concerned with the number of papers. I have a feeling that computer science may have contributed with the complementary algorithm performance measure of needed memory space.

joltvoltaAs someone who has a little programing experience (java), and understands the programing perspective of an algorithm, and the silliness that is a bubble sort, but has very basic math skills. Can you explain the difference between an algorithm used in programing, and an algorithm used in mathematics. Please excuse me if this seems odd, but my math skills suck. =( I’m working on it…

— jolt

Torbjörn LarssonThe synchronicity of the web – speaking of performance measures, Jeffrey Shallit has a post where he relates his article to Spiked magazine on the “greatest innovation in my field”:

( http://recursed.blogspot.com/2006/12/greatest-innovation-of-theoretical.html )

Hmm. Randomness – is that a scarce resource due to the use of pseudorandom algorithms and/or finite number size? And if so, wouldn’t that fall back to the constraints in time and space?

(Btw, every computer could easily have a circuit who gives good random numbers from a noise or quantum process – and it seems such circuits are WIP now.)

BenIs there any chance you might write about some other sorting algorithms?

I find them to be fascinating. Anyone who is interested in such things will probably like this site:

http://www.cs.ubc.ca/~harrison/Java/sorting-demo.html

Tyler DiPietroHmm. Randomness – is that a scarce resource due to the use of pseudorandom algorithms and/or finite number size? And if so, wouldn’t that fall back to the constraints in time and space?I’m not quite what Shallit meant by that statement, but there are two distinct ways in which algorithms use randomness:

1.) Pseudoramdomness, which as you mention has some overlap with time and space. It however distinct due to it’s use, for instance, as a statistical judgement of whether the output you get is “rigged” or not in randomized algorithms.

2.) Incompressibility: Turing machines are modeled with self-delimiting versions of their individual inputs bordered by “blanks”, which goes back to the definition of “randomness” in Kolmogorov complexity, which is used to minimize redundancy.

Tyler DiPietroJesus Christ, I’m on my way to typo-city for a stay at the word-gap hotel. Sorry for the messiness of the above post. 🙁

TamIt might be helpful to discuss a simple algorithm that people are familiar with, if you’re trying to teach what an algorithm is in general. For instance, the algorithm that people use to multiply numbers of several digits (which is made possible by the use of arabic numerals), where you line up the numbers one above the other and multiply the digits, carrying the tens to the next column, etc.

I enjoyed learning sorts in computer science classes and comparing them to my own sorting algorithms. Many of them (but not heapsort!) had obvious analogues in my own behavior.

MarkkRandomness as a resource: Even on the idealized machines, the question of how many, say, uniformly random bits, you need for some algorithm is often an important question. Do I just need 4 random bits say, or do I need some number based on the size of the input. That is why it is looked at as a resource in the same way space or time are.

Daniel MartinIntel had such a device, and included it on certain consumer-level Intel motherboards produced between 1999 and 2001. You may somewhere have a seven-year-old computer with a hardware random number generator in it.

It turned out that there wasn’t really a need in most of what drives the consumer electronics market for a device – as of now, good-enough randomness is cheap even without such a hardware device.

Torbjörn LarssonThanks Tyler, markk and Daniel for setting me straight:

Ah yes, Shallit may have meant it as a quality resource.

I didn’t think of use as an internal metric, but that makes sense too.

Um, it was likely an old memory. I guess I should have checked the current situation first.

Glad to hear it. Considering the above remarks it seems I misread Shallit completely here.

Alex RIt’s easy to give examples of algorithms, as you have done. But it seems that “algorithm” is very difficult to *define* in a precise way.

Why? Let’s give the example of sorting a list of integers. You can write many computer programs that perform this task, but we don’t want to say that every different computer program implements a different algorithm. There are many programs that implement a bubble sort algorithm which differ only in trivial implementation details such as the location of temporary storage. We also can’t define it in terms of the actual function calculated: we want to be able to say that bubble sort and merge sort are different algorithms, even though they calculate the exactly the same function.

We would like our understanding of the concept of “algorithm” to define some kind of equivalence relation on computer programs, such that any two programs that implement the same algorithm are defined to be equivalent. But the correct definition of such an equivalence relation seems quite mysterious to me…

Torbjörn LarssonAlex, what you discuss seems to me like the problems of defining for example the concept of species. Depending on what you study (living populations or fossils) you need different conceptions.

Here you need different scope depending on if you study the idealized algorithm, the intended implementation or the real algorithm.