A one-page proof of Chaitin's theorem

G. J. Chaitin wrote a book on the proof of a very elegant theorem. Here's the proof in one page. (To be fair, Chaitin's book covers a lot more than just this theorem, but he does spend all of Chapter 5 on his proof.)

Definition: An elegant program is the shortest program that produces a given output. In other words, a program P is elegant if no program shorter than P produces the same output as P.

NOTE: "Program" here means code plus input data. Every elegant program produces a single, unique (but possibly infinite) output. Some obvious but important consequences of this definition are 1) every output from an elegant program is computable and 2) there are an infinite number of elegant programs, one for each possible computable output.

Theorem (Chaitin): It is not possible in general to determine whether or not a given program is elegant. (In particular, it is not possible to determine whether or not a program P is elegant if P is larger than a certain threshold size.)

Proof by reductio:

Assume that there exists en elegance-testing program ET. ET accepts as input a program P and outputs true if P is elegant and false if P is not elegant.

Construct a program B that takes as input a number N and enumerates all possible programs Pk longer than N. B runs the elegance tester ET on each enumerated program Pk in turn until it finds some Pk which ET claims is elegant. B then runs that Pk, thus producing the same output as that Pk.

Lemma: B must produce some output.

Proof: There are an infinite number of elegant programs, as noted earlier. So if ET works as assumed, B must eventually find one of those elegant programs whereupon it will produce that program's output.

Now run B with N set to the length of B plus 1 (See note 1). (This is the "threshold size" mentioned in the theorem.) B now will produce the same output as some program Pk which ET claimed was elegant. But Pk is longer than B, so Pk cannot be elegant because B, which is shorter, produced the same output. Therefore, ET was wrong when it claimed Pk was elegant. QED.

NOTES

1. Strictly speaking we have to set N to the length of B plus the length of the bitwise representation of N, i.e. N=length(B) + ceiling(log2(N)) + 1. Showing that this is possible regardless of how long B turns out to be is left as an excercise for the reader.

2. J. P. Lewis wrote an interesting paper on some practical consequences of this result.

3. If you really want to blow your mind, read this, particularly the last paragraph.

While we're at it, here's a half-page proof of the non-computability of the halting problem:

Assume that there exists a program H that solves the halting problem, that is, H takes two input parameters, a program P and an input I and returns true if P halts when run on input I, false otherwise.

Construct a program B that does the following: B accepts as input a program P and computes H(P,P), that is, it calls H as a subroutine to determine if program P halts when run on a copy of itself as the input. If the result from H is true then B enters an infinite loop, otherwise B halts (that is, B does the opposite of what P would do when run on a copy of itself).

Now consider what happens if we run B on a copy of itself. B will halt iff H(B,B) is false. But H(B,B) is false iff B doesn't halt, which is a contradiction. Therefore, our assumption is false, and H is impossible. QED.

Turing's original proof of this theorem was remarkable (and complicated) because in his day there were no computers. Not only did Turing have to more or less invent the idea of a program, but he had to conceive of the rather mind-bending idea of running a program with a copy of itself as input. Nowadays this is commonplace. The unix command "cat `which cat`" is one of the simplest examples of a program operating on itself as input. A compiler that compiles itself is another.