< Back

I've read The Art of Computer Programming, Chapter 1

I've decided to tackle The Art of Computer Programming (TAOCP), an absolutely massive work (in progress) by Donald Knuth currently spanning 5+ volumes. The last few months were devoted to the first chapter alone, as I slowly inched my way through its 500+ exercises. I'd like to share some of the joy I experienced in the process.

Chapter 1 begins with a heavy dose of mathematical preliminaries. Frankly I considered skipping it and moving on to the programming content, but ultimately I do not regret taking my time here. The exercises are genuinely some of the best I've seen in a math text: although they deal with basic topics like summations, logarithms and Fibonacci numbers, they reveal a whole universe of interesting facts and theory that you won't see in any other "introductory" text. They are also delightfully instructive in reinforcing or sometimes extending the techniques presented in the text. For example, 1.2.3 "Summations" introduces the basic change of summation identity

\begin{equation} \label{1} \sum_{i=1}^n\sum_{j=1}^ia_{ij}=\sum_{j=1}^n\sum_{i=j}^na_{ij} \end{equation}

which is then used to derive the identity

\begin{equation} \label{2} \sum_{i=0}^n\sum_{j=0}^ia_ia_j={1\over2}\bigl(S_1^2+S_2\bigr),\quad S_k=\sum_{i=0}^na_i^k. \end{equation}

To see why \eqref{1} is true, one can arrange the elements $a_{ij}$ in a triangle and observe that the expression on each side corresponds to summing by rows or summing by columns. Alternatively, the two summations are equal to a single summation over the set

$$\{(i,j)\mid1\le j\le i\le n\}.$$

Once the reader has internalised this fact, they can generalise it to $n$-fold summations. When $n=3$, there are $3!=6$ ways to perform a triple summation over the set

$$\{(i,j,k)\mid1\le k\le j\le i\le n\},$$

namely

$$\sum_{i=1}^n\sum_{j=1}^i\sum_{k=1}^j = \sum_{i=1}^n\sum_{k=1}^i\sum_{j=k}^i = \sum_{j=1}^n\sum_{i=j}^n\sum_{k=1}^j = \sum_{j=1}^n\sum_{k=1}^j\sum_{i=j}^n = \sum_{k=1}^n\sum_{i=k}^n\sum_{j=k}^i = \sum_{k=1}^n\sum_{j=k}^n\sum_{i=j}^n\;.$$

This was one of the ideas I used to solve 1.2.3–29(b), which asks to extend \eqref{2} by expressing

$$\sum_{i=0}^n\sum_{j=0}^i\sum_{k=0}^ia_ia_ja_k\quad\text{in terms of}\quad S_1,S_2,S_3.$$

Some of my other favourite exercises are: 1.2.4–43; 1.2.6–10; 1.2.7–15,17; 1.2.8–17,18,37; 1.2.9–6,19.

This is not to say that only the problems were interesting. There was some cool math to be enjoyed from reading the later sections on generating functions and asymptotic expansions. The final section of the mathematical preliminaries, 1.2.11.3 "Some Asymptotic Calculations", was by far the most challenging. Despite Knuth's claim that this is all "elementary calculus", you need a solid grasp of real analysis to follow the math. But for the mathematically inclined reader, it is a truly awesome demonstration of how one can derive bewildering approximations like

$$1+{n\over n+1}+{n\over n+1}{n\over n+2}+\cdots = \sqrt{\pi n\over2}+{1\over3}+{1\over12}\sqrt{\pi\over2n}+{4\over135n}+{1\over288n}\sqrt{\pi\over2n^3}+O(n^{-2}).$$

Reading this section was like watching a musician perform a technically demanding concerto.

The second half of Chapter 1 is about the fictional computer architecture MIX. I am aware that MIX is outdated in favour of MMIX, but I stuck with it anyway because I have a fascination with old computers, with their punched cards and typewriters and magnetic tapes. Sometimes I wish I were around in an earlier era to play around with these big machines, whose inner workings were more transparent than today's computers. Anyway, I digress.

After teaching the basics of MIX and the assembly language MIXAL, Knuth introduces some algorithms for multiplying and inverting permutations (stored as arrays) in 1.3.3. They are certainly interesting in their own right (Boothroyd's algorithm for inverting permutations is mind-bending), perhaps even useful (because they modify the array in-place, not allocating extra space). But Knuth's point is not to present a bunch of algorithms, but to demonstrate how an exact runtime analysis can be performed. With his signature attention to detail, Knuth meticulously counts the number of executions of each instruction, and assuming each instruction has a fixed cost (admittedly a tenuous assumption with today's computers), he derives the exact runtime of the program. I felt that working through this section sometimes required a bit of a Zen mindset: you just shut up and do the computation, and all that work will be for a single number (check out 1.3.3–10). Nevertheless, 1.3.3 also contained juicy material on the theory of permuations like the Josephus problem; the mathematically-inclined reader might also find 1.3.3–22 and 37 intriguing.

Following permutations, Knuth discusses some programming constructs. Once again his penchant for precision shows: Knuth provides exact expressions for the time and space savings of using subroutines over copy-pasting the same code in multiple places. This is mostly useless knowledge, but I very much appreciate Knuth for showing that such analysis can be done. It was also here that I learnt about coroutines for the first time. Instead of having a method call a submethod which runs some code and then returns back to that method, we could have two or more methods continuously calling each other. It's a notion that I find easier expressed in assembly than a higher-level language. (Here's one of my solutions that involves three coroutines.)

After coroutines, Knuth goes through the implementation of a simulator and tracing routine for MIX, written in MIX itself. It's not as scary as it sounds, and some of the exercises gleefully embrace its meta nature (what if we had a tracing routine trace itself?)

In order to run Knuth's MIX examples and work on the programming exercises, I wrote my own MIX "software suite" consisting of the base emulator, a MIXAL assembler and a user-friendly debugger. This itself was an instructive experience, because I've never before written an emulator, implemented a parser according to an exact specification, nor iterated on a UI with useful feedback (from myself).

I am quite proud of some of the features I've implemented, because I don't see them in other MIX distributions ☺. These include the ability to view execution counts and times (so I can confirm that DIV instruction in line 19 of Program P was executed 9538 times), as well as custom simulations of I/O delays, to add a touch of realism to 1.4.4 "Input and Output". Also I can't resist sharing this: I've added to Knuth's MIX simulator the ability to load MIX programs from punched cards (which are actually just text files with a slightly extended MIX character encoding). So if you desire, you can relive some of the joy of programming in the old days.

And that's my journey through Chapter 1! I plan to continue with Chapter 2, though I'll be busier with university work. For the interested reader, I have some final thoughts to offer:

I agree with the consensus that for most aspiring readers, there are better textbooks to choose from. This book really gets into the weeds, and following it all the way through requires a significant time commitment, without an immediate practical payoff. Yet, the intellectual stimulation provided by TAOCP is sufficient motivation for some people like me.
To grasp the material in TAOCP, it is obviously necessary to do many exercises. At the same time, working through them has made me realise how much I haven't grasped. For example, I've skipped many exercises in 1.2.6 "Binomial Coefficients", and I consider this to be a gaping hole in my understanding.

Also, Knuth's difficulty ratings for the exercises can sometimes be humbling. I've often agonized over 20+ rated exercises, only to see Knuth effortlessly present the solution in a few lines (might I add it's sometimes really non-obvious). Therefore, this book can have a curious effect: the more you work through it, the less you know.
Despite the gloomy-sounding remark above, I strongly recommend anyone brave enough to work through TAOCP to read all the solutions. I've learnt so much just by seeing how Knuth approached an exercise, even if I've solved it on my own (though I've perhaps given up on exercises more often). Sometimes he inserts additional remarks, like the motivation behind the exercise or references to the literature.

I think this can be contrasted with a certain attitude with math readers: "I have to solve every exercise on my own". While there is some truth in that you should first invest some effort into the exercise, I feel that adopting such a hardline stance can shut off valuable learning opportunities when you can't solve it.
If you know some programming, learning MIX/MMIX is not as scary as it sounds; I'd say it requires a few days. In fact, doing all the exercises in 1.3.1 and 1.3.2 will quickly acquaint you with its nuances.

Addendum: I got a 0x$1.00 check for spotting an error in 1.4.3.1 "Interpretive Routines", page 211. Knuth claims that "the coding for operation codes 5, 6 and 7 has been left as an exercise", but actually opcode 5 (SPEC) was not! I only discovered the error after attempting to implement the emulator myself: the provided code did not handle HLT, thus it was unusuable in that state.