1.2.1-13
--------------------
A3.    T <= 3(n-r-1)+2
A4/A5. T <= 3(n-r)
A6.    T <= 3(n-r)+1
The idea is that each execution of E2 causes r to decrease by at least 1 (because the previous r is the current d, and the new r is the remainder when dividing by d).

1.2.2-25
--------------------
We have to show two things:
1. The algorithm terminates.
2. It computes an approximation to log(x).
I'll first show 2 by proving that if we allow infinite precision and let the algorithm run forever, it will compute log(x) to arbitrary precision.

What the algorithm essentially does is greedily find a representation of x as an infinite product of numbers 2^k/(2^k-1) (possibly with repetition); then by taking logs and summing up the precomputed values log(2^k/(2^k-1)) we obtain log(x) exactly. The first few numbers of this form are
  2/1 4/3 8/7 16/15 32/31 64/63 128/127 256/255 512/511 1024/1023 ... -> 1.
With infinite precision, a right shift by n bits is the same as division by 2^n. What step L3 does then is to find the smallest k such that x-z = x-x/2^k = x*(2^k-1)/2^k > 1, i.e. x > 2^k/(2^k-1). In other words, 2^k/(2^k-1) is the largest number of that form less than x. Then we subtract z from x (i.e. divide x by 2^k/(2^k-1)) and rerun step L3 on the result to obtain the next factor 2^k1/(2^k1-1), and so on. This gives us an infinite product representation of x.

The infinite product does converge to x, because the sequence of x's converges to 1. (To prove this rigorously, suppose xn is the nth term and kn is the k we get for xn at L3. It can be shown that kn is nondecreasing and approaches infinity, and since xn <= 2^(kn-1)/(2^(kn-1)-1) it approaches 1.)
--
With finite precision, the algorithm terminates because:
1. There are finitely many representable values between 1 and 2.
2. x-z is always strictly less than x.
1 is obvious and I leave it up to the reader to verify 2. This concludes the exercise.

1.2.2-27
--------------------
From the question Knuth defines
  x/10^n (1-d) <= x0' <= x/10^n (1+e),
  (xk')^2 (1-d) <= yk <= (xk')^2 (1+e).
Let's compute the upper bound for xk' step by step:
  x0'     <= x(1+e) / 10^n
  (x0')^2 <= x^2(1+e)^2 / 10^(2n)
  y0      <= x^2(1+e)^3 / 10^(2n)
Since x1' = y0/10 if y0 > 10, y0 otherwise, we have
  x1' <= x^2(1+e)^3 / 10^(2n+b1)
Similarly,
  x2' <= x^4(1+e)^7 / 10^(4n+2b1+b2)
  x3' <= x^8(1+e)^15 / 10^(8n+4b1+2b2+b3)
  ...
In general,
  {10^(log'x) * xk'^(1/2^k)}^(2^k) = 10^(2^k(n+b1/2+b2/4+...+bk/2^k)) xk' <= x^(2^k) (1+e)^(2^(k+1)-1) <= x^(2^k) (1+e)^(2^(k+1)),
so
  10^(log'x) <= x(1+e)^2 / xk'^(1/2^k) <= x(1+e)^2;
taking logarithms of both sides yields log'x <= log10(x) + 2log10(1+e).

As for the lower bound, we obtain by similar means
  10^(log'x) >= x(1-d)^2 / xk'^(1/2^k),
and note that 1 <= xk' < 10 implies 1/10^(1/2^k) < 1/xk'^(1/2^k) <= 1, so
  log'x > log10(x) + 2log10(1-d) - 1/2^k
as desired. Note how we made crucial use of the fact that xk' is bounded.
--
Now let's specialise to the example in the text. Numbers are rounded to 3 decimal places. Furthermore, all numbers in the computation lie in [1,10), so we can compute
  1+e = (maximum amount a number can be scaled up by rounding to 3 decimal places)   = 1.001 / 1.0005 ~~ 1.0005
  1-d = (maximum amount a number can be scaled down by rounding to 3 decimal places) = 1.000 / 1.0005 ~~ 0.9995.
The upper error is 2log10(1+e) ~~ 0.000434, and after many steps the lower error is around 2log10(1-d) ~~ -0.000434. This matches with Knuth's figure of 0.00044 for the error.
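As a sanity check on this bound, here is a short Python sketch (my own; the helper name approx_log10, the 30-bit cutoff and the sampling grid are arbitrary choices, and this is not Knuth's program). It runs the repeated-squaring decimal logarithm with each square rounded to 3 decimal places and measures the worst deviation from log10 over a grid of inputs in [1,10).

    import math

    def approx_log10(x, places=3, bits=30):
        # assumes 1 <= x < 10, so the integer part n of log10(x) is 0
        logx, weight = 0.0, 0.5
        for _ in range(bits):
            x = round(x * x, places)   # the rounding here is the source of the error e
            if x >= 10:
                x /= 10
                logx += weight         # record b_k = 1
            weight /= 2
        return logx

    worst = max(abs(approx_log10(v / 1000) - math.log10(v / 1000))
                for v in range(1000, 10000, 7))
    print(worst)   # should stay below roughly 0.00044, the bound computed above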
1.2.2-29
--------------------
(a) The derivative of b*log_b(x) = b*ln(x)/ln(b) with respect to b is ln(x)(ln(b)-1 / ln(b)^2); the stationary point occurs when b = e.

(b) Since 2 < e < 3, it suffices to see which of 2*log_2(x) and 3*log_3(x) is smaller. But we have 3*log_3(x) = 3*log_2(x)/log_2(3) ~~ 1.893*log_2(x) < 2*log_2(x), so b*log_b(x) is minimum at the integral value b=3.

(c) Suppose n is the minimum point. Then for all b we have
  (b+1)*log_b(x) = (b+1)*log_n(x)/log_n(b) >= (n+1)log_n(x)
  => b+1 >= (n+1)log_n(b)
  => n^(b+1) >= b^(n+1).
We can verify that this is the case for n=4:
  b=2: 4^3 >= 2^5
  b=3: 4^4 >= 3^5
What about the other b's? We return to calculus, using the fact that the derivative ln(x)/ln(b)^2 * (ln(b) - (b+1)/b) is positive for b>=4 (whereas it isn't for b=2,3). This is because ln(4) >= 5/4, and ln(b)-(b+1)/b is increasing (since ln(b) is increasing and (b+1)/b is decreasing).

1.2.3-16
--------------------
sum_{j=0}^n jx^j is the sum of the terms in the following triangle:
  x
  x^2 x^2
  x^3 x^3 x^3
  x^4 x^4 x^4 x^4
  x^5 x^5 x^5 x^5 x^5
  ...
  x^n x^n x^n x^n x^n ... x^n
The columns are geometric progressions, whose respective sums are
  x(x^n-1 / x-1), x^2(x^(n-1)-1 / x-1), x^3(x^(n-2)-1 / x-1), ...
The sum of these terms is
  sum_{i=1}^n x^i(x^(n-i+1)-1 / x-1)
  = 1/(x-1) [nx^(n+1) - sum_{i=1}^n x^i]
  = 1/(x-1) [nx^(n+1) - x(x^n-1 / x-1)]
  = 1/(x-1)^2 [nx^(n+1)(x-1) - x(x^n-1)]
  = 1/(x-1)^2 [nx^(n+2) - nx^(n+1) - x^(n+1) + x]
  = 1/(x-1)^2 [nx^(n+2) - (n+1)x^(n+1) + x]

1.2.3-29
--------------------
I will abbreviate the sum sum_{i=0}^n sum_{j=0}^i sum_{k=0}^j a_ia_ja_k as (0in 0ji 0kj) a_ia_ja_k, or (0in 0ji 0kj) for short. Let S be the above sum. Considering the 3! = 6 different ways to interchange the order of summation, we have
  6S = (0in 0ji 0kj) + (0in 0ki kji) + (0jn jin 0kj) + (0jn 0kj jin) + (0kn kin kji) + (0kn kjn jin)
     (relabelling summand variables so they appear in the order ijk)
     = (0in 0ji 0kj) + (0in 0ji jki) + (0in ijn 0ki) + (0in 0ji ikn) + (0in ijn ikj) + (0in ijn jkn)
     = (0in 0ji) [(0kj) + (jki) + (ikn)] + (0in ijn) [(0ki) + (ikj) + (jkn)]
     (swap i and j in the first term and interchange the order of summation)
     = 2 (0in ijn) [(0ki) + (ikj) + (jkn)]
The term is a sum over all i,j,k satisfying
1. 0<=i<=j<=n, and
2. 0<=k<=i or i<=k<=j or j<=k<=n, where the k=i and k=j cases are counted twice.
Thus we can rewrite it as
  2 [(0kn) (0ijn) a_ia_ja_k + (0ijn) (a_i^2 a_j + a_i a_j^2)],
and the first sum can be simplified as
  (0kn) (0ijn) a_ia_ja_k = [(0kn) a_k] [(0ijn) a_ia_j] = 1/2 [S1(S1^2 + S2)],
where Sk = (0in) a_i^k. Now letting the second sum be T, we have
  2T = (0in) (ijn) (a_i^2 a_j + a_i a_j^2) + (0jn) (0ij) (a_i^2 a_j + a_i a_j^2)
     = (0in) (ijn) (a_i^2 a_j + a_i a_j^2) + (0in) (0ji) (a_i^2 a_j + a_i a_j^2)
     (since the summand is symmetric under the swap (ij))
     = (0in) [(0jn) (a_i^2 a_j + a_i a_j^2) + 2 a_i^3]
     (the previous term is a sum over all 0<=i<=n and 0<=j<=n, with j=i counted twice)
     = (0in 0jn) a_i^2 a_j + (0in 0jn) a_i a_j^2 + 2 (0in) a_i^3
     = 2*S1*S2 + 2*S3.
Thus 6S = S1(S1^2 + S2) + 2*S1*S2 + 2*S3 = S1^3 + 3*S1*S2 + 2*S3.

1.2.4-36
--------------------
  sum_{k=1}^n floor(k/2) = sum_{k=1}^n k/2 - sum_{k=1}^n (k/2 mod 1)
                         = n(n+1)/4 - (1/2 + 0 + 1/2 + ...)
                         = n^2/4 + n/4 - ceil(n/2)/2
                         = (n^2 - [n odd]) / 4
Now since (n^2-[n odd])/4 <= n^2/4 < (n^2-[n odd])/4 + 1, it is equal to floor(n^2/4) as desired.
  sum_{k=1}^n ceil(k/2) = sum_{k=1}^n floor(k+1 / 2) = floor((n+1)^2 / 4)
Knuth's expression is ceil(n(n+2)/4), but they are equal because
1. n(n+2)/4 and (n+1)^2/4 differ by less than 1, and
2. either n(n+2) or (n+1)^2 is divisible by 4, so one of the two expressions is an integer; in other words there is an integer between n(n+2)/4 and (n+1)^2/4.
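Both closed forms above can be confirmed by brute force. The loop below is my own check, not part of the exercise; the ceiling is computed with the usual negated floor-division trick.

    # floor(n^2/4) is n*n//4; ceil(a/b) for positive a, b is -(-a//b)
    for n in range(1, 200):
        assert sum(k // 2 for k in range(1, n + 1)) == n * n // 4
        assert sum((k + 1) // 2 for k in range(1, n + 1)) == -(-(n * (n + 2)) // 4)
    print("ok")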
1.2.4-37
--------------------
  sum_{0<=k<n} floor((mk+x)/n) = sum (mk+x / n) - sum [(mk+x / n) mod 1]
                               = m(n-1)/2 + x - 1/n sum (mk+x mod n)     (*)
Let's first consider sum (mk mod n). The multiples of m generate an additive subgroup of Z/nZ where the gap between successive elements is d=(m,n). Thus it has n/d elements, and because we are taking n multiples, the total sum is
  d * (0 + d + 2d + ... + (n/d-1)d) = d^2 * 1/2 n/d (n/d-1) = nd(n/d-1)/2 = n(n-d)/2
Letting x' = x mod d, we then have
  sum (mk+x mod n) = d * (x' + (d+x') + (2d+x') + ... + ((n/d-1)d+x'))
                   = n(n-d)/2 + d(n/d)x'
                   = n[(n-d)/2 + (x mod d)].
Thus, we can continue from (*):
  ... = m(n-1)/2 + x - (n-d)/2 - (x mod d)
      = (m-1)(n-1)/2 + (d-1)/2 + x - x mod d
      = (m-1)(n-1)/2 + (d-1)/2 + d floor(x/d).

1.2.4-42
--------------------
Part (a) is very easy to prove.
  sum_{k=1}^n floor(log_b(k)) = n floor(log_b(n)) - sum_{k=1}^{n-1} k[floor(log_b(k+1)) - floor(log_b(k))]
Note that the bracketed term in the sum is 1 if k+1 is a power of b, and 0 otherwise. Thus the sum can be simplified to sum_{2<=b^l<=n} (b^l - 1).
--
(a). We want to show that ceil(m/n) = floor((m+n-1)/n). Since n>0 it suffices to show that
1. (m+n-1)/n - m/n < 1 (clearly true), and
2. there is always an integer between m/n and (m+n-1)/n, i.e. there is always a multiple of n between m and m+n-1. This is clearly true as well.
(b). By (a) the LHS can be rewritten ceil(n-floor(n/25) / 3). Then we want to prove the equality
  ceil(25n-25floor(n/25) / 75) = floor(24n+72 / 75),
where the denominators are made equal. We use the same method as (a). Firstly, the difference is
  1/75 (n-25floor(n/25) - 72),
and since 0 <= n-25floor(n/25) <= 24 we have the bounds
  -72/75 <= 1/75 (n-25floor(n/25) - 72) <= -48/75,
and in particular the difference is always less than 1. Next, we want to show that there always exists a multiple of 75 between 25n-25floor(n/25) = 24n + n-25floor(n/25) and 24n+72. It suffices to compare their residues mod 75, so it suffices to check this for n=0 to 74. In fact, it suffices to check n=0 to 24, because 24(25k+l) = 24l mod 75. But in the end I still needed to verify this via brute force.

Actually Knuth's solution is a lot cleaner, as he derives the equality simply by repeatedly applying previously proven formulas.

1.2.5-24
--------------------
Firstly, we have 1+x <= e^x for all real x, with equality at x=0. This can be seen in two ways: firstly, one can use the Maclaurin series e^x = 1 + x + (x^2/2) + (x^3/6) + ... to immediately conclude that e^x > 1+x for x>0. Another way is to look at derivatives: note that e^x>1 for x>0 and e^x<1 for x<0, where e^x and 1 are the derivatives of e^x and 1+x respectively.

Subbing in x=1/k and -1/k in the inequality 1+x <= e^x gives the following bounds for e^(1/k):
  (k+1 / k) <= e^(1/k) <= (k / k-1).
A consequence of this is that
  (n / n-k) = (n / n-1)(n-1 / n-2)...(n-k+1 / n-k) <= e^[(1 / n-1) + ... + (1 / n-k)].
Therefore,
  n^n / n! = (n / n-1)...(n / n-(n-1)) <= e^(n-1).
Similarly, (n / n-k) >= e^[(1/n) + ... + (1 / n-k+1)], and so n^(n+1)/n! >= e^(n-1). (This time the inequality for k=n-1 is applied twice.)
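The two bounds just derived can also be confirmed numerically; the following small check is mine, not part of the exercise, and verifies n^n/n! <= e^(n-1) <= n^(n+1)/n! for small n.

    from math import e, factorial

    for n in range(1, 60):
        lo = n ** n / factorial(n)         # shown above to be <= e^(n-1)
        hi = n ** (n + 1) / factorial(n)   # shown above to be >= e^(n-1)
        assert lo <= e ** (n - 1) <= hi
    print("ok")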
1.2.6-10
--------------------
(a). Let p be a prime and write n = ap+b where 0 <= b < p. Suppose that n mod p >= k mod p. Using (17 7) mod 3 as an example, write the numerator and denominator out as follows:
     17 16            7
  15 14 13    and   6 5 4     (multiples of 3 in the left column)
  12 11             3 2 1
We can divide off p from the first column, which preserves the value of (n k) mod p:
     17 16            7
   5 14 13    and   2 5 4
   4 11             1 2 1
In the right matrix, the top row is congruent to (k mod p)! while the first column is floor(k/p)!. As for the remaining submatrix
  5 4
  2 1,
each row is congruent to -1 by Wilson's theorem, so overall the denominator is congruent to
  (1)   (-1)^floor(k/p) (k mod p)! floor(k/p)!   mod p.
What about the numerator, i.e. the left matrix? Similarly, the top row is congruent to (n mod p)! while the first column is floor(n/p)!/floor((n-k)/p)!. The remaining contributions come from the matrix
  14 13
  11
We know that the topmost (floor(n/p) - floor((n-k)/p) - 1) rows are congruent to -1. As for the bottommost row, we can first fill in the missing entries, then divide by the entries we added. Using this approach, the bottom row is congruent to (-1)/((n-k) mod p)! mod p. Overall, the numerator is congruent to
  (-1)^(floor(n/p) - floor((n-k)/p)) * (n mod p)! * floor(n/p)! / [((n-k) mod p)! * floor((n-k)/p)!]   mod p.
Since n mod p >= k mod p, we have floor((n-k)/p) = floor(n/p) - floor(k/p) and (n-k) mod p = n mod p - k mod p. So this expression can be further simplified to
  (2)   (-1)^floor(k/p) * (n mod p)! * floor(n/p)! / [(n mod p - k mod p)! * (floor(n/p) - floor(k/p))!]
Taking the quotient (2)/(1) gives us (n k) = (n mod p k mod p) (floor(n/p) floor(k/p)) mod p, as desired.
--
(f). Follows easily from (e).

1.2.6-11
--------------------
Let mu(k) be the multiplicity of the prime factor p in the factorial k!, and s(k) be the sum of digits of k under the p-ary number system. It was proven in a previous exercise that mu(k) = (k-s(k))/(p-1). Now since (a+b a) = (a+b)! / a!b!, we have
  mu((a+b a)) = mu(a+b) - mu(a) - mu(b) = (s(a)+s(b)-s(a+b)) / (p-1).
Now consider the numbers a=17 and b=16, which are written as (1,2,2) and (1,2,1) respectively in the 3-ary number system. I will demonstrate adding a to b:
  (1,2,2)+(1,2,1) = (2,4,3)     (digit sum = s(a)+s(b) = 9)
                  = (3,1,3)     (carry; digit sum = s(a)+s(b)-(3-1) = 7)
                  = (1,0,1,3)   (carry; digit sum = s(a)+s(b)-2(3-1) = 5)
                  = (1,0,2,0)   (carry; digit sum = s(a)+s(b)-3(3-1) = 3)
Note how each carry decreases the digit sum by p-1. Thus the number of carries = (s(a)+s(b)-s(a+b)) / (p-1).

1.2.6-12
--------------------
Mod 2, every (n k) can be written as a product of (a b)'s, where a,b in {0,1}. For instance, we have 14 = 1*2^3 + 1*2^2 + 1*2^1 + 0*2^0 and 7 = 0*2^3 + 1*2^2 + 1*2^1 + 1*2^0, so
  (14 7) = (1 0)(1 1)(1 1)(0 1) mod 2.
Out of the four possibilities for (a b), only (0 1) = 0. So we want to prevent (0 1) from appearing in the RHS for all 0<=k<=n. That only happens when n has the form 2^0 + 2^1 + 2^2 + ... + 2^k. The first few such values are 1, 3, 7, 15.

1.2.6-16
--------------------
By symmetry, (k+n-2 n-1) = (k+n-2 k-1). Now we have
  (-1)^n (-n k-1) = (-1)^n (-n)(-n-1)...(-n-(k-2)) / (k-1)(k-2)...1
                  = (-1)^(n+k-1) n(n+1)...(n+k-2) / (k-1)(k-2)...1
                  = (-1)^(n+k-1) (n+k-2 k-1).
Similarly, we have (-1)^k (-k n-1) = (-1)^(n+k-1)(n+k-2 k-1).

1.2.6-18
--------------------
  (r+s choose r-m+n) = sum(k) (r (r-m+n)-k) (s k)
                     = sum(k') (r (r-m+n)-(n+k')) (s n+k')
                     = sum(k') (r r-m-k') (s n+k')
                     = sum(k') (r m+k') (s n+k').
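As a quick check of the congruence derived in 1.2.6-10 above, the following loop (my own, not part of the solution) verifies (n k) = (n mod p k mod p)(floor(n/p) floor(k/p)) mod p for p=3 and small n, which covers the (17 7) mod 3 example.

    from math import comb

    p = 3
    for n in range(60):
        for k in range(n + 1):
            lhs = comb(n, k) % p
            rhs = comb(n % p, k % p) * comb(n // p, k // p) % p
            assert lhs == rhs
    print("ok")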
1.2.6-19
--------------------
The first, third and fourth steps are probably wrong because the identities (19) and (22) require that s be an integer, which it isn't necessarily. This might be why induction on r is needed; I haven't checked the answer key yet. Regardless, the general idea should be to bring k down from the upper index to the lower index, so we can exploit previous summation identities.
  sum(k) (r k) (s+k n) (-1)^(r-k)
  = sum(k) (r k) (-n-1 s+k-n) (-1)^((s+k-n)+(r-k))    by (19)
  = (-1)^(r+s-n) * sum(k) (r k) (-n-1 s-n+k)
  = (-1)^(r+s-n) * (r-n-1 r+s-n)                      by (22)
  = (s n-r)                                           by (19).

1.2.6-20
--------------------
  sum(k=0 to r) (r-k m) (s k-t) (-1)^(k-t)
  = sum(k=0 to r) (-m-1 r-k-m) (s k-t) (-1)^((r-k-m)+(k-t))    by (19)
  = (-1)^(r-m-t) * sum(k-t=-t to r-t) (-m-1 r-m-t-(k-t)) (s k-t)
  = (-1)^(r-m-t) * (s-m-1 r-m-t)                               by (21)
  = (s-m-1 r-m-t)                                              by (17).
Note the application of (21) is legitimate since k-t runs from -t to r-t, and in particular it covers 0 to r-m-t. We also have
  sum(k=0 to r) (r-k m) (s+k n)
  = sum(k=0 to r) (r-k m) (-n-1 k-(n-s)) (-1)^(k-(n-s))
  = (r-(n-s)-(-n-1) r-(n-s)-m)    by (24)
  = (r+s+1 r-n+s-m)
  = (r+s+1 m+n+1)                 by symmetry.

1.2.6-21
--------------------
For reference, the equation in question is
  sum(k=0 to r) (r-k m)(s+k n) = (r+s+1 m+n+1),
where n, s, m and r are nonnegative integers and n >= s. If n, m and r are fixed, then LHS and RHS agree for only n+1 values of s, since n >= s >= 0. But LHS is a sum of degree n polynomials and hence a degree n polynomial itself, while RHS is a degree m+n+1 polynomial. Thus this is not enough to prove that LHS and RHS agree at all points.

1.2.6-22
--------------------
Mildly funny story behind this question: I tried bashing it out for a week until I gave up and peeked at the solution. Then I realised it made use of material I hadn't even gone through yet!

For reference, the equation is
  sum(k>=0) (r-tk k) (n-1-r+tk n-k) (r / r-tk) = (r+s-tn n),
where n is an integer. (This is the case s = n-1-r+tn of (26); then r+s-tn = n-1, so the right-hand side is (n-1 n) = 0 for positive n.) The key trick is to use the inversion identity (34):
  sum(k>=0) (r-tk k) (n-1-r+tk n-k) (r / r-tk)
  = sum(k>=0) [ (r/n!)(n k) * prod(0<j<k) (r-tk-j) * prod(0<=j<n-k) (n-1-r+tk-j) ]
  = (r/n!) sum(k>=0) [ (n k) (-1)^(n-k) prod(0<j<n) (r-tk-j) ]
  = (r/n!) sum(k>=0) [ (n k)(-1)^(n-k) * (deg n-1 polynomial in k) ]
  = 0 by (34).

1.2.6-23
--------------------
For reference, the equation is
  sum(k>=0) (r-tk k) (s-t(n-k) n-k) (r / r-tk) = (r+s-tn n).
This is a straightforward application of the addition formula:
  sum(k>=0) (r-tk k) ((s+1)-t(n-k) n-k) (r / r-tk)
  = sum(k>=0) (r-tk k) (s-t(n-k) n-k) (r / r-tk) + sum(k>=0) (r-tk k) (s-t(n-k) (n-1)-k) (r / r-tk)    by addition formula
  = (r+s-tn n) + sum(k>=0) (r-tk k) . ((s-t)-t((n-1)-k) (n-1)-k) . (r / r-tk)
  = (r+s-tn n) + (r+(s-t)-t(n-1) n-1)
  = (r+s-tn n) + (r+s-tn n-1)
  = (r+(s+1)-tn n)    by addition formula.

1.2.6-24
--------------------
I don't get this problem. How is the identity true for all r, s and t and all integer n, purely from that `inductive step' and the base case s=n-1-r+nt alone? Regardless, note that (26) is a *massive* generalisation of (21) with the extra terms tk and t(n-k).

1.2.6-30
--------------------
For reference, Example 3 is the sum
  sum(k) (n+k m+2k) (2k k) (-1)^k / k+1
where m and n are positive integers. To transform it into a form suitable for (26), one only needs to compare the denominator k+1 with the corresponding expression r-tk. This suggests t=-1 and r=1 --- the actual substitutions made are t=1 and r=-1, so the derivation is smoother. The second clue is to transform the lower indices m+2k and k into something resembling n-k and k. The last clue is to get rid of the (-1)^k via (17).
  sum(k) (n+k m+2k) (2k k) (-1)^k / k+1
  = sum(k) (n+k n-m-k) (-k-1 k) / k+1                            by symmetry and (17)
  = sum(k) (-1-k k) . ((2n-m)-((n-m)-k) (n-m)-k) . -1 / -1-k     [r'=-1, t'=1, s'=2n-m, n'=n-m]
  = (n-1 n-m)                                                    by (26)
  = (n-1 m-1).
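The closed form just obtained can be spot-checked numerically. The sketch below is mine, not part of the solution; it sums the series with exact rational arithmetic for small m and n.

    from fractions import Fraction
    from math import comb

    # check: sum_k C(n+k, m+2k) C(2k, k) (-1)^k / (k+1) = C(n-1, m-1)
    for m in range(1, 10):
        for n in range(1, 10):
            total = sum(Fraction(comb(n + k, m + 2 * k) * comb(2 * k, k) * (-1) ** k, k + 1)
                        for k in range(n + 1))
            assert total == comb(n - 1, m - 1)
    print("ok")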
1.2.6-31
--------------------
  sum(k) (m-r+s k) (n+r-s n-k) (r+k m+n)
  = sum(j,k) (m-r+s k) (k j) (n+r-s n-k) (r m+n-j)                  by hint
  = sum(j,k) (m-r+s j) (m-r+s-j k-j) (n+r-s n-k) (r m+n-j)          by (20)
  = sum(j) ( (m-r+s j) (r m+n-j) sum(k) (m-r+s-j k-j) (n+r-s n-j-(k-j)) )
  = sum(j) (m-r+s j) (r m+n-j) (m+n-j n-j)                          by (21)
  = sum(j) (m-r+s j) (r m+n-j) (m+n-j m)                            by symmetry
  = sum(j) (m-r+s j) (r m) (r-m n-j)                                by (20)
  = (r m)(s n)                                                      by (21).

1.2.6-33
--------------------
(I write x^k` to denote a rising power.) Proceed by induction. The base case is trivially true, so suppose the identity holds for n. Then we have
  (x+y)^(n+1)` = (x+y+n) sum(k) (n k) x^k` y^(n-k)`.
Note that by the identity x x^k` = x^(k+1)` - k x^k`, we have
  x sum(k) (n k) x^k` y^(n-k)` = sum(k) (n k) x^(k+1)` y^(n-k)` - sum(k) k (n k) x^k` y^(n-k)`
and similarly
  y sum(k) (n k) x^k` y^(n-k)` = sum(k) (n k) x^k` y^(n-k+1)` - sum(k) (n-k) (n k) x^k` y^(n-k)`
Thus
  (x+y+n) sum(k) (n k) x^k` y^(n-k)` = sum(k) (n k) x^(k+1)` y^(n-k)` + sum(k) (n k) x^k` y^(n-k+1)`.
Transforming the sum index and applying the addition formula simplifies it to
  sum(k) (n+1 k) x^k` y^(n+1-k)`,
as desired.

1.2.6-35
--------------------
(I write x^k@ to denote a falling power.) The Stirling numbers of the first kind have a nice interpretation. Recall they are defined by the formula
  x(x+1)...(x+n-1) = [n n]x^n + [n n-1]x^(n-1) + ... + [n 0].
From this we obtain a formula for the kth coefficient:
  [n k] = sum(a1,...,a(n-k)) a1...a(n-k),
where the ai's are distinct numbers in {1,...,n-1}. Every possible set of ai's in {1,...,n} falls under two types: those that contain n and those that don't. The identity [n+1 k] = n[n k] + [n k-1] easily follows from this fact.

As for the Stirling numbers of the second kind, a more manual approach is used. Recall that they are defined by the formula
  x^n = sum(k) {n k} x^k@
Multiplying through by x gives
  x^(n+1) = sum(k) {n k} x * x^k@
          = sum(k) {n k} [x^(k+1)@ + k x^k@]
          = sum(k) {n k} x^(k+1)@ + sum(k) {n k} k x^k@
          = sum(k') {n k'-1} x^k'@ + sum(k) {n k} k x^k@
          = sum(k) [{n k-1} + k{n k}] x^k@
At the same time we have x^(n+1) = sum(k) {n+1 k} x^k@ by definition, so {n+1 k} = k{n k} + {n k-1}.

1.2.6-36
--------------------
sum(k)(n k) is equal to 2^n, the total number of subsets of {1,...,n}. The alternating sum sum(k) (-1)^k (n k) can be evaluated using the addition formula:
  sum(k) (n k)(-1)^k = sum(k) [(n-1 k) + (n-1 k-1)] (-1)^k
                     = sum(k) (n-1 k)(-1)^k + sum(k) (n-1 k-1)(-1)^k
                     = sum(k) (n-1 k)(-1)^k + sum(k') (n-1 k')(-1)^(k'+1)
                     = 0.
Note that the result is also immediate for odd n due to symmetry.

1.2.6-37
--------------------
  sum(k even)(n k) = 1/2 [sum(k)(n k) + sum(k)(-1)^k(n k)] = 2^(n-1).

1.2.6-40
--------------------
(a). Follows easily by substituting t=1-u.
(b). This is analogous to the addition formula for binomial coefficients. It also follows easily by elementary manipulations.
(c). I prove the equivalent statement B(x+1,y) = x/y B(x,y+1) using integration by parts:
  B(x+1,y) = int(0 to 1) t^x (1-t)^(y-1) dt
           = (-(1/y) 1^x (1-1)^y) - (-(1/y) 0^x (1-0)^y) + (x/y) * int(0 to 1) t^(x-1) (1-t)^y dt
           = x/y B(x,y+1).
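The recurrence in (c) can be sanity-checked numerically. The sketch below is mine; it evaluates B through the Gamma-function representation that 1.2.6-41 below establishes, so it is only a consistency check rather than an independent proof.

    from math import gamma, isclose

    def beta(x, y):
        # Gamma representation of the Beta function (cf. 1.2.6-41 below)
        return gamma(x) * gamma(y) / gamma(x + y)

    for x in (0.5, 1.0, 2.3, 4.7):
        for y in (0.5, 1.0, 2.3, 4.7):
            assert isclose(beta(x + 1, y), (x / y) * beta(x, y + 1), rel_tol=1e-12)
    print("ok")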
1.2.6-41
--------------------
For reference,
  Gamma_m(x) = m^x m! / x(x+1)(x+2)...(x+m)   and   B(x,y+1) = (y / x+y) B(x,y).
The statement Gamma_m(x) = m^x B(x,m+1) follows immediately by repeated application of the latter formula.

(a). Note that
  B(x,y) / B(x,y+m+1) = (x+y / y) * (x+y+1 / y+1) ... (x+y+m / y+m).
The RHS is also easily seen to be the latter expression.

(b). Since (a) holds for all m, the limit
  lim(m->infty) (Gamma_m(y) m^x / Gamma_m(x+y)) * B(x,y+m+1)
exists. It is equal to Gamma(y)/Gamma(x+y) lim(m->infty) m^x B(x,m+1), which in turn is equal to Gamma(x)Gamma(y)/Gamma(x+y). Note how y+m+1 was substituted for m+1, because both approach infty as m->infty.

Motivation for 1.2.6-40,41: Beta distribution
--------------------
The Beta distribution Beta(a,b) is parametrised by positive reals a,b. It is supported on [0,1] and its pdf is given by x^(a-1)(1-x)^(b-1) / B(a,b).

Suppose you have a (possibly unfair) coin, and you want to determine the probability p of landing heads based on a sample of heads and tails. Using the Bayesian approach, we can model p by the prior distribution Beta(a,b). The specific values of a and b depend on our prior assumptions: if we initially assume that the coin is fair, we might set a=b=2, which yields a bell-shaped distribution symmetric about x=0.5. Setting a=b=1 instead yields the uniform distribution on [0,1], which is also symmetric but expresses no preference and has higher variance. If we assume instead that the coin is skewed towards heads, we set a>b. In general, a and b represent the `initial number' of heads and tails to factor into the prior.

By Bayes' theorem, we then calculate the posterior distribution after a sample of m heads and n tails. It turns out to be another Beta distribution Beta(a+m,b+n)! This statement has a very nice interpretation: we are factoring in m extra heads and n extra tails into the distribution after the sample. Since the prior and posterior distributions are of the same type, we say that Beta(a,b) is a conjugate prior.

1.2.6-42
--------------------
This is a trivial matter of replacing the factorials with Gamma's:
  r! / k!(r-k)! = Gamma(r+2) / (r+1)Gamma(k+1)Gamma(r-k+1) = 1 / (r+1)B(k+1,r-k+1).

1.2.6-54
--------------------
The ij-entry of Pascal's triangle as a matrix is C(i,j). I claim that the ij-entry of the inverse is (-1)^{i+j}C(i,j). For the ij-entry of their product is
  sum_k (-1)^{k+j} C(i,k) C(k,j)
  = C(i,j) sum_k (-1)^{k+j} C(i-j,k-j)    (Eq 20)
  = C(i,j) sum_l (-1)^l C(i-j,l)
  = C(i,j) (1-1)^{i-j}                    (Binomial theorem)
  = [i=j]

1.2.6-56
--------------------
I list out the coefficients C(n,1), C(n,2) and C(n,3) as n varies:
  n    C(n,1)  C(n,2)  C(n,3)
  0    0       0       0
  1    1       0       0
  2    2       1       0
  3    3       3       1
  4    4       6       4
  5    5       10      10
  6    6       15      20
  7    7       21      35
  ...  ...     ...     ...
Suppose we have a number N = C(i,1)+C(j,2)+C(k,3) = (i,j,k) where i < j < k.
--
I write ((n k)) to denote the number of k-combinations out of n objects with repetitions. First, it helps to look at small values of k:
1. Clearly there are only (n 1) = n 1-combinations.
2. There are (n 2)+(n 1) 2-combinations. Either the 2-combination has two distinct elements (like ab), so there are (n 2) of them, or it is the same element repeated twice (like aa), so there are (n 1) of them.
3. There are (n 3)+2(n 2)+(n 1) 3-combinations. The first and last terms handle the cases where the 3-combination has 3 distinct elements, or is the same element repeated thrice. The interesting bit is when there are 2 distinct elements, like a and b. When a and b are given, we can form 2 distinct 3-combinations, namely aab and abb. So that gives us 2(n 2) distinct 3-combinations with 2 distinct elements.
By now, a clear pattern should emerge: the coefficients appearing in ((n k)) are binomial coefficients themselves! Now let's derive ((n k)) in general. Let Aki be the number of distinct k-combinations we can form given i letters a,b,..., such that each letter appears at least once. We can check that it satisfies the relations Ak1 = 1, Aki = 0 for k < i, and Aki = A(k-1)(i-1) + A(k-1)i otherwise.
These relations are enough to establish that Aki = (k-1 i-1). Thus
  ((n k)) = sum(1<=i<=n) Aki (n i) = sum(1<=i<=n) (k-1 k-i)(n i) = (n+k-1 k).
--
Side note: there's an alternative solution using a bijection. Given n boxes, you can allocate k identical balls into the boxes to obtain a configuration. For example, with n=4 and k=3 we might have the configuration
  oo _ _ o
which corresponds to aad. Each configuration is identical to an arrangement of the k balls and n-1 separators in a row:
  oo | | | o
There are n+k-1 separators and objects in total, corresponding to n+k-1 slots. At each slot, one can insert either a separator or an object, yielding (n+k-1)! permutations. But since all separators and all objects are identical, the true number of configurations is (n+k-1)! / (n-1)!k! = (n+k-1 k).

1.2.6-64
--------------------
To establish that {n m} is the number of m-subset partitions of {1,...,n}, we handle the boundary conditions first: {n 1} = {n n} = 1, {n 0} = 0. Now, each partition of {1,...,n+1} into m subsets falls under two cases: either {n+1} is a subset by itself, or it is contained in a larger set in the partition. The former case gives us {n m-1} partitions. As for the latter case, note that every m-subset partition of {1,...,n} induces m possible partitions of {1,...,n+1} by adding n+1 to any one of the m partition sets. Thus this contributes m{n m} partitions. This establishes the relation {n+1 m} = m{n m} + {n m-1}.

1.2.6-67
--------------------
Using the inequality k! >= k^k / e^(k-1), which was proved in 1.2.5-24, we get
  (n k) = n(n-1)...(n-k+1) / k! <= n^k e^(k-1) / k^k < (ne/k)^k.

1.2.7-3
--------------------
  H(m,r) <= 1 + (1 - 2^(m(1-r))) / (1 - 2^(1-r)),
thus H(infty,r) <= 1 + 1/(1-2^(1-r)).

1.2.7-5
--------------------
True to the spirit of the exercise, I will do the computation by hand without using any calculator or computer program. Let n=10^4 and recall that
  Hn = ln n + gamma + 1/(2n) - 1/(12n^2) + 1/(120n^4) - e,   0 < e < 1/(252n^6)
We can ignore e and 1/(120n^4) because 1/(252n^6) < 1/(120n^4) < 1/10^18, which is well within the allowance of 15 decimal places. Now to do the actual computation (using 20 decimal places just to be safe):
  ln 10          2.30258 50929 94045 68401
  * 4            ---------------------------
  ln 10000       9.21034 03719 76182 73604
  gamma        + 0.57721 56649 01532 86060
                 ---------------------------
                 9.78755 60368 77715 59664
  1/(2n)       + 0.00005 00000 00000 00000
                 ---------------------------
                 9.78760 60368 77715 59664
  1/(12n^2)    - 0.00000 00008 33333 33333
                 ---------------------------
  H10000         9.78760 60360 44382 26331.

1.2.7-6
--------------------
One can directly apply the formulas [n+1 2] = n[n 2] + [n 1], [n 1] = (n-1)!; but I'll instead derive the solution using Knuth's definition
  (x-0)(x-1)...(x-n) = [n+1 n+1]x^(n+1) - [n+1 n]x^n + ... +- [n+1 2]x^2 -+ [n+1 1]x +- [n+1 0].
Comparing coefficients, we see that [n+1 2] is the sum over products of size n-1 subsets of {0,1,...,n}. This is equal to the sum over products of size n-1 subsets of {1,...,n}, which is equal to sum_(1<=k<=n) n!/k. Thus Hn = [n+1 2]/n! as expected.

1.2.7-7
--------------------
(a) By symmetry and induction we only need to prove that T(m+1,n) <= T(m,n). Indeed, the difference of RHS and LHS is
  1/(mn+1) + 1/(mn+2) + ... + 1/((m+1)n) - 1/(m+1) >= n * 1/((m+1)n) - 1/(m+1) = 0.
The maximum value of T(m,n) is T(1,1)=1. There is no minimum value, but I believe the limit of T(m,n) as m,n->infty is gamma, because for suff. large m,n we have Hm approx log m + gamma, Hn approx log n + gamma, H(mn) approx log m + log n + gamma.
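As a cross-check of the hand computation in 1.2.7-5 above, the following snippet (my own) sums the harmonic series with exact rational arithmetic and compares the first 15 decimal places, which is the accuracy the exercise asks for.

    from fractions import Fraction

    h = sum(Fraction(1, k) for k in range(1, 10001))   # exact H_10000
    # 9.787606036044382... agrees with the hand computation above
    assert h.numerator * 10**15 // h.denominator == 9787606036044382
    print("ok")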
1.2.7-9
--------------------
Expand (n k) = (n-1 k) + (n-1 k-1) and observe that a lot of cancellation happens. The answer should be -1/n.

1.2.7-12
--------------------
1/2^1000 is 0 up to 1000*log10(2) ~~ 300 decimal places. Thus summing 1/2^1000 + 1/3^1000 + ... will clearly have no impact on the 100th decimal place. So H(infty,1000) = 1.000...0 up to 100 decimal places.

1.2.7-13
--------------------
Write (n k) = (n-1 k) + (n-1 k-1).

1.2.7-15
--------------------
Write (k1n) for sum(k=1 to n). Then
  (k1n) Hk^2 = (k1n)(l1k) [1/l (m1k) 1/m]
             = (l1n) [1/l (kln)(m1k) 1/m]    (the magic happens here!)
             = (l1n) [1/l * (m1n) (k max(l,m) n) 1/m]
             = (l1n) [(m1l)(kln) 1/m + (m l+1 n)(kmn) 1/m] / l
             = (l1n) [(n-l+1)Hl + (n+1)(Hn-Hl) - (n-l)] / l
             = ...    (straightforward simplifications)
             = (n+1) Hn^2 - (2n+1) Hn + 2n.

1.2.7-18
--------------------
From numerical computations I have observed that the 2-valuation of
  sum(i=1 to 2n-1 odd) 1/i     (*)
is equal to 2*(2-valuation of n), and we just need to prove this. Let n=m*2^k where m is odd. Define
  P(l) = prod(1<=i<=2^(k+1)-1 odd) l*2^(k+1)+i
  S(l) = sum(1<=i<=2^(k+1)-1 odd) P(l) / (l*2^(k+1)+i) = sum(1<=i<=2^(k+1)-1 odd) prod(j, j!=i) l*2^(k+1)+j.
Multiplying (*) through by (2n-1)!!, the resulting numerator is S(0) prod(0<=l