If you find any mistakes, please make a comment! Thank you.

## Chapter 8 Exercise B

1. Solution: By 8.21 (a), $V = G(0, N)$. Since $G(0, N) = \operatorname{null} N^{\operatorname{dim} V}$ (see 8.11), it follows that $N^{\operatorname{dim} V} = 0$ and so $N$ is nilpotent.

2. Solution: Define $T \in \mathcal{L}(\mathbb{R}^3)$ by
$$T(x, y, z) = (0, -z, y).$$ That is, $T$ squashes vectors onto the $yz$ plane and rotates them counterclockwise by $\pi/2$ radians. So all eigenvectors of $T$ are contained in the $x$-axis and correspond to the eigenvalue $0$. $T$ obviously is not nilpotent.
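The claim that $T$ is not nilpotent can be checked by hand, since a nilpotent operator on a $3$-dimensional space would satisfy $T^3 = 0$. Here is a minimal sketch of that check in plain Python; the helper names `matmul` and `power` are ours, and the matrix is taken with respect to the standard basis of $\mathbb{R}^3$.

```python
# Matrix of T(x, y, z) = (0, -z, y) with respect to the standard basis of R^3.
# Columns are the images of the standard basis vectors.
T = [[0, 0, 0],
     [0, 0, -1],
     [0, 1, 0]]

def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def power(A, k):
    """Return A^k for an integer k >= 0 by repeated multiplication."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, A)
    return result

# If T were nilpotent, T^(dim V) = T^3 would be the zero matrix.
not_nilpotent = any(entry != 0 for row in power(T, 3) for entry in row)
print(not_nilpotent)
```

In fact $T^2(x, y, z) = (0, -y, -z)$, so $T^4$ acts as the identity on the $yz$ plane and no power of $T$ vanishes.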

3. Solution: Suppose $\lambda$ is an eigenvalue of $T$ and $v \in V$ a corresponding eigenvector. $S$ is surjective, so there exists $u \in V$ such that $Su = v$; note that $u \neq 0$ because $v \neq 0$. Then
$$S^{-1}TSu = S^{-1}Tv = \lambda S^{-1}v = \lambda u,$$ which shows that $\lambda$ is an eigenvalue of $S^{-1}TS$. Hence every eigenvalue of $T$ is an eigenvalue of $S^{-1}TS$. We will prove these eigenvalues have the same multiplicity and it will follow that $S^{-1}TS$ cannot have other eigenvalues (by 8.26).

Suppose $\lambda_1, \dots, \lambda_m$ are the distinct eigenvalues of $T$. Fix $k \in \{1, \dots, m\}$. Let $v_1, \dots, v_d$ be a basis of $G(\lambda_k, T)$. There exist $u_1, \dots, u_d \in V$ such that $Su_j = v_j$ for each $j = 1, \dots, d$. It is easy to check that the $u$’s are linearly independent (apply $S$ to any linear combination of them that equals $0$). We have
$$G(\lambda_k, S^{-1}TS) = \operatorname{null} (S^{-1}TS - \lambda_k I)^{\operatorname{dim} V} = \operatorname{null} S^{-1}(T - \lambda_k I)^{\operatorname{dim} V}S,$$ where the first equality comes from 8.11 and the second from Exercise 5 in section 5B. For each $j$, we have
$$S^{-1}(T - \lambda_k I)^{\operatorname{dim} V}Su_j = S^{-1}(T - \lambda_k I)^{\operatorname{dim} V}v_j = 0,$$ where the second equality follows because $v_j \in G(\lambda_k, T)$. This shows that $u_1, \dots, u_d \in G(\lambda_k, S^{-1}TS)$. Hence
$$\operatorname{dim} G(\lambda_k, S^{-1}TS) \ge d = \operatorname{dim} G(\lambda_k, T). \tag{1}$$ By 8.26, we must have
$$\operatorname{dim} G(\lambda_1, T) + \dots + \operatorname{dim} G(\lambda_m, T) = \operatorname{dim} V \tag{2}$$ and
$$\operatorname{dim} G(\lambda_1, S^{-1}TS) + \dots + \operatorname{dim} G(\lambda_m, S^{-1}TS) \le \operatorname{dim} V. \tag{3}$$ $(1)$ and $(2)$ imply that $(3)$ is only possible if $$\operatorname{dim} G(\lambda_k, S^{-1}TS) = \operatorname{dim} G(\lambda_k, T).$$ Hence, their multiplicities are the same and $S^{-1}TS$ cannot have other generalized eigenspaces (the ones shown here already eat up the dimension of $V$).

4. Solution: By the same reasoning used in the proof of 8.4, it follows that $\operatorname{dim} \operatorname{null} T^{n-1} \ge n - 1$. But $\operatorname{null} T^{n-1} \subset \operatorname{null} T^n = G(0, T)$ (see 8.2 and 8.11). Thus $\operatorname{dim} G(0, T) \ge n - 1$ and $0$ is an eigenvalue of $T$.

If $\operatorname{dim} G(0, T) = n$, 8.26 shows that $0$ is the only eigenvalue of $T$. If $\operatorname{dim} G(0, T) = n - 1$, there is only space for one more eigenvalue, with multiplicity $1$. In either case $T$ has at most two distinct eigenvalues.

5. Solution: Every eigenvector is also a generalized eigenvector, so in the forward direction we can make a similar argument to that of Exercise 3, where the dimensions only fit if each eigenspace equals its corresponding generalized one (because the former is a subset of the latter). The other direction is obvious from 8.23.

6. Solution: The formula for $N$ doesn’t really matter here. We only care about the dimension of $\mathbb{F}^{5}$, which is $5$. Using the same reasoning from the proof of 8.31, and because $N^j = 0$ for $j \ge 5$, we have
$$\begin{aligned} I + N &= (I + a_1N + a_2N^2 + a_3N^3 + a_4N^4)^2\\ &= I + 2a_1N + (2a_2 + a_1^2)N^2 + (2a_3 + 2a_1a_2)N^3 + (2a_4 + 2a_1a_3 + a_2^2)N^4 \end{aligned}$$ for some $a_1, a_2, a_3, a_4 \in \mathbb{F}$.

Choose
$$a_1 = \frac{1}{2},\quad a_2 = \frac{-1}{8},\quad a_3 = \frac{1}{16},\quad a_4 = \frac{-5}{128}$$ and the terms on the second line will collapse to $N + I$. Hence
$$I + \frac{1}{2}N - \frac{1}{8}N^2 + \frac{1}{16}N^3 - \frac{5}{128}N^4$$ is a square root of $N + I$.
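The coefficients can be sanity-checked with exact rational arithmetic. The sketch below is ours and not part of the original solution: it picks an arbitrary strictly upper-triangular $5$-by-$5$ matrix as a stand-in for $N$ (any such matrix satisfies $N^5 = 0$, which is all the argument uses) and confirms that the candidate squares to $I + N$. The helper names `matmul` and `poly_in` are hypothetical.

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_in(N, coeffs):
    """Return coeffs[0]*I + coeffs[1]*N + coeffs[2]*N^2 + ... as a matrix."""
    n = len(N)
    current_power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    total = [[Fraction(0)] * n for _ in range(n)]
    for c in coeffs:
        total = [[total[i][j] + c * current_power[i][j] for j in range(n)]
                 for i in range(n)]
        current_power = matmul(current_power, N)
    return total

# Assumption: an arbitrary strictly upper-triangular matrix, so N^5 = 0.
N = [[0, 1, 2, 0, 1],
     [0, 0, 3, 1, 0],
     [0, 0, 0, -1, 2],
     [0, 0, 0, 0, 4],
     [0, 0, 0, 0, 0]]

# 1, a_1, a_2, a_3, a_4 from the solution above.
coeffs = [Fraction(1), Fraction(1, 2), Fraction(-1, 8),
          Fraction(1, 16), Fraction(-5, 128)]

root = poly_in(N, coeffs)
is_square_root = matmul(root, root) == poly_in(N, [1, 1])
print(is_square_root)
```

Using `Fraction` avoids floating-point rounding, so the comparison with $I + N$ is exact.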

7. Solution: One can use the same strategy as in the proof of 8.31 to show that $I + N$ has a cube root for any nilpotent $N \in \mathcal{L}(V)$, and the rest of the proof is the same as the proof of 8.33.
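To make the cube-root step concrete, here is a hedged sketch for the case $N^5 = 0$. Instead of solving for the coefficients one at a time as in the proof of 8.31, it takes them from the binomial series of $(1 + x)^{1/3}$, which yields the same numbers; that substitution, the sample matrix, and the helper names `binomial_series`, `matmul` and `poly_in` are our assumptions, not part of the original solution.

```python
from fractions import Fraction

def binomial_series(alpha, terms):
    """First `terms` coefficients of the power series of (1 + x)^alpha."""
    coeffs = [Fraction(1)]
    for k in range(1, terms):
        coeffs.append(coeffs[-1] * (alpha - (k - 1)) / k)
    return coeffs

def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_in(N, coeffs):
    """Return coeffs[0]*I + coeffs[1]*N + coeffs[2]*N^2 + ... as a matrix."""
    n = len(N)
    current_power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    total = [[Fraction(0)] * n for _ in range(n)]
    for c in coeffs:
        total = [[total[i][j] + c * current_power[i][j] for j in range(n)]
                 for i in range(n)]
        current_power = matmul(current_power, N)
    return total

# Assumption: an arbitrary strictly upper-triangular 5x5 matrix, so N^5 = 0.
N = [[0, 2, 1, 0, 3],
     [0, 0, 1, 1, 0],
     [0, 0, 0, 5, 1],
     [0, 0, 0, 0, 2],
     [0, 0, 0, 0, 0]]

cube_coeffs = binomial_series(Fraction(1, 3), 5)
root = poly_in(N, cube_coeffs)
is_cube_root = matmul(matmul(root, root), root) == poly_in(N, [1, 1])
print(cube_coeffs)
print(is_cube_root)
```

The recursion divides by $k$ and, implicitly, by $3$, so this only makes sense over a field such as $\mathbb{R}$ or $\mathbb{C}$ where those numbers are invertible.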

8. Solution: If $0$ is not an eigenvalue of $T$, then $T^j$ is injective and surjective for all integers $j$, which gives the desired result (take $j = n-2$).

Suppose $0$ is an eigenvalue of $T$. Since $3$ and $8$ are also eigenvalues of $T$, by 8.26 the multiplicity of $0$, namely $\operatorname{dim} G(0, T)$, which equals $\operatorname{dim} \operatorname{null} T^n$, is at most $n - 2$. By the same reasoning used in the proof of 8.4, we have $\operatorname{null} T^{n-2} = \operatorname{null} T^n$: we know $\operatorname{null} T^{n-2} \subset \operatorname{null} T^n$, and if the null spaces of the powers of $T$ were still growing at the power $n - 2$, then $\operatorname{dim} \operatorname{null} T^{n-1}$ would be at least $n - 1$. Exercise 19 of section 8A implies that $\operatorname{range} T^{n-2} = \operatorname{range} T^n$. Now 8.5 completes the proof.

9. Solution: Keep in mind that when we mention the size of an $n$-by-$n$ matrix here we mean $n$ and not $n$ times $n$.

Let $V$ be a vector space whose dimension equals the size of $A$ (or $B$, since they’re the same). Choose a basis of $V$ and define $S, T \in \mathcal{L}(V)$ such that $\mathcal{M}(S) = A$ and $\mathcal{M}(T) = B$. Then $\mathcal{M}(ST) = AB$.

Let $d_j$ equal the size of $A_j$ (or $B_j$, because they’re the same). Consider the list consisting of the first $d_1$ vectors in the chosen basis. The block forms of $A$ and $B$ show that the span of these vectors is invariant under $S$ and $T$.

Similarly, the span of the next $d_2$ vectors after this list is also invariant under $S$ and $T$. Continuing in this fashion, we see that there are $m$ distinct lists of consecutive vectors, with no intersections, in the chosen basis whose spans are invariant under $S$ and $T$.

Let $U_1, \dots, U_m$ denote such spans. Clearly $\mathcal{M}(S|_{U_j}) = A_j$ and $\mathcal{M}(T|_{U_j}) = B_j$ for each $j$. Hence $\mathcal{M}(S|_{U_j}T|_{U_j}) = A_jB_j$ and so it is easy to see that $\mathcal{M}(ST)$ (which equals $AB$) has the desired form.
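A small numeric instance may help make the block structure visible. The sketch below is illustrative only: the blocks `A1`, `A2`, `A3`, `B1`, `B2`, `B3` are made-up examples, and `block_diag` and `matmul` are hypothetical helper names. It builds two block diagonal matrices with matching block sizes and compares their product with the block diagonal matrix of the blockwise products.

```python
def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def block_diag(blocks):
    """Place square blocks along the diagonal of an otherwise zero matrix."""
    n = sum(len(block) for block in blocks)
    M = [[0] * n for _ in range(n)]
    offset = 0
    for block in blocks:
        d = len(block)
        for i in range(d):
            for j in range(d):
                M[offset + i][offset + j] = block[i][j]
        offset += d
    return M

# Made-up blocks of sizes 2, 1 and 3 (A_j and B_j have the same size).
A1, A2, A3 = [[1, 2], [3, 4]], [[5]], [[0, 1, 0], [2, 0, 1], [1, 1, 1]]
B1, B2, B3 = [[0, 1], [1, 1]], [[7]], [[1, 0, 2], [0, 3, 0], [4, 0, 1]]

A = block_diag([A1, A2, A3])
B = block_diag([B1, B2, B3])

expected = block_diag([matmul(A1, B1), matmul(A2, B2), matmul(A3, B3)])
product_is_block_diagonal = matmul(A, B) == expected
print(product_is_block_diagonal)
```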

10. Solution: Let $v_1, \dots, v_n$ denote a basis of $V$ consisting of generalized eigenvectors of $T$ (which exists by 8.23). Define $\langle \cdot, \cdot \rangle: V \times V \to \mathbb{C}$ by
$$\langle a_1 v_1 + \dots + a_n v_n, b_1 v_1 + \dots + b_n v_n \rangle = a_1\overline{b_1} + \dots + a_n\overline{b_n},$$ where the $a$’s and $b$’s are complex numbers. You can check that $\langle \cdot, \cdot \rangle$ is a well defined inner product on $V$. Thus $v_1, \dots, v_n$ is an orthonormal basis of $V$. Moreover, the generalized eigenspaces of $T$ are orthogonal to each other. This implies that, if $v \in G(\beta, T)$, then
$$P_{G(\alpha, T)} v = \begin{cases} v, \text{ if } \alpha = \beta\\ 0, \text{ if } \alpha \neq \beta \end{cases} \tag{*}$$ where $P_{G(\alpha, T)}$ is the orthogonal projection of $V$ onto $G(\alpha, T)$.

Let $\lambda_1, \dots, \lambda_m$ denote the distinct eigenvalues of $T$. We have
$$T = T|_{G(\lambda_1, T)}P_{G(\lambda_1, T)} + \dots + T|_{G(\lambda_m, T)}P_{G(\lambda_m, T)}.$$ For each $j = 1, \dots, m$, we can write $T|_{G(\lambda_j, T)} = \lambda_j I + N_j$ where $N_j$ is a nilpotent operator under which $G(\lambda_j, T)$ is invariant (see 8.21 (c)). Therefore
$$\begin{aligned} T &= (\lambda_1 I + N_1)P_{G(\lambda_1, T)} + \dots + (\lambda_m I + N_m)P_{G(\lambda_m, T)}\\ &= \underbrace{\lambda_1 P_{G(\lambda_1, T)} + \dots + \lambda_m P_{G(\lambda_m, T)}}_{\text{(4)}} + \underbrace{N_1P_{G(\lambda_1, T)} + \dots + N_mP_{G(\lambda_m, T)}}_{\text{(5)}}. \end{aligned}$$ Fix $k \in \{1, \dots, n\}$. Then $v_k \in G(\lambda_j, T)$ for some $j \in \{1, \dots, m\}$. $(*)$ shows that $(4)$ maps $v_k$ to $\lambda_j v_k$. Hence $v_1, \dots, v_n$ is a basis of eigenvectors of $(4)$ and so $(4)$ is diagonalizable. $(*)$ also shows that $(5)$ maps $v_k$ to $N_j v_k$. But $G(\lambda_j, T)$ is invariant under $N_j$, so $(*)$ actually implies that $(5)$ raised to the power of $\operatorname{dim} V$ maps $v_k$ to $N_j^{\dim V}v_k$, which equals $0$. Therefore $(5)$ is nilpotent. It is easy to see that $(4)$ and $(5)$ commute (they map $v_k$ to $\lambda_j N_j v_k$, no matter the order), which completes the proof.

11. Solution: Suppose $T$ has an upper-triangular matrix with respect to the basis $v_1, \dots, v_n$. Suppose also that $\lambda$ appears on the $j$-th diagonal entry of $\mathcal{M}(T)$. Then
$$Tv_j = a_1 v_1 + \dots + a_{j-1} v_{j-1} + \lambda v_j$$ for some $a_1, \dots, a_{j-1} \in \mathbb{F}$, and so
$$(T - \lambda I)v_j = a_1 v_1 + \dots + a_{j-1} v_{j-1} \in \operatorname{span}(v_1, \dots, v_{j-1}).$$ We have
$$(T - \lambda I)v_{j-1} = c_1 v_1 + \dots + (c_{j-1} - \lambda) v_{j-1}$$ for some $c_1, \dots, c_{j-1} \in \mathbb{F}$. If $c_{j-1} - \lambda = 0$, then $(T - \lambda I)^2v_j \in \operatorname{span}(v_1, \dots, v_{j-2})$. If $c_{j-1} - \lambda \neq 0$, then $$(T - \lambda I)\left(v_j - \frac{a_{j-1}}{c_{j-1} - \lambda}v_{j-1}\right) \in \operatorname{span}(v_1, \dots, v_{j-2}).$$ We go on, at each step either applying $(T - \lambda I)$ once more or subtracting a vector in $\operatorname{span}(v_1, \dots, v_{j-1})$ from $v_j$ in the argument of $(T - \lambda I)$. In this way some power of $(T - \lambda I)$ applied to $v_j - u$ lies in the span of the first $j - 3$ vectors of the basis, then in the span of the first $j - 4$, and so on until it lies in the span of the empty list, that is, $\{0\}$. This means that $v_j - u \in G(\lambda, T)$ for some $u \in \operatorname{span}(v_1, \dots, v_{j-1})$.

Let $\nu_1, \dots, \nu_d$ denote the vectors of the chosen basis of $V$ that correspond to the columns of $\mathcal{M}(T)$ in which $\lambda$ appears on the diagonal, in the order that they appear in the basis. We can repeat the previous process and find $u_1, \dots, u_d$ such that
$$\nu_1 - u_1, \dots, \nu_d - u_d \in G(\lambda, T), \tag{6}$$ where each $u_k$ is in the span of the basis vectors that come before $\nu_k$. We claim this list is linearly independent. To see this, fix $k \in \{1, \dots, d\}$. Suppose $\nu_k$ is the $j$-th basis vector, i.e. $\nu_k = v_j$. This means that
$$\operatorname{span}(\nu_1 - u_1, \dots, \nu_{k-1} - u_{k-1}) \subset \operatorname{span}(v_1, \dots, v_{j-1}).$$ Therefore, we can’t have $\nu_k - u_k \in \operatorname{span}(\nu_1 - u_1, \dots, \nu_{k-1} - u_{k-1})$, because that would imply that
$$v_j \in \operatorname{span}(v_1, \dots, v_{j-1}),$$ since $u_k \in \operatorname{span}(v_1, \dots, v_{j-1})$. This argument can be repeated for each $k$. Therefore no vector in $(6)$ is in the span of the previous ones. It follows that the list in $(6)$ is linearly independent.

Hence $\operatorname{dim} G(\lambda, T) \ge d$. By 8.26, the dimension of $G(\lambda, T)$ cannot be greater than $d$, because we have $\operatorname{dim} V$ diagonal entries and each one adds at least $1$ to the dimension of some generalized eigenspace. Thus $\operatorname{dim} G(\lambda, T) = d$, completing the proof.
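The statement can be tested on a concrete matrix. The following sketch is ours: the upper-triangular matrix `T` is a made-up example, and `rank` and `multiplicity` are hypothetical helpers. It computes $\operatorname{dim} \operatorname{null} (T - \lambda I)^{n}$ with exact rational Gaussian elimination and compares it with the number of times $\lambda$ appears on the diagonal.

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rank(M):
    """Rank of M, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r

def multiplicity(T, lam):
    """dim null (T - lam I)^n, i.e. the dimension of G(lam, T)."""
    n = len(T)
    shifted = [[T[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(n):
        result = matmul(result, shifted)
    return n - rank(result)

# Made-up upper-triangular example with diagonal 2, 5, 2, 2.
T = [[2, 1, 3, 0],
     [0, 5, 1, 2],
     [0, 0, 2, 1],
     [0, 0, 0, 2]]

diagonal = [T[i][i] for i in range(len(T))]
for lam in sorted(set(diagonal)):
    print(lam, diagonal.count(lam), multiplicity(T, lam))
```

Each printed row lists an eigenvalue, its count on the diagonal, and the computed dimension of its generalized eigenspace; the exercise says the last two numbers agree.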

### This Post Has 18 Comments

1. A sketch of an alternative proof of 11:

Lemma 1. Suppose 𝔽∈{ℝ,ℂ}, V is a finite dimensional vector space over 𝔽, T is a linear operator on V, $b_1, \dots, b_n$ is a basis of V, the matrix of T w.r.t. $b_1, \dots, b_n$ is upper triangular, its m-th diagonal element is $\lambda_1$, its (m+1)-st diagonal element is $\lambda_2 \neq \lambda_1$. Then there exists a vector c such that $b_1, \dots, b_{m-1}, c, b_m, b_{m+2}, \dots, b_n$ is a basis of V (we replaced $b_{m+1}$ with $c$ and also changed order of 2 vectors), the matrix of T w.r.t. this new basis is upper triangular, and its m-th and (m+1)-st diagonal elements are swapped.

In other words, lemma 1 says that given an operator with an upper triangular matrix, if in some place there are two different diagonal elements, we can swap them.

So, if we have T with some matrix which contains k occurrences of some $\lambda$ on the diagonal, applying lemma 1 to T many times, we can find a basis of V so that in the matrix of T w.r.t. this new basis, on the diagonal, all occurrences of $\lambda$ go before the occurrences of all other eigenvalues, and this matrix is upper triangular. Therefore, $\operatorname{dim} G(\lambda, T) \geq k$.

Doing this for each distinct eigenvalue, we get that the dimension of each generalized eigenspace is at least its count on the main diagonal. Because the sum of their dimensions can't exceed n, their dimensions must be exactly their counts on the diagonal. QED

2. Am I correct thinking that the theorem 8.31 holds not only for ℂ, but for all fields?

1. Actually, nevermind - it doesn't work in ℤ/(2), because 2=1+1=0 has no multiplicative inverse in that field.

3. For equation (3) in your solution to question 3, why is there a (non-strict) inequality there. Why can't we write this as an equality (which seems to me to follow from 8.21 a) because we already showed that there is a bijective correspondence between the eigenvalues of $T$ and $S^-1TS$?

1. Edit: It should be $S^{-1}TS$ at the bottom of the comment.

4. For #6, I think you meant to raise the right hand side of the first line to power of 2.

1. Agreed, there is a mistake there. Although I think he meant to write $\sqrt{I+N}$ on the left-hand side of the equality, rather than just $I+N$.

1. No, I think Linearity meant to square the right-hand side so it would equal the second line.

5. Q11, I feel there might be another approach, using Q3. Let $T'$ be the upper triangular matrix and $S$ the change of basis matrix such that $S^{-1}T'S$ has the block diagonal form with $\lambda_i$ having multiplicity $d_i$. Since these two transformations (I use matrices and transformations interchangeably, as Q3 can be extended to hold for matrices) are conjugate, $\dim \operatorname{null}(T'-\lambda_i)^n = d_i$.

Now let's take an eigenvalue $\lambda$. It must appear on the diagonal of $T'$. $(T'-\lambda)$ has zeroes on the diagonal at, say, $k$ places. We must now show $k = \dim \operatorname{null}(T'-\lambda)^n$, which is the multiplicity. To see that, observe that $(T'-\lambda)^n$ is also upper triangular and has the diagonal elements raised to the power $n$. So it has $0$ at $k$ places. Now we try to find the null space of this matrix. If we multiply it with $(x_1,\dots,x_n)$ then we get a system of linear equations. Since it is an upper triangular matrix, it is in row echelon form. We see leading entries of rows equal to $0$ at $k$ places. Hence $k = \dim \operatorname{null}(T'-\lambda)^n$.

Hope this makes sense. I found it much easier to come up with. Btw, your website is a life saver! Thank you!

6. 8B.8) null T^(n-2) = null T^(n)

is not true. 8.4) states that if n = dim V, then null T^n = null T^(n+1); it doesn't imply null T^(n-1) = null T^(n)

7. For 8B.2, I think you meant to write T(x, y, z) = (0, -z, y)

1. Yes, thanks.

8. For 10, I think if we consider the matrix version, then we can decompose it directly by theorem 8.29

1. Yes, for $\mathbb C$ it is easy. This is a special case of Jordan-Chevalley decomposition Theorem for algebraically closed field.

9. so it follows from 8B.8 :
If $T$ has at least 2 distinct eigenvalues, then $$V=\mathrm{range}~ T^{n-2}\oplus\mathrm{null}~ T^{n-2}.$$ Or replace the $2$ by any $k$, and the conclusion will also hold.

10. Hey, I would appreciate a lot a solution for 8B8 :)

1. Hint: prove that nullT^(n-2) = nullT^(n-1)

11. 8B1) according to 8.21a: $V=G(0,N)$
so $G(0,N)=\mathrm{null}(N-0I)^{\dim V}=\mathrm{null}~ N^{\dim V}=V$
according to 8.21c: $(N-0I)|_{G(0,N)} = N|_V = N$, so $N$ is nilpotent