Let $A$ be an $n\times n$ matrix. If there exists a positive integer $q$ such that \begin{equation}\label{eq:1}A^{q}=0,\end{equation} then we call $A$ a **nilpotent matrix**, meaning that one of its powers is the zero matrix. The smallest positive integer $q$ for which \eqref{eq:1} holds is called the **index** of $A$.

For instance, consider $$A=\begin{bmatrix} 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1\\ 0&0&0&0 \end{bmatrix},~ B=\begin{bmatrix} 5&-3&2\\ 15&-9&6\\ 10&-6&4\end{bmatrix}.$$ Both $A$ and $B$ are nilpotent: by direct computation, we have $A^3\ne 0$, $A^4=0$, $B\ne 0$ and $B^2=0$. Therefore, the indices of $A$ and $B$ are $4$ and $2$, respectively.
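The direct computation above can be reproduced with a few lines of plain Python. This is only a minimal sketch: the helper names `matmul`, `matpow` and `is_zero` are illustrative and not part of the original text, and matrices are stored as lists of integer rows so that the arithmetic is exact.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matpow(M, k):
    """Return M**k for an integer k >= 1 by repeated multiplication."""
    P = M
    for _ in range(k - 1):
        P = matmul(P, M)
    return P


def is_zero(X):
    """Return True when every entry of X is zero."""
    return all(entry == 0 for row in X for entry in row)


A = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0]]
B = [[5, -3, 2],
     [15, -9, 6],
     [10, -6, 4]]

print(is_zero(matpow(A, 3)))  # False: A^3 still has a 1 in the top-right corner
print(is_zero(matpow(A, 4)))  # True:  the index of A is 4
print(is_zero(B))             # False
print(is_zero(matpow(B, 2)))  # True:  the index of B is 2
```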

Let us discuss necessary and sufficient conditions that relate nilpotent matrices to their eigenvalues and traces.

**Theorem 1** If $A$ is a nilpotent matrix, then all its eigenvalues are zero. Conversely, if the eigenvalues of a square matrix $A$ are all zero, then $A$ is nilpotent.

In other words, if $A^q=0$ for some positive integer $q$, then all eigenvalues of $A$ are zero; equivalently, if $A$ has at least one nonzero eigenvalue, then $A^k\ne 0$ for all $k\in \mathbb Z_{\geqslant 0}$.

Let us prove this statement.

Suppose $A^q=0$ and $\lambda$ is an eigenvalue of $A$ with corresponding eigenvector $\mathbf{x}$. Then we have $A\mathbf{x}=\lambda \mathbf{x}$ and hence $$A^q\mathbf{x}=A^{q-1}A\mathbf{x}=\lambda A^{q-1}\mathbf{x}=\cdots=\lambda^q\mathbf{x}.$$

Since $A^q=0$, we have $\lambda^{q}\mathbf{x}=\mathbf{0}$, so $\lambda^{q}=0$ or $\mathbf{x}=\mathbf{0}$. But an eigenvector $\mathbf{x}$ cannot be zero, thus $\lambda=0$. Since every eigenvalue is zero, the algebraic multiplicity of the eigenvalue zero is $n$.

Conversely, if the eigenvalues of an $n\times n$ matrix $A$ are all zero, then the characteristic polynomial of $A$ is $$P_A(t)=t^n.$$ It follows from the Cayley-Hamilton theorem that $A^n=0$, which shows that $A$ is nilpotent. From the proof, we also conclude that the index $q$ is at most $n$, namely $q\leqslant n$.
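The bound $q\leqslant n$ gives a finite test for nilpotency: it suffices to examine the powers $A,A^2,\ldots,A^n$. The sketch below illustrates this under the assumption of integer (exact) entries; with floating-point entries a tolerance would be needed instead of an exact comparison with zero. The function name `nilpotency_index` is hypothetical.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_zero(X):
    """Return True when every entry of X is zero."""
    return all(entry == 0 for row in X for entry in row)


def nilpotency_index(M):
    """Return the smallest q with M**q == 0, or None if M is not nilpotent.

    Only the powers M, M^2, ..., M^n are inspected, because the index of an
    n x n nilpotent matrix is at most n.
    """
    n = len(M)
    P = M
    for q in range(1, n + 1):
        if is_zero(P):
            return q
        P = matmul(P, M)
    return None


print(nilpotency_index([[2, -1], [4, -2]]))  # 2
print(nilpotency_index([[0, 0], [0, 0]]))    # 1
print(nilpotency_index([[1, 1], [0, 1]]))    # None (eigenvalue 1 is nonzero)
```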

Another proof uses the Schur theorem: there exists an upper triangular matrix $T$ similar to $A$, $T=U^{-1}AU$, where $U$ is a unitary matrix, $U^\ast=U^{-1}$. The diagonal elements of $T$ are the eigenvalues of $A$, so they are all zero. For example, a $4\times 4$ upper triangular matrix $T$ with zero diagonal has the following form:

$$T=\begin{bmatrix} 0&\ast&\ast&\ast\\ 0&0&\ast&\ast\\ 0&0&0&\ast\\ 0&0&0&0\end{bmatrix},$$

where each $\ast$ may be any complex number. Computing the powers of $T$, we have

$$T^2=\begin{bmatrix} 0&0&\ast&\ast\\ 0&0&0&\ast\\ 0&0&0&0\\ 0&0&0&0\end{bmatrix},$$ $$T^3=\begin{bmatrix} 0&0&0&\ast\\ 0&0&0&0\\ 0&0&0&0\\ 0&0&0&0\end{bmatrix},$$ $$T^4=\begin{bmatrix} 0&0&0&0\\ 0&0&0&0\\ 0&0&0&0\\ 0&0&0&0\end{bmatrix}.$$

For an arbitrary $n\times n$ upper triangular matrix $T$ with zero diagonal, each multiplication by $T$ pushes the nonzero entries at least one diagonal further from the main diagonal, so there exists a smallest positive integer $q\leqslant n$ such that $T^q=0$. Hence $T$ is nilpotent. Using the similarity relation $A=UTU^{-1}$, the power $A^q$ can be written as $$A^q=UT^qU^{-1},$$ therefore $A^q=0$, namely $A$ is nilpotent.
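The way the zero band widens with each power can be observed numerically. The following sketch picks one arbitrary $4\times 4$ upper triangular matrix with zero diagonal (the entries are made up for illustration) and reports, for each power, the smallest offset $j-i$ at which a nonzero entry occurs; the helper `first_nonzero_offset` is a hypothetical name.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def first_nonzero_offset(X):
    """Smallest j - i over the nonzero entries X[i][j]; n if X is zero."""
    n = len(X)
    offsets = [j - i for i in range(n) for j in range(n) if X[i][j] != 0]
    return min(offsets) if offsets else n


# An arbitrary upper triangular matrix with zero diagonal (illustrative).
T = [[0, 2, -1, 3],
     [0, 0, 4, 5],
     [0, 0, 0, -2],
     [0, 0, 0, 0]]

offsets = []
P = T
for k in range(1, 5):
    offsets.append(first_nonzero_offset(P))  # offset for T^k
    P = matmul(P, T)

print(offsets)  # [1, 2, 3, 4]: T^k has no nonzero entry below the k-th superdiagonal
```

For this particular $T$ the offset equals $k$ exactly; in general it is at least $k$, which is all the argument needs.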

An immediate corollary of Theorem 1 is that a nilpotent matrix is not invertible. This can also be proved using the determinant instead of eigenvalues: let $q$ be the index of a nilpotent matrix $A$; then by the multiplicative property of the determinant, we have

$$\det(A^{q})=\det(\underbrace{A\cdots A}_q)=\underbrace{(\det A)\cdots(\det A)}_q=(\det A)^{q}.$$

But $$\det(A^{q})=\det 0=0,$$ thus $\det A=0$, which implies that $\hbox{rank}A<n$.

Consider a special case. If a real $n\times n$ matrix $A$ satisfies $A^2=0$, then for any $\mathbf{x}\in\mathbb{R}^n$ we have $$A(A\mathbf{x})=\mathbf{0}.$$ This implies that the column space of $A$, $$C(A)=\{A\mathbf{x}|\mathbf{x}\in\mathbb{R}^n\},$$ is a subspace of the null space of $A$, $$N(A)=\{\mathbf{x}\in\mathbb{R}^n|A\mathbf{x}=\mathbf{0}\},$$ namely $C(A)\subseteq N(A)$. Therefore \begin{equation}\label{eq:2}\hbox{rank}A=\dim C(A)\leqslant \dim N(A).\end{equation} By the rank-nullity theorem, we have \begin{equation}\label{eq:3}\dim N(A)=n-\hbox{rank}A.\end{equation} Combining \eqref{eq:2} and \eqref{eq:3}, we obtain $$\hbox{rank}A\leqslant\frac{n}{2}.$$

Here is an example: the matrix $$A=\begin{bmatrix} 0&0&1\\ 0&0&0\\ 0&0&0 \end{bmatrix}$$ satisfies $A^2=0$, and its column space and null space are $C(A)=\hbox{span}\{(1,0,0)^T\}$ and $N(A)=\hbox{span}\{(1,0,0)^T,(0,1,0)^T\}$, respectively, so that $C(A)\subseteq N(A)$ and $\hbox{rank}A=1\leqslant 3/2$.
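The rank bound for matrices with $A^2=0$ can be checked on small examples. The sketch below computes the rank by Gaussian elimination over the rationals using Python's `fractions.Fraction`; the function name `rank` and the second test matrix `A4` are illustrative assumptions, not taken from the original text.

```python
from fractions import Fraction


def rank(M):
    """Rank of M via Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in M]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r


# The 3 x 3 example from the text: rank 1, nullity 2.
A3 = [[0, 0, 1],
      [0, 0, 0],
      [0, 0, 0]]

# A hypothetical 4 x 4 matrix with A4^2 = 0 that attains the bound n/2.
A4 = [[0, 0, 1, 0],
      [0, 0, 0, 1],
      [0, 0, 0, 0],
      [0, 0, 0, 0]]

for M in (A3, A4):
    n = len(M)
    r = rank(M)
    print(n, r, n - r, 2 * r <= n)
# 3 1 2 True
# 4 2 2 True
```

The second example shows that the inequality $\hbox{rank}A\leqslant n/2$ cannot be improved in general.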

We look at another property of a nilpotent matrix $A$: $I-A$ is invertible. We can also find an explicit expression for the inverse matrix. Note that $$I-A^q=(I-A)(I+A+A^2+\cdots+A^{q-1}).$$ Since $A^q=0$, we have $$I=(I-A)(I+A+A^2+\cdots+A^{q-1}).$$ Therefore, the inverse matrix of $I-A$ is $$(I-A)^{-1}=I+A+A^2+\cdots+A^{q-1}.$$ Moreover, because by Theorem 1 all eigenvalues of $I-A$ are one and the determinant is the product of all eigenvalues, we have $$\det(I-A)=\det((I-A)^{-1})=1.$$
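The finite-sum formula for $(I-A)^{-1}$ translates directly into code. The following sketch, with the hypothetical helper `inverse_of_identity_minus`, builds $I+A+\cdots+A^{q-1}$ for the $4\times 4$ shift matrix from the first example and checks that multiplying by $I-A$ gives the identity. It assumes the caller supplies a $q$ with $A^q=0$.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    """Return the n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def inverse_of_identity_minus(A, q):
    """Return I + A + ... + A^(q-1), the inverse of I - A when A^q = 0."""
    n = len(A)
    total = identity(n)
    power = identity(n)
    for _ in range(q - 1):
        power = matmul(power, A)  # next power of A
        total = [[total[i][j] + power[i][j] for j in range(n)]
                 for i in range(n)]
    return total


A = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0]]

S = inverse_of_identity_minus(A, 4)
I_minus_A = [[identity(4)[i][j] - A[i][j] for j in range(4)] for i in range(4)]

print(S)
# [[1, 1, 1, 1], [0, 1, 1, 1], [0, 0, 1, 1], [0, 0, 0, 1]]
print(matmul(I_minus_A, S) == identity(4))  # True
```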

Besides the condition that all eigenvalues are zero, there is another necessary and sufficient condition for a square matrix to be nilpotent, expressed in terms of the trace.

**Theorem 2** An $n\times n$ matrix $A$ satisfies $A^n=0$ if and only if $$\hbox{trace}(A^k)=0,$$ for $k=1,\ldots,n$.

Let $\lambda_1,\ldots,\lambda_n$ be the eigenvalues of $A$. If $A^n=0$, it follows from Theorem 1 that $$\lambda_1=\cdots=\lambda_n=0.$$ Therefore, the eigenvalues $\lambda_i^k$ of $A^k$ are also zero, and we conclude that $$\hbox{trace}(A^k)=\sum_{i=1}^n\lambda_i^k=0,\quad k\ge 1.$$

Conversely, suppose $$\hbox{trace}(A^k)=\sum_{i=1}^n\lambda_i^k=0,$$ for $1\le k\le n$. This can be written in matrix form as $$\begin{bmatrix} 1&1&\cdots&1\\ \lambda_1&\lambda_2&\cdots&\lambda_n\\ \vdots&\vdots&\ddots&\vdots\\ \lambda_1^{n-1}&\lambda_2^{n-1}&\cdots&\lambda_n^{n-1} \end{bmatrix} \begin{bmatrix} \lambda_1\\ \lambda_2\\ \vdots\\ \lambda_n \end{bmatrix}=\begin{bmatrix} 0\\ 0\\ \vdots\\ 0 \end{bmatrix},$$ where the coefficient matrix is a Vandermonde matrix (see Special Matrix (8) Vandermonde matrix).

If all $\lambda_i$ are distinct, then this Vandermonde matrix is invertible and hence the equation has only the trivial solution $$\lambda_1=\cdots=\lambda_n=0,$$ which for $n\geqslant 2$ contradicts the assumption that all $\lambda_i$ are distinct (for $n=1$ the proof is already complete). Hence we must have $\lambda_i=\lambda_j$ for some $i\ne j$, namely $A$ has a repeated eigenvalue. Without loss of generality, we assume that $\lambda_1=\lambda_2$.

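If all $\lambda_2,\dots,\lambda_n$ are distinct, then the first $n-1$ trace equations form the system $$\begin{bmatrix} 1&1&\cdots&1\\ \lambda_2&\lambda_3&\cdots&\lambda_n\\ \vdots&\vdots&\ddots&\vdots\\ \lambda_2^{n-2}&\lambda_3^{n-2}&\cdots&\lambda_n^{n-2} \end{bmatrix} \begin{bmatrix} 2\lambda_2\\ \lambda_3\\ \vdots\\ \lambda_n \end{bmatrix}=\begin{bmatrix} 0\\ 0\\ \vdots\\ 0 \end{bmatrix},$$ whose coefficient matrix is again an invertible Vandermonde matrix, so it has only the trivial solution $2\lambda_2=\lambda_3=\cdots=\lambda_n=0$. For $n\geqslant 3$ this contradicts distinctness, so we conclude as before that two of the numbers $\lambda_2,\lambda_3,\ldots,\lambda_n$ are equal. Repeating this procedure, each time merging equal eigenvalues and recording their multiplicities as coefficients, we finally conclude that $$\lambda_1=\cdots=\lambda_n=0.$$ By Theorem 1 and its proof, $A^n=0$, and we are done.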
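Theorem 2 can be illustrated by listing $\hbox{trace}(A^k)$ for $k=1,\ldots,n$. In the sketch below the helper `power_traces` is a hypothetical name, and the cyclic permutation matrix `C` is an added example (not from the original text) chosen to show that all $n$ traces are needed: its first two power traces vanish but the third does not.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(X):
    """Sum of the diagonal entries of X."""
    return sum(X[i][i] for i in range(len(X)))


def power_traces(M):
    """Return [trace(M), trace(M^2), ..., trace(M^n)] for an n x n matrix M."""
    n = len(M)
    traces = []
    P = M
    for _ in range(n):
        traces.append(trace(P))
        P = matmul(P, M)
    return traces


# The nilpotent matrix B from the first example.
B = [[5, -3, 2],
     [15, -9, 6],
     [10, -6, 4]]

# A cyclic permutation matrix: C^3 = I, so C is not nilpotent.
C = [[0, 1, 0],
     [0, 0, 1],
     [1, 0, 0]]

print(power_traces(B))  # [0, 0, 0]
print(power_traces(C))  # [0, 0, 3]
```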

In general, the sum and the product of two nilpotent matrices are not necessarily nilpotent. But if the two nilpotent matrices commute, then their sum and product are nilpotent as well.

**Theorem 3** If $A$ and $B$ are $n\times n$ nilpotent matrices and $AB=BA$, then $AB$ and $A+B$ are also nilpotent.

Because $A$ and $B$ are nilpotent, there must exist positive integers $p$ and $q$ such that $$A^p=B^q=0.$$

Let $m=\max\{p,q\}$, then $A^m=B^m=0$. Since $AB=BA$, we have $$(AB)^m = (ABAB)(AB)^{m-2}=A^2B^2(AB)^{m-2}=\cdots=A^mB^m = 0.$$Hence $AB$ is nilpotent.

Since $A$ and $B$ commute, the binomial theorem applies: $$(A + B)^{2m}=\sum_{k=0}^{2m}\binom{2m}{k}A^kB^{2m-k}.$$ For $0\leqslant k\leqslant 2m$, we always have $$\max\{k,2m-k\}\geqslant m$$ and hence $A^k=0$ or $B^{2m-k}=0$. Therefore, $(A + B)^{2m}= 0$. Thus $A+B$ is nilpotent.
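Both halves of this discussion can be checked on small matrices. The sketch below uses made-up examples: a non-commuting nilpotent pair `E`, `F` whose sum and product fail to be nilpotent, and a commuting pair `N`, `N2` (the $4\times 4$ shift matrix and its square) for which Theorem 3 applies. The helper `is_nilpotent` is a hypothetical name and relies on the bound that the index is at most $n$.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matadd(X, Y):
    """Add two square matrices entry by entry."""
    n = len(X)
    return [[X[i][j] + Y[i][j] for j in range(n)] for i in range(n)]


def is_nilpotent(M):
    """Check M^n == 0, which is equivalent to nilpotency for an n x n matrix."""
    P = M
    for _ in range(len(M) - 1):
        P = matmul(P, M)
    return all(entry == 0 for row in P for entry in row)


# Non-commuting nilpotent pair: the sum and the product are not nilpotent.
E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]
print(matmul(E, F) == matmul(F, E))  # False
print(is_nilpotent(matadd(E, F)))    # False: (E + F)^2 is the identity
print(is_nilpotent(matmul(E, F)))    # False: EF is a nonzero projection

# Commuting nilpotent pair: the shift matrix N and its square.
N = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0]]
N2 = matmul(N, N)
print(matmul(N, N2) == matmul(N2, N))  # True
print(is_nilpotent(matadd(N, N2)))     # True
print(is_nilpotent(matmul(N, N2)))     # True
```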

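A nonzero nilpotent matrix $A$ cannot be diagonalizable, since $\hbox{rank}A>0$ and hence $$\dim N(A)=n-\hbox{rank}A<n.$$ This means the eigenspace corresponding to the eigenvalue zero, the only eigenvalue of $A$, has dimension less than $n$, so $A$ does not have $n$ linearly independent eigenvectors.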

Understanding nilpotent matrices is very helpful for understanding the Jordan canonical form; we shall talk more about this.

Translated from: https://ccjou.wordpress.com/