Diagonalization
The payoff of eigenvectors
The Eigenvector Basis
In the previous chapter, we met eigenvectors: directions that a transformation scales without rotating. Now we discover why they matter so much.
If a matrix has enough independent eigenvectors, we can use them as a new basis. In this basis, the matrix becomes diagonal: all off-diagonal entries become zero.
Interactive: Compare the matrix in standard vs eigenvector basis
In the standard basis, A has off-diagonal entries - it mixes the coordinates.
A diagonal matrix is the simplest possible transformation: it scales each coordinate axis independently, with no mixing between them. The diagonal entries are the eigenvalues.
The Diagonalization Formula
If A has n independent eigenvectors, we can write:

A = XΛX⁻¹

where X is the matrix whose columns are the eigenvectors, and Λ is the diagonal matrix of eigenvalues.
This formula says: to apply A, first change to the eigenvector basis (X⁻¹), then scale along the eigenvector directions (Λ), then change back to the standard basis (X).
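This factorization is easy to check numerically. A minimal sketch with NumPy (the matrix A below is an illustrative example, not one from the text):

```python
import numpy as np

# Illustrative matrix with distinct eigenvalues (5 and 2), so it is diagonalizable
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigenvalues, X = np.linalg.eig(A)   # columns of X are the eigenvectors
Lambda = np.diag(eigenvalues)       # diagonal matrix of eigenvalues

# Verify A = X Λ X⁻¹
A_rebuilt = X @ Lambda @ np.linalg.inv(X)
print(np.allclose(A, A_rebuilt))    # True
```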
Powers of Diagonal Matrices
The real power of diagonalization emerges when computing matrix powers. For a diagonal matrix:

Λⁿ = diag(λ₁ⁿ, λ₂ⁿ, …, λₖⁿ)

Just raise each diagonal entry to the power n. No matrix multiplication required.
Interactive: Powers of a diagonal matrix
Powers of a diagonal matrix are trivial: just raise each diagonal entry to that power. No matrix multiplication needed.
For the original matrix A, this gives us:

Aⁿ = XΛⁿX⁻¹

(in the product XΛX⁻¹·XΛX⁻¹·… , every inner X⁻¹X pair cancels, leaving only Λⁿ in the middle). One change of basis, one trivial diagonal power, one change back. Computing Aⁿ becomes as easy as computing Λⁿ.
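The whole pipeline fits in a few lines, assuming the matrix is diagonalizable (the function name and example matrix here are ours, for illustration):

```python
import numpy as np

def power_via_diagonalization(A, n):
    """Compute A^n as X Λ^n X⁻¹; assumes A is diagonalizable."""
    eigenvalues, X = np.linalg.eig(A)
    Lambda_n = np.diag(eigenvalues ** n)   # the trivial diagonal power
    return X @ Lambda_n @ np.linalg.inv(X)

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Agrees with direct repeated multiplication
print(np.allclose(power_via_diagonalization(A, 5),
                  np.linalg.matrix_power(A, 5)))  # True
```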
Convergence Under Repeated Transformation
When you apply a transformation repeatedly, eigenvectors reveal the long-term behavior. If all eigenvalues have magnitude less than 1, vectors shrink toward the origin. If one eigenvalue is larger than the others, vectors align with that eigenvector's direction.
Interactive: Watch vectors converge under repeated transformation
As n increases, all vectors converge toward the direction of the dominant eigenvector. That direction is stable under repeated transformation.
The dominant eigenvector (the one whose eigenvalue has the largest magnitude) determines where things end up. This is why eigenvectors matter: they reveal the stable directions of a dynamical system.
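This alignment is simple to verify numerically; the loop below is essentially the power-iteration method (the 2×2 matrix is an illustrative example, not one from the text):

```python
import numpy as np

# Repeatedly applying A and normalizing aligns any starting vector
# with the dominant eigenvector (power iteration)
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

v = np.array([1.0, 0.0])
for _ in range(50):
    v = A @ v
    v = v / np.linalg.norm(v)   # normalize to avoid overflow

# Compare with the eigenvector for the largest-magnitude eigenvalue
eigenvalues, X = np.linalg.eig(A)
dominant = X[:, np.argmax(np.abs(eigenvalues))]
dominant = dominant / np.linalg.norm(dominant)

# Same direction (up to sign)
print(np.allclose(np.abs(v), np.abs(dominant)))  # True
```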
Application: Fibonacci Numbers
The Fibonacci sequence satisfies F(n+1) = F(n) + F(n-1). This recurrence can be encoded as matrix multiplication:

[ F(n+1) ]   [ 1  1 ] [ F(n)   ]
[ F(n)   ] = [ 1  0 ] [ F(n-1) ]

Therefore F(n) can be computed via matrix exponentiation. Using diagonalization, this takes O(log n) time instead of O(n).
Interactive: Fibonacci via matrix powers
The Fibonacci matrix encodes the recurrence F(n+1) = F(n) + F(n-1). Computing F(n) via diagonalization takes O(log n) time instead of O(n).
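The O(log n) cost comes from exponentiation by repeated squaring. A sketch in plain Python, using exact integers (the helper names are ours, not from the text):

```python
def mat_mult(A, B):
    """Product of two 2x2 integer matrices."""
    return [[A[0][0]*B[0][0] + A[0][1]*B[1][0], A[0][0]*B[0][1] + A[0][1]*B[1][1]],
            [A[1][0]*B[0][0] + A[1][1]*B[1][0], A[1][0]*B[0][1] + A[1][1]*B[1][1]]]

def fib(n):
    """F(n) from the n-th power of the Fibonacci matrix, in O(log n) multiplies."""
    result = [[1, 0], [0, 1]]   # identity
    base = [[1, 1], [1, 0]]     # encodes F(n+1) = F(n) + F(n-1)
    while n > 0:
        if n & 1:               # square-and-multiply on the binary digits of n
            result = mat_mult(result, base)
        base = mat_mult(base, base)
        n >>= 1
    return result[0][1]         # M^n = [[F(n+1), F(n)], [F(n), F(n-1)]]

print([fib(n) for n in range(1, 11)])  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```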
Application: Markov Chains
A Markov chain describes a system that transitions between states with fixed probabilities. The transition matrix P has columns that sum to 1.
The key question: where does the system end up after many transitions? The answer involves the eigenvector of P with eigenvalue 1 - the steady state distribution.
Interactive: Markov chain converging to steady state
Starting from 100% in State A, the distribution converges to the steady state (57.1%, 42.9%). This steady state is the eigenvector of P with eigenvalue 1.
As n → ∞, the state distribution converges to the eigenvector for eigenvalue 1 (normalized to sum to 1). Diagonalization explains exactly how and why this convergence happens.
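A sketch of this convergence. The transition matrix below is hypothetical (the text does not give one); it was chosen so its steady state matches the (57.1%, 42.9%) shown in the demo:

```python
import numpy as np

# Hypothetical 2-state transition matrix (columns sum to 1),
# chosen so the steady state is (4/7, 3/7) ≈ (57.1%, 42.9%)
P = np.array([[0.7, 0.4],
              [0.3, 0.6]])

state = np.array([1.0, 0.0])   # start with 100% probability in State A
for _ in range(100):
    state = P @ state          # one transition step

print(np.round(state, 3))      # [0.571 0.429]

# The limit is the eigenvector of P with eigenvalue 1, normalized to sum to 1
eigenvalues, X = np.linalg.eig(P)
steady = X[:, np.argmin(np.abs(eigenvalues - 1.0))]
steady = steady / steady.sum()
print(np.allclose(state, steady))  # True
```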
When Can We Diagonalize?
Not every matrix is diagonalizable. The requirement: an n×n matrix needs n independent eigenvectors.
- Matrices with all distinct eigenvalues are always diagonalizable
- Symmetric matrices are always diagonalizable (with orthogonal eigenvectors)
- Some matrices with repeated eigenvalues may not have enough eigenvectors (these are called defective)
When diagonalization fails, more advanced techniques like Jordan normal form or the singular value decomposition fill the gap.
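A concrete defective example is the shear matrix, a standard illustration (not one from the text): its eigenvalue 1 is repeated, but it has only one independent eigenvector.

```python
import numpy as np

# Shear: eigenvalue 1 with multiplicity 2, but only one eigenvector direction
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])

eigenvalues, X = np.linalg.eig(A)
print(eigenvalues)               # both eigenvalues equal 1

# The two returned eigenvector columns are (numerically) parallel, so there
# is no basis of eigenvectors and A is defective: not diagonalizable
cosine = abs(X[:, 0] @ X[:, 1])  # columns of X are unit vectors
print(np.isclose(cosine, 1.0))   # True
```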
Key Takeaways
- If an n×n matrix has n independent eigenvectors, it can be written as A = XΛX⁻¹
- In the eigenvector basis, the transformation becomes diagonal - pure scaling along each axis
- Powers of diagonal matrices are trivial: Λⁿ just raises each diagonal entry to the nth power
- Matrix powers via diagonalization: Aⁿ = XΛⁿX⁻¹
- The dominant eigenvector determines long-term behavior under repeated transformation
- Applications include Fibonacci numbers, Markov chains, differential equations, and any system involving repeated linear operations