logo

Proof of the Existence of Smith Normal Form 📂Topological Data Analysis

Proof of the Existence of Smith Normal Form

Algorithm

$R$ is a principal ideal domain, for every matrix $A \in R^{m \times n}$, there exists a unique Smith normal form. In other words, for the matrix $A \in R^{m \times n}$, there exists $d_{1} , \cdots , d_{r} \in R$ and invertible matrices $P \in R^{m \times m}$, $Q \in R^{n \times n}$ that satisfy $$ PAQ = \begin{bmatrix} d_{1} & 0 & 0 & 0 & \cdots & 0 \\ 0 & \ddots & 0 & 0 & \cdots & 0 \\ 0 & 0 & d_{r} & 0 & \cdots & 0 \\ 0 & 0 & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & 0 \end{bmatrix} \in R^{m \times n} $$ where $d_{1} , \cdots , d_{r}$ should not be $0$ in $R$, and must be $d_{1} \mid \cdots \mid d_{r}$. More specifically, $d_{k} \ne 0$ must be a divisor of $d_{k+1}$. The uniqueness ignores the unit multiples of $d_{k}$.

Proof 1 2

Strategy: An algorithm that uses row and column exchanges and Gaussian elimination to directly find the Smith normal form and prove its existence.

Every principal ideal domain is a unique factorization domain.

Matrix $A$, being a matrix over a PID, is also a matrix over a UFD. Therefore, all elements except for the units and $0$, represented by $A_{ij}$, have a unique factorization, and the one with the least number of factors is called the Minimal Entry, denoted by $\alpha > 0$.

It should be noted that in Munkres’ textbook, the smallest element is chosen as $\mathbb{Z}^{m \times n}$, but in general principal ideal domains, which are not necessarily Euclidean domains, it can be tricky to consider something as ‘smallest’. In the general case of a PID, it makes sense to think about factorization to determine the minimal entry neatly. However, for simplicity, this post might use terms like ‘small’ or $a < b$, which should be understood in terms of comparing the number of factors rather than the size of the elements.

Keep in mind that the given matrix will continually change due to Gaussian elimination, so $\alpha$ also continually changes in context.

It’s also worth noting that this algorithm specifically demonstrates that the Smith normal form can always be derived for the given $A$, although it’s not necessarily efficient and $P, Q$ is unknown.


Step 1.

If $A$ is a zero matrix, it is already in Smith normal form, so proceed to Step 4. If $A$ is not a zero matrix, then do the following.

For ease of understanding, first swap rows and columns to bring $\alpha$ to the top-left corner as $(1,1)$. Now, we will only focus near the first row and column of the given matrix. $$ A \sim \begin{bmatrix} \alpha & A_{12} & \cdots & \\ A_{21} & A_{22} & \cdots & \\ \vdots & \vdots & \ddots & \end{bmatrix} $$


Step 2. $k= 1, \cdots , r$

(Obviously, $r$ is unknown during the calculation. It naturally gets determined after the program ends.)

  • For all $i, j$, if $\alpha \mid A_{ij}$ then substitute $d_{k} \gets \alpha$. As $\alpha \mid A_{ij}$, it’s possible to repeat Gaussian elimination until all elements in the first row and column of $A$, except for $\alpha$, become $0$. After the repetition, delete the $1$ row and $1$ column and return to Step 1. Now, the size of the matrix $A$ becomes $(m - k) \times (n - k)$.
  • If for some $i, j$, $\alpha \nmid A_{ij}$, then move to Step 3.

Step 3.

From here on, the discussion can apply to either rows or columns, so without loss of generality, we’ll only discuss $\alpha \ne 0$ and $A_{12}$. Essentially, Step 2 guarantees that for at least one of $i, j$, $\alpha \nmid A_{ij}$ is true, and the goal is to perform Case 1 below after making sure that $A_{ij}$ comes under $(1,2)$.

Every principal ideal domain is a Bezout domain: PID is a Bezout domain, so the extended Euclidean theorem applies. This means, for all $a, b \in D$, there exists $m,n \in D$ that satisfies $$ m a + n b = \gcd \left( a, b \right) $$

  • Case 1. If $\alpha \nmid A_{12}$
    $\alpha$ is the minimal entry and since $\alpha \nmid A_{12}$, then $\gcd \left( \alpha, A_{12} \right) < \alpha$. According to the extended Euclidean theorem, $$ m \alpha + n A_{12} $$ there exists $m,n \in D$ that makes $\alpha$ smaller. By Gaussian elimination, add the column of $1$ multiplied by $m$ to the column of $2$ multiplied by $n$, and substitute it into the column of $2$. This new column’s first element, $A_{12}' = \gcd \left( \alpha , A_{12} \right)$, is not only smaller than the original $\alpha$ but also a divisor of $\alpha$, making it a candidate for a newer and smaller minimal entry. Swap the $1$ column and $2$ column and return to Step 1. (There might be a chance to find an even smaller minimal entry during the calculation.)
  • Case 2. If $\alpha \mid A_{12}$
    Some $k \in \mathbb{Z}$ exists that satisfies $k \alpha + A_{12} = 0$, so multiply $1$ column by $k$ and subtract it from the column of $2$. Then, this new column’s first element, $A_{12} ' $, becomes $0$. Repeat Step 3.
  • Case 3. If $A_{12} = 0$
    Repeat the column exchange until $A_{12} \ne 0$. If for all $j \ne 1$, every $A_{1j}$ becomes $0$, then conduct Step 3 for $A_{21}$.
    • If $\alpha \nmid A_{21}$, then $A_{21}' < \alpha$. Update the minimal entry.
    • If $\alpha \mid A_{21}$, make $A_{21} ' $ become $0$.
    • If $A_{21} = 0$, it means the matrix now looks like this: $$ A \sim \begin{bmatrix} \alpha & 0 & \cdots & \\ 0 & A_{22} & \cdots & \\ \vdots & \vdots & \ddots & \end{bmatrix} $$ Since $A_{21} = 0$, adding the row of $2$ to the row of $1$ keeps $\alpha$ and results in the form $$ A \sim \begin{bmatrix} \alpha & A_{22} & \cdots & \\ 0 & A_{22} & \cdots & \\ \vdots & \vdots & \ddots & \end{bmatrix} $$ Swap the $i$ row and the $2$ row and repeat Step 3. Even if all elements in the $1$ row and column except for $\alpha$ are $0$, there still must be at least one pair of $i,j \ge 2$ that satisfies $\alpha \nmid A_{ij}$, so Case 3 is repeated until eventually, $\alpha \nmid A_{12}$ must be true.

Case 1 allows the value of $A_{11}$ to be continually reduced, Case 2 allows for making all elements other than $\alpha$ in the first row and column become $0$, and Case 3 ensures that there will eventually be a $A_{ij}$ that satisfies $\alpha \nmid A_{ij}$ and is placed at $(1,2)$.


Step 4. $d_{1} \mid \cdots \mid d_{r}$

The only way to proceed to Step 4 is if the remaining $A$ from Step 1 is a zero matrix.

Throughout Step 2, we’ve been noting $d_{k}$ like $d_{k} \gets \alpha$. That all $A_{ij}$ could be divided by the minimal entry $\alpha$ indicates that any element created in the remaining matrix by the extended Euclidean theorem can be divided by this $\alpha$. Therefore, $d_{1} \mid \cdots \mid d_{r}$, and we obtain the following Smith normal form. $$ \begin{bmatrix} d_{1} & 0 & 0 & 0 & \cdots & 0 \\ 0 & \ddots & 0 & 0 & \cdots & 0 \\ 0 & 0 & d_{r} & 0 & \cdots & 0 \\ 0 & 0 & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & 0 \end{bmatrix} \in R^{m \times n} $$


  1. Munkres. (1984). Elements of Algebraic Topology: p55~57. ↩︎

  2. https://en.wikipedia.org/wiki/Smith_normal_form#Algorithm ↩︎