

Sparse Matrices

Definition

In natural language, sparse means thin or scarce. In this context, a value of $0$ is regarded as effectively non-existent, and sparsity indicates the degree to which something is made up of such $0$ values.

Sparse Matrix

A matrix whose elements are mostly $0$ is referred to as a Sparse Matrix.

$S$-Sparse 1

A vector $\mathbf{v} \in \mathbb{R}^{d}$ is said to be $S$-sparse when, however large $d$ may be, it has only $S \ll d$ non-$0$ elements. Formally, defining the support of $\mathbf{v}$ as $\operatorname{supp} \mathbf{v} = \left\{ k : \mathbf{v}_{k} \ne 0 \right\}$, the integer $S \in \mathbb{Z}$ satisfies the following inequality: $$ \operatorname{card} \left( \operatorname{supp} \mathbf{v} \right) \le S $$ where $\operatorname{card}$ denotes cardinality.
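For instance, $\mathbf{v} = (0, 3, 0, 0, -1) \in \mathbb{R}^{5}$ has $\operatorname{supp} \mathbf{v} = \left\{ 2, 5 \right\}$, so $\operatorname{card} \left( \operatorname{supp} \mathbf{v} \right) = 2$ and $\mathbf{v}$ is $2$-sparse.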

Explanation

Alternative Korean renderings such as ‘thin matrix’ and ‘scarce matrix’ exist, but neither is commonly used; in practice the English term is simply borrowed as-is, Sparse Matrix. As the term suggests, a sparse matrix is one in which the ‘meaningful’ information is sparse. As the word ‘mostly’ indicates, this is not a mathematically strict definition; it only becomes precise once a specific $S$ is given that pins down ‘sufficiently sparse.’

Optimization Theory

In [optimization theory](../../categories/optimization theory), sparse regression refers to the optimization problem of maximizing the sparsity of the solution to a regression problem, that is, minimizing $S$. A common formulation is shown below.
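One standard way to state this (given here for illustration; the notation $A$, $\mathbf{b}$ is ours, not from the original) is to seek the sparsest vector consistent with the data: $$ \min_{\mathbf{x} \in \mathbb{R}^{d}} \left\| \mathbf{x} \right\|_{0} \quad \text{subject to} \quad A \mathbf{x} = \mathbf{b} $$ where $\left\| \mathbf{x} \right\|_{0} = \operatorname{card} \left( \operatorname{supp} \mathbf{x} \right)$ counts the non-$0$ entries. Since this count is hard to optimize directly, the convex surrogate $\left\| \mathbf{x} \right\|_{1}$ is often minimized instead, as in LASSO.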

Graph Theory

In [graph theory](../../categories/graph theory), the adjacency matrix of a random network given as data is assumed to be a sparse matrix: each node is typically connected to only a small fraction of the other nodes, so most entries of the matrix are $0$.
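As a minimal sketch of how such an adjacency matrix might be stored sparsely, assuming SciPy’s sparse module and a made-up edge list (both are our choices, not part of the original text):

```python
from scipy.sparse import coo_matrix

# A hypothetical edge list for a small undirected network (illustrative data).
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]
n = 5  # number of nodes

# Symmetrize the edges, since the network is undirected.
rows = [u for u, v in edges] + [v for u, v in edges]
cols = [v for u, v in edges] + [u for u, v in edges]
data = [1] * len(rows)

# Only the 2|E| non-zero entries are stored, never the full n x n grid.
A = coo_matrix((data, (rows, cols)), shape=(n, n)).tocsr()
print(A.nnz, "stored entries out of", n * n)  # 10 stored entries out of 25
```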

Data Structures

In practical computation, sparse matrices are as much a data-structure concern as a mathematical one. For instance, consider $X, Y \in \mathbb{R}^{10^{6} \times 10^{6}}$ in which only the last element is $\sqrt{2}$ and every other element is $0$. Storing these two matrices densely would require $2 \cdot 10^{12}$ floating-point slots, but if it is certain that they are sparse, there is no need to store all those $0$ values; recording the position and value of the single non-$0$ entry suffices. This not only saves storage but also allows far more efficient operations. Although the two matrices are enormous, their product can simply be calculated as $$ X Y = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 2 \end{bmatrix} $$ It would be unwise to use Strassen’s algorithm for this multiplication. ‘Computing with sparse matrices on a computer’ means being able to read and write these matrices with less storage, and performing calculations very quickly by skipping the parts that are $0$2.
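As a concrete sketch of the example above, assuming SciPy’s sparse module (the library choice is ours; the article does not prescribe one):

```python
import numpy as np
from scipy.sparse import csr_matrix

n = 10**6  # matrix dimension from the example above

# X and Y each hold a single non-zero entry, sqrt(2), in the last slot.
# Only that one (row, column, value) triple is stored, not 10^12 zeros.
X = csr_matrix(([np.sqrt(2)], ([n - 1], [n - 1])), shape=(n, n))
Y = csr_matrix(([np.sqrt(2)], ([n - 1], [n - 1])), shape=(n, n))

# Sparse multiplication skips everything that is 0, so this returns
# immediately despite the 10^6 x 10^6 shape.
P = X @ Y
print(P[n - 1, n - 1])  # ~2.0 (sqrt(2) * sqrt(2), up to floating-point rounding)
print(P.nnz)            # 1: only a single non-zero entry is stored
```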


  1. Needell, D., Vershynin, R. Uniform Uncertainty Principle and Signal Recovery via Regularized Orthogonal Matching Pursuit. Found Comput Math 9, 317–334 (2009). https://doi.org/10.1007/s10208-008-9031-3 ↩︎

  2. https://ko.wikipedia.org/wiki/%ED%9D%AC%EC%86%8C%ED%96%89%EB%A0%AC ↩︎