What is the SINDy Algorithm?
Algorithm 1
Let’s consider a dynamical system whose state space is $\mathbb{R}^{n}$ and is given by the smooth function $f : \mathbb{R}^{n} \to \mathbb{R}^{n}$ as follows. $$ \dot{\mathbf{x}} = f \left( \mathbf{x} \right) $$ Briefly representing as $\mathbf{x} = \left( x_{1} , \cdots , x_{n} \right)$ and $\dot{\mathbf{x}} = \left( \dot{x_{1}} , \cdots , \dot{x_{n}} \right)$, for $m \in \mathbb{N}$ time points $\left\{ t_{k} \right\}_{k=1}^{m}$, a design matrix is constructed as follows. $$ \begin{align*} X =& \begin{bmatrix} \mathbf{x}^{T} \left( t_{1} \right) \\ \mathbf{x}^{T} \left( t_{2} \right) \\ \vdots \\ \mathbf{x}^{T} \left( t_{m} \right) \end{bmatrix} = \begin{bmatrix} x_{1} \left( t_{1} \right) & x_{2} \left( t_{1} \right) & \cdots & x_{n} \left( t_{1} \right) \\ x_{1} \left( t_{2} \right) & x_{2} \left( t_{2} \right) & \cdots & x_{n} \left( t_{2} \right) \\ \vdots & \vdots & \ddots & \vdots \\ x_{1} \left( t_{m} \right) & x_{2} \left( t_{m} \right) & \cdots & x_{n} \left( t_{m} \right) \end{bmatrix} \\ \dot{X} =& \begin{bmatrix} \dot{\mathbf{x}}^{T} \left( t_{1} \right) \\ \dot{\mathbf{x}}^{T} \left( t_{2} \right) \\ \vdots \\ \dot{\mathbf{x}}^{T} \left( t_{m} \right) \end{bmatrix} = \begin{bmatrix} \dot{x}_{1} \left( t_{1} \right) & \dot{x}_{2} \left( t_{1} \right) & \cdots & \dot{x}_{n} \left( t_{1} \right) \\ \dot{x}_{1} \left( t_{2} \right) & \dot{x}_{2} \left( t_{2} \right) & \cdots & \dot{x}_{n} \left( t_{2} \right) \\ \vdots & \vdots & \ddots & \vdots \\ \dot{x}_{1} \left( t_{m} \right) & \dot{x}_{2} \left( t_{m} \right) & \cdots & \dot{x}_{n} \left( t_{m} \right) \end{bmatrix} \in \mathbb{R}^{m \times n} \end{align*} $$ A matrix $\Theta \left( X \right) \in \mathbb{R}^{m \times p}$, made of derived variables obtained by applying some nonlinear function to the independent variables $X$, is called a Library, and each column is also referred to as basis or candidate. $$ \dot{X} = \Theta \left( X \right) \Xi $$ Through sparse regression by performing STLSQ on the above matrix equation to find as sparse a $\Xi \in \mathbb{R}^{p \times n}$ as possible, the algorithm that identifies the governing equation of the given dynamical system is called SINDy (Sparse Identification of Nonlinear Dynamical systems).
Description
SINDy is one of the methods to find the sparsest possible differential equations from given data and to find a Data-driven Model. It was proposed in 2011 by Professor Ying-Chen Lai and others, similar to the compressed sensing method2, but uses an overdetermined system instead of an underdetermined system, and is an algorithm that improves numerical robustness for noisy data.
STLSQ
STLSQ: Consider a sparse regression problem to find as sparse a $B \in \mathbb{R}^{p \times n}$ as possible for a full rank matrix $X \in \mathbb{R}^{m \times p}$ and $Y \in \mathbb{R}^{m \times n}$ under the given matrix equation $$ Y = X B $$ STLSQ, used in the paper by Brunton who proposed the SINDy algorithm, repetitively applies least squares method and processes components whose absolute value drops below a given threshold $\lambda$ as $0$ and removes those columns (independent variables) accordingly.
Although it’s said that using STLSQ on data from a dynamical system is called the SINDy algorithm, when you think about the actual relationship, the method specifically designed for SINDy is STLSQ itself. $$ \begin{align*} \text{STLSQ} \cdots& \text{SINDy} \\ Y =& \dot{X} \\ X =& \Theta (X) \\ B =& \Xi \end{align*} $$
Library
$$ \Theta (X) = \begin{bmatrix} 1 & x & y & z & x^{2} & xy & xz & y^{2} & \cdots & z^{5} \end{bmatrix} $$ The library $\Theta (X)$ can just be filled with polynomials as shown. For fairly simple smooth systems, using sufficiently higher orders guarantees mathematically that $X$ can be approximated with polynomials of $\dot{X}$ according to the Stone-Weierstrass theorem, and the trigonometric functions $\sin (kx)$ and $\cos (kx)$ are also sometimes referred to as Fourier bases3.
The figure above explains very well how SINDy works in practice. $$ \begin{align*} {{dx} \over {dt}} =& - \sigma x + \sigma y \\ {{dy} \over {dt}} =& - xz + \rho x - y \\ {{dz} \over {dt}} =& xy - \beta z \end{align*} $$ If we assume a Lorenz attractor, data is sampled from its trajectories, then the original ‘differential equations themselves’ are restored using STLSQ as described above. Since the derivatives of the Lorenz attractor are composed of polynomials of second degree or lower such as $x$, $y$, $xz$, $xy$, the governing equations can be successfully identified.
Brunton. (2016). Discovering governing equations from data by sparse identification of nonlinear dynamical systems: https://doi.org/10.1073/pnas.1517384113 ↩︎
Wang, W. X., Yang, R., Lai, Y. C., Kovanis, V., & Grebogi, C. (2011). Predicting catastrophes in nonlinear dynamical systems by compressive sensing. Physical review letters, 106(15), 154101. https://doi.org/10.1103/PhysRevLett.106.154101 ↩︎