

Uniform Uncertainty Principle: Restricted Isometry Condition

Definition

For a vector $\mathbf{v} \in V$ in a vector space $V$, define the $l_{0}$-norm $\left\| \cdot \right\|_{0} : V \to \mathbb{Z}$ as the cardinality of the support, that is, the number of nonzero components:
$$
\begin{align*} \left\| \mathbf{v} \right\|_{0} :=& \left| \operatorname{supp} \mathbf{v} \right| \\ =& \operatorname{card} \left\{ k : \left( \mathbf{v} \right)_{k} \ne 0 \right\} \end{align*}
$$
$\left\| \cdot \right\|_{1}$ denotes the $l_{1}$-norm and $\left\| \cdot \right\|_{2}$ the $l_{2}$-norm.
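As a quick check of these definitions, all three norms can be computed directly in NumPy; the vector below is an arbitrary example (a minimal sketch):

```python
import numpy as np

v = np.array([0.0, 3.0, 0.0, -1.5, 0.0])

l0 = np.count_nonzero(v)    # |supp v| = card{k : v_k != 0}
l1 = np.sum(np.abs(v))      # l1-norm: 3 + 1.5 = 4.5
l2 = np.sqrt(np.sum(v**2))  # l2-norm: sqrt(9 + 2.25)

print(l0, l1, l2)           # 2 4.5 3.3541...
```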

Uncertainty Principle in the DFT [1]

Denote by $\hat{x}$ the Discrete Fourier Transform of a finite vector $x \in \mathbb{C}^{N}$ of length $N \in \mathbb{N}$. The following statement is called the Uncertainty Principle for the Discrete Fourier Transform:
$$
\left\| x \right\|_{0} \left\| \hat{x} \right\|_{0} \ge N
$$
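The inequality is easy to test numerically with `np.fft.fft`; in the sketch below a tolerance stands in for exact zero. A single spike gives $1 \cdot N$, while a Dirac comb attains the bound with equality:

```python
import numpy as np

N = 16

# Extreme case: a single spike has ||x||_0 = 1, and its DFT is flat,
# so ||x_hat||_0 = N.
spike = np.zeros(N)
spike[3] = 1.0

# Tight case: a comb on every 4th index transforms to another comb,
# so ||x||_0 * ||x_hat||_0 = 4 * 4 = N exactly.
comb = np.zeros(N)
comb[::4] = 1.0

for x in (spike, comb):
    x_hat = np.fft.fft(x)
    supp = np.count_nonzero(x)
    supp_hat = np.count_nonzero(np.abs(x_hat) > 1e-10)  # numerical support
    print(supp, supp_hat, supp * supp_hat >= N)         # ... True
```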

Restricted Isometry Property [2]

For an index set $T \subset \left\{ 1 , \cdots , p \right\}$, denote by $\mathbf{s}_{T} \in \mathbb{R}^{\left| T \right|}$ the components of a vector $\mathbf{s} \in \mathbb{R}^{p}$ indexed by $T$, and by $\Theta_{T} \in \mathbb{R}^{n \times \left| T \right|}$ the corresponding columns of a matrix $\Theta \in \mathbb{R}^{n \times p}$. Given a natural number $S \in \mathbb{N}$, if there exists an $S$-restricted isometry constant $\delta_{S}$ such that
$$
\left( 1 - \delta_{S} \right) \left\| \mathbf{s}_{T} \right\|_{2}^{2} \le \left\| \Theta_{T} \mathbf{s}_{T} \right\|_{2}^{2} \le \left( 1 + \delta_{S} \right) \left\| \mathbf{s}_{T} \right\|_{2}^{2}
$$
holds for all $T$ with $\left| T \right| \le S$ and all $\mathbf{s} \in \mathbb{R}^{p}$, then $\Theta$ is said to satisfy the Restricted Isometry Hypothesis, also called the Uniform Uncertainty Principle.
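For small $p$, the constant $\delta_{S}$ can be estimated by brute force: it is the smallest number such that every squared singular value of every submatrix $\Theta_{T}$ lies in $[1 - \delta_{S}, 1 + \delta_{S}]$. A minimal NumPy sketch, with arbitrary dimensions and the usual $1/\sqrt{n}$ scaling for a Gaussian matrix:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
n, p, S = 20, 10, 3

# Gaussian entries scaled by 1/sqrt(n): such matrices satisfy the RIP
# with small delta_S with high probability.
Theta = rng.normal(size=(n, p)) / np.sqrt(n)

delta_S = 0.0
# Supports of size exactly S suffice: removing columns only narrows
# the spread of singular values.
for T in combinations(range(p), S):
    sv = np.linalg.svd(Theta[:, list(T)], compute_uv=False)
    delta_S = max(delta_S, sv[0] ** 2 - 1, 1 - sv[-1] ** 2)

print(delta_S)  # smallest constant making the RIP inequality hold
```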

Explanation

Compressed Sensing

Compressed sensing, well known as an optimization problem, is often stated as follows:
$$
\begin{matrix} \text{Minimize} & \left\| \mathbf{s} \right\|_{0} \\ \text{subject to} & \mathbf{y} = \Theta \mathbf{s} \end{matrix}
$$
The Heisenberg inequality of Fourier analysis, known in quantum mechanics as Heisenberg’s Uncertainty Principle, was invoked by Donoho, a pioneer of compressed sensing, in a 1989 paper on Signal Recovery: its 5th theorem gives a condition for the uniqueness of the solution, and its 9th theorem shows that, when $\mathbf{y} = \Theta \mathbf{s}$,
$$
\argmin \left\| \mathbf{s} \right\|_{0} = \argmin \left\| \mathbf{s} \right\|_{1}
$$
so that the problem relaxes to
$$
\begin{matrix} \text{Minimize} & \left\| \mathbf{s} \right\|_{1} \\ \text{subject to} & \mathbf{y} = \Theta \mathbf{s} \end{matrix}
$$
Candès and Tao went further: under the condition that $\Theta$ satisfies the $S$-restricted Isometry Hypothesis and that the true solution $\mathbf{s}^{\ast}$ is sufficiently sparse, they showed that the optimal solution $\hat{\mathbf{s}}$ of
$$
\begin{matrix} \text{Minimize} & \left\| \mathbf{s} \right\|_{1} \\ \text{subject to} & \left\| \mathbf{y} - \Theta \mathbf{s} \right\|_{2} \le \varepsilon \end{matrix}
$$
satisfies, for some constant $C > 0$,
$$
\left\| \hat{\mathbf{s}} - \mathbf{s}^{\ast} \right\|_{2} \le C \varepsilon
$$
In essence, this identifies a condition under which the NP-hard problem of compressed sensing becomes tractable, and on the strength of these results the condition is now used widely across many fields.
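To see the relaxation at work, the sketch below (assuming NumPy and scikit-learn; the dimensions and the penalty are arbitrary choices) draws a Gaussian $\Theta$, which satisfies the restricted isometry condition with high probability, and recovers a sparse $\mathbf{s}^{\ast}$ from noisy underdetermined measurements with an $l_{1}$-penalized fit:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n, p, S = 60, 200, 5  # n << p: far fewer measurements than unknowns

Theta = rng.normal(size=(n, p)) / np.sqrt(n)
s_true = np.zeros(p)
s_true[rng.choice(p, S, replace=False)] = 3 * rng.normal(size=S)
y = Theta @ s_true + 0.01 * rng.normal(size=n)  # noise level plays epsilon

# l1-penalized least squares stands in for the constrained l1 problem.
model = Lasso(alpha=0.01, fit_intercept=False, max_iter=50_000).fit(Theta, y)
s_hat = model.coef_

print(np.linalg.norm(s_hat - s_true))           # small: stable recovery
print(np.count_nonzero(np.abs(s_hat) > 1e-3))   # close to S
```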

The problem is that few people excel at both Fourier analysis and data science at once, which makes the relationship between the uncertainty principle and the restricted isometry property hard to grasp. Donoho’s paper leans heavily on the term uncertainty principle throughout, which makes it a demanding read, but Candès and Tao explain the idea in terms of matrices and the restricted isometry condition in several papers in a mathematically accessible way, and papers citing them give similar explanations. [3] [4]

Restricted Isometry Property and Column-wise Normalization

Stepping back from the practical side for a moment, let us build some intuition for why $\Theta$ is normalized column-wise in sparse regression when a matrix equation such as $\mathbf{y} = \Theta \mathbf{s}$ is given.

Looking back at the inequality in the restricted isometry condition,
$$
\left( 1 - \delta_{S} \right) \left\| \mathbf{s}_{T} \right\|_{2}^{2} \le \left\| \Theta_{T} \mathbf{s}_{T} \right\|_{2}^{2} \le \left( 1 + \delta_{S} \right) \left\| \mathbf{s}_{T} \right\|_{2}^{2}
$$
it resembles the inequality that appears in the equivalence of norms from linear algebra.

Equivalence of Norms: Two norms $\left\| \cdot \right\|_{\alpha}, \left\| \cdot \right\|_{\beta}$ defined on a vector space $V$ are equivalent if there exist constants $c, C > 0$ such that, for every vector $\mathbf{v} \in V$,
$$
c \left\| \mathbf{v} \right\|_{\alpha} \le \left\| \mathbf{v} \right\|_{\beta} \le C \left\| \mathbf{v} \right\|_{\alpha}
$$

Now, the results of Donoho, Candès, and Tao essentially say that a solution of $\argmin \left\| \cdot \right\|_{1}$ is also a solution of $\argmin \left\| \cdot \right\|_{0}$, and in that sense the appearance of an inequality like the restricted isometry condition feels, at an intuitive level, closely related. One could of course check whether $\left\| \cdot \right\|_{?} = \left\| \Theta_{T} \cdot \right\|_{2}$ is really a norm, but for now let us move on to tight frames in Hilbert spaces.

Frames in a Hilbert Space: A sequence $\left\{ \mathbf{v}_{k} \right\}_{k \in \mathbb{N}}$ in a Hilbert space $H$ is called a frame if there exist $A, B > 0$ such that
$$
A \left\| \mathbf{v} \right\|^{2} \le \sum_{k \in \mathbb{N}} \left| \left\langle \mathbf{v} , \mathbf{v}_{k} \right\rangle \right|^{2} \le B \left\| \mathbf{v} \right\|^{2} \qquad , \forall \mathbf{v} \in H
$$
and the frame is said to be tight when $A = B$. In particular, $\left\{ \mathbf{v}_{k} \right\}_{k \in \mathbb{N}}$ being an orthonormal basis of $H$ is equivalent to it being a tight frame with $A = B = 1$.
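For a finite frame, the optimal bounds $A$ and $B$ are simply the extreme eigenvalues of the frame operator $\sum_{k} \mathbf{v}_{k} \mathbf{v}_{k}^{T}$, since $\sum_{k} \left| \left\langle \mathbf{v}, \mathbf{v}_{k} \right\rangle \right|^{2} = \mathbf{v}^{T} \left( \sum_{k} \mathbf{v}_{k} \mathbf{v}_{k}^{T} \right) \mathbf{v}$. A small sketch with arbitrary sizes:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 4, 9
V = rng.normal(size=(m, n))   # rows are the frame vectors v_k in R^n

F = V.T @ V                   # frame operator: sum_k v_k v_k^T
eigs = np.linalg.eigvalsh(F)
A, B = eigs[0], eigs[-1]      # A ||v||^2 <= sum |<v, v_k>|^2 <= B ||v||^2
print(A, B)

# An orthonormal basis is a tight frame with A = B = 1:
Q, _ = np.linalg.qr(rng.normal(size=(n, n)))  # random orthogonal matrix
print(np.linalg.eigvalsh(Q.T @ Q))            # all ones
```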

Equivalence Conditions for an Orthonormal Basis: Let $H$ be a Hilbert space. For an orthonormal system $\left\{ \mathbf{e}_{k} \right\}_{k \in \mathbb{N}} \subset H$, the following are equivalent:

  • (i): $\left\{ \mathbf{e}_{k} \right\}_{k \in \mathbb{N}} \subset H$ is an orthonormal basis of $H$.
  • (iv): For all $\mathbf{x} \in H$, $\sum_{k \in \mathbb{N}} \left| \langle \mathbf{x}, \mathbf{e}_{k} \rangle \right|^{2} = \left\| \mathbf{x} \right\|^{2}$

A frame having $A = B = 1$ means $\left\| \Theta_{T} \mathbf{s}_{T} \right\|_{2} \approx \left\| \mathbf{s}_{T} \right\|_{2}$, or, a little loosely, that for some small $\delta > 0$
$$
1 - \delta = A \approx 1 \approx B = 1 + \delta
$$
so the restricted isometry condition says that the columns of $\Theta_{T}$ form a ‘not perfect but nearly’ tight frame, and hence, by the equivalence above, are ‘not perfect but nearly’ orthonormal. This is why, in practice, the columns of the matrix are sometimes divided by constants so that each column’s sum of squares equals $1$, with Candès and Tao’s papers cited in such instances. [5]
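Concretely, the normalization amounts to dividing each column by its $l_{2}$-norm, as in this sketch (the matrix is an arbitrary stand-in for a badly scaled design):

```python
import numpy as np

rng = np.random.default_rng(3)
# Columns with wildly different scales, as raw data often has.
Theta = rng.normal(size=(50, 8)) * rng.uniform(0.1, 10.0, size=8)

norms = np.linalg.norm(Theta, axis=0)  # l2-norm of each column
Theta_normalized = Theta / norms       # broadcasts across columns

print(np.linalg.norm(Theta_normalized, axis=0))  # all ones
# If s solves y = Theta_normalized s, rescale back with s / norms
# to recover the solution of the original y = Theta s.
```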

Relationship with Lasso

$$
\begin{matrix} \text{Minimize} & \left\| \mathbf{s} \right\|_{1} \\ \text{subject to} & \left\| \mathbf{y} - \Theta \mathbf{s} \right\|_{2} \le \varepsilon \end{matrix}
$$

Compressed sensing as stated above can, by the Lagrange multiplier method, be reformulated as the search for the following optimal solution:
$$
\argmin \left\| \mathbf{s} \right\|_{1} + \lambda \left\| \mathbf{y} - \Theta \mathbf{s} \right\|_{2}
$$

Meanwhile, Lasso regression is described as the following optimization problem:

Lasso Regression: The following optimization problem is called BPDN, and solving BPDN is referred to as Lasso or Lasso Regression.
$$
\argmin_{\beta} \left( {{ 1 } \over { 2 }} \left\| Y - X \beta \right\|_{2}^{2} + \lambda \left\| \beta \right\|_{1} \right)
$$
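A short experiment makes the correspondence visible: for each penalty $\lambda$ in the Lasso form, the residual norm of its solution is exactly the $\varepsilon$ at which the same solution solves the constrained form. A sketch assuming scikit-learn, with arbitrary problem sizes:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(4)
n, p = 40, 100
Theta = rng.normal(size=(n, p)) / np.sqrt(n)
s_true = np.zeros(p)
s_true[:4] = [2.0, -1.0, 3.0, 1.5]
y = Theta @ s_true + 0.02 * rng.normal(size=n)

for lam in (0.001, 0.01, 0.1):
    s_hat = Lasso(alpha=lam, fit_intercept=False,
                  max_iter=100_000).fit(Theta, y).coef_
    eps = np.linalg.norm(y - Theta @ s_hat)  # the matching constraint level
    print(f"lambda={lam}: ||s||_1 = {np.abs(s_hat).sum():.3f}, eps = {eps:.4f}")

# Larger lambda -> smaller ||s||_1 and larger eps: the two formulations sweep
# the same trade-off curve, just parametrized differently.
```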

This means the two problems are mathematically the same, and the difference comes down to whether the term is used by statisticians or by signal analysts. [6] In practice it suffices to know that they are essentially identical, but as discussed, it is important to remember that there really are conditions for them to be equivalent. To summarize once again:

  • Under the restricted isometry condition, compressed sensing can be performed through Lasso Regression.
    • However, Lasso regression is not the only method for performing compressed sensing.
  • Strictly speaking, they are different.
    • However, there’s rarely a need to distinguish them strictly.

  1. Y. Lyubarskii and R. Vershynin, “Uncertainty Principles and Vector Quantization,” IEEE Transactions on Information Theory, vol. 56, no. 7, pp. 3491-3501, July 2010. https://doi.org/10.1109/TIT.2010.2048458

  2. Candès, E. J., Romberg, J. K. and Tao, T. (2006). Stable signal recovery from incomplete and inaccurate measurements. Comm. Pure Appl. Math., 59: 1207-1223. https://doi.org/10.1002/cpa.20124

  3. E. J. Candes and T. Tao, “Near-Optimal Signal Recovery From Random Projections: Universal Encoding Strategies?,” IEEE Transactions on Information Theory, vol. 52, no. 12, pp. 5406-5425, Dec. 2006. https://doi.org/10.1109/TIT.2006.885507

  4. Needell, D. and Vershynin, R. (2009). Uniform Uncertainty Principle and Signal Recovery via Regularized Orthogonal Matching Pursuit. Found Comput Math 9, 317-334. https://doi.org/10.1007/s10208-008-9031-3

  5. Brunton, S. L., Proctor, J. L. and Kutz, J. N. (2016). Discovering governing equations from data by sparse identification of nonlinear dynamical systems. https://doi.org/10.1073/pnas.1517384113

  6. https://stats.stackexchange.com/a/328731/172321