logo

Strassen Algorithm Proof 📂Matrix Algebra

Strassen Algorithm Proof

Algorithm

Let’s say for $k \in \mathbb{N}$ there is $n=2^{k}$. For $A, B \in \mathbb{R}^{n \times n}$, using the Jordan block matrix representation, let’s consider the following 8 matrices ${{n} \over {2}} \times {{n} \over {2}}$, $A_{i}$, and $B_{i}$. $$ AB= \begin{bmatrix} A_{1} & A_{2} \\ A_{3} & A_{4} \end{bmatrix} \begin{bmatrix} B_{1} & B_{2} \\ B_{3} & B_{4} \end{bmatrix} = \begin{bmatrix} C_{1} & C_{2} \\ C_{3} & C_{4} \end{bmatrix} = C $$ To calculate $C = AB$, compute the following. $$ P_{1} = A_{1} ( B_{2} - B_{4} ) \\ P_{2} = ( A_{1} + A_{2} ) B_{4} \\ P_{3} = ( A_{3} + A_{4} ) B_{1} \\ P_{4} = A_{4} ( B_{3} - B_{1} ) \\ P_{5} = (A_{1} + A_{4}) ( B_{1} + B_{4} ) \\ P_{6} = (A_{2} - A_{4}) ( B_{3} + B_{4} ) \\ P_{7} = (A_{3} - A_{1}) ( B_{1} + B_{2} ) $$

$$ C_{1} = -P_{2} + P_{4}+ P_{5} + P_{6} \\ C_{2} = P_{1} + P_{2} \\ C_{3} = P_{3} + P_{4} \\ C_{4} = P_{1} - P_{3} + P_{5} + P_{7} $$ When calculating $A_{i} B_{j}$, repeat the same method, and the time complexity is $\Theta ( n^{2.807} )$.


  • Of course, it is not that you cannot use $n \ne 2^{k}$, and you can apply it as is by adjusting the size.

Explanation

The multiplications of matrices that are straightforward and immediately come to mind are $$ C_{1} = A_{1}B_{1} + A_{2} B_{3} \\ C_{2} = A_{1}B_{2} + A_{2} B_{4} \\ C_{3} = A_{3}B_{1} + A_{4} B_{3} \\ C_{4} = A_{3}B_{2} + A_{4} B_{4} $$ Just keep repeating this calculation, and there was no doubt that the time it took was known as $\Theta ( n^3 )$ until Strassen was born.

This algorithm is very revolutionary because originally, it is definitely necessary to perform two matrix multiplications to calculate one $C_{i}$. Combining them requires calculation 8 times each time, but Strassen reduced this to 7 times. The consideration from the very beginning, whether there could be another method, is an extraordinary sense, and what was truly found can even be said to be a wonder. With the development of this algorithm, a significant amount of resources, if not a little, that would have been used for matrix calculations can be saved.

Proof

Part 1. Computability

There’s no choice but to calculate directly.

Part 1-1. $C_{1} = A_{1}B_{1} + A_{2} B_{3}$

$$ \begin{align*} C_{1} =& -P_{2} + P_{4} + P_{5} + P_{6} \\ =& ( {\color{red}- A_{1} B_{4}} - A_{2} B_{4} ) + P_{4} + ( A_{1}B_{1} + {\color{red} A_{1}B_{4} } + A_{4} B_{1} + A_{4}B_{4} ) + P_{6} \\ =& - A_{2} B_{4} + (A_{4} B_{3} {\color{red}- A_{4} B_{1} }) + A_{1}B_{1} + {\color{red} A_{4} B_{1} } + A_{4}B_{4} + P_{6} \\ =& {\color{red} - A_{2} B_{4} } + \color{blue}{A_{4} B_{3} } + A_{1}B_{1} + \color{green}{A_{4}B_{4}} + ( A_{2} B_{3} + {\color{red} A_{2} A_{4} } \color{blue}{ - A_{4} B_{3} }- \color{green}{A_{4} B_{4}}) \\ =& A_{1}B_{1} + A_{2} B_{3} \end{align*} $$

Part 1-2. $C_{2} = A_{1}B_{2} + A_{2} B_{4}$

$$ \begin{align*} C_{2} =& P_{1} + P_{2} \\ =& (A_{1}B_{2} {\color{red}- A_{1} B_{4}}) + ( {\color{red}A_{1} B_{4}} + A_{2} B_{4}) \\ =& A_{1}B_{2} + A_{2} B_{4} \end{align*} $$

Part 1-3. $C_{3} = A_{3}B_{1} + A_{4} B_{3}$

$$ \begin{align*} C_{3} =& P_{3} + P_{4} \\ =& (A_{3}B_{1} + {\color{red} A_{4}B_{1}}) + (A_{4}B_{3} {\color{red}- A_{4}B_{1}}) \\ =& A_{3}B_{1} + A_{4} B_{3} \end{align*} $$

Part 1-4. $C_{4} = A_{3}B_{2} + A_{4} B_{4}$

$$ \begin{align*} C_{4} =& P_{1} - P_{3} + P_{5} + P_{7} \\ =& (A_{1}B_{2} {\color{red} - A_{1} B_{4} } ) - P_{3} + ( A_{1}B_{1} + {\color{red} A_{1}B_{4} } + A_{4}B_{1} + A_{4}B_{4}) + P_{7} \\ =& A_{1}B_{2} + (- A_{3}B_{1} - {\color{red}A_{4}B_{1}}) + A_{1}B_{1} + {\color{red}A_{4}B_{1}} + A_{4}B_{4} + P_{7} \\ =& {\color{red} A_{1}B_{2}} - \color{blue}{A_{3}B_{1}} + \color{green}{A_{1}B_{1}}+ A_{4}B_{4} + ( \color{blue}{A_{3}B_{1}} + A_{3}B_{2} - \color{green}{A_{1}B_{1}} {\color{red}- A_{1}B_{2}}) \\ =& A_{3}B_{2} + A_{4} B_{4} \end{align*} $$


Part 2. Time Complexity

Essentially, it is the same as the method that proved that the original calculation method was known as $\Theta ( n^3 )$. If the time taken for one calculation is $T(n)$ and the time taken outside of repetitions is $c$, then because it is $\displaystyle T(n) = 7 T \left( {{n} \over {2}} \right) + c$,

$$ \begin{align*} T(n) =& 7 T \left( {{n} \over {2}} \right) + c \\ =& 7 \left[ 7 T \left( {{n} \over {4}} \right) + c \right] + c \\ =& 7 \left[ 49 T \left( {{n} \over {8}} \right) + 7 c + c \right] + c \\ =& 7^3 T \left( {{n} \over {8}} \right) + (1+7+7^2)c \\ =& 7^3 T \left( {{n} \over {8}} \right) + {{7^3 - 1} \over {7 - 1}}c \\ & \vdots & \\ \approx& 7^{\log_{2} n} ( T(1) + c ) \\ =& n^{\log_{2} 7} ( T(1) + c ) \\ =& n^{2.807} ( T(1) + c ) \\ =& \Theta \left( n{^2.807} \right) \end{align*} $$