The Fast Fourier Transform Algorithm
Overview1
The Discrete Fourier Transform (DFT), when computed naively following its mathematical definition, has a time complexity of $\mathcal{O}(N^{2})$. However, by using the algorithm described below, the time complexity can be reduced to $\mathcal{O}(N\log_{2}N)$. This efficient computation method of the Discrete Fourier Transform is known as the Fast Fourier Transform (FFT).
Buildup
Let’s define multiplying two numbers and then adding them to another number as one operation. To compute the value of $\sum\limits_{i=0}^{n-1}a_{n}b_{n}$, $n$ operations are needed.
$$ \begin{align*} \sum\limits_{n=0}^{0} a_{n}b_{b} &= a_{0}b_{0} = \overbrace{0 {\color{red} +} a_{0} {\color{red} \times} b_{0}}^{\color{red} 1 \text{ operation}} \\ \sum\limits_{n=0}^{1} a_{n}b_{b} &= a_{0}b_{0} + a_{1}b_{1} = \overbrace{\big( \overbrace{0 {\color{red} +} a_{0} {\color{red} \times} b_{0}}^{\color{red} 1 \text{ operation}} \big) {\color{#5882FA} + } a_{1} {\color{#5882FA} \times} b_{1}}^{\color{#5882FA}2 \text{ operations}} \\ \sum\limits_{n=0}^{2} a_{n}b_{b} &= a_{0}b_{0} + a_{1}b_{1} + a_{2}b_{2} = \overbrace{\bigg( \overbrace{\big( \overbrace{0 {\color{red} +} a_{0} {\color{red} \times} b_{0}}^{\color{red} 1 \text{ operation}} \big) {\color{#5882FA} + } a_{1} {\color{#5882FA} \times} b_{1}}^{\color{#5882FA}2 \text{ operations}} \bigg) {\color{#FE9A2E} + } a_{2} {\color{#FE9A2E} \times} b_{2}}^{\color{#FE9A2E}3 \text{ operations}} \\ \end{align*} $$
Now, recall the definition of the Discrete Fourier Transform.
The linear transformation $\mathcal{F}_{N} : \mathbb{C}^{N} \to \mathbb{C}^{N}$ is called the Discrete Fourier Transform.
$$ \mathcal{F}_{N}(\mathbf{a}) = \hat{\mathbf{a}} = \begin{bmatrix} \hat{a}_{0} \\ \hat{a}_{1} \\ \dots \\ \hat{a}_{N-1} \end{bmatrix} ,\quad \hat{a}_{m} = \sum_{n=0}^{N-1}e^{-i2\pi mn /N} a_{n}\quad (0\le m < N) \tag{1} $$
Where $\mathbf{a} = \begin{bmatrix} a_{0}& a_{1}& \dots& a_{N-1} \end{bmatrix}^{T}$.
To compute $\hat{a}_{m}$, $N$ operations are needed, and to compute $\hat{\mathbf{a}}$, this must be performed $N$ times. Therefore, the total number of operations needed for the Discrete Fourier Transform is $N^{2}$, which means it has a time complexity of $\mathcal{O}(N^{2})$. This implies a significant computational cost for Fourier Transforms from a computer calculation perspective.
Algorithm
Let’s assume the length of the data, $N$, is a composite number $N = N_{1}N_{2}$. Now, define the indices $m, n$ as follows.
$$ m = m^{\prime}N_{1} + m^{\prime \prime},\quad n = n^{\prime}N_{2} + n^{\prime \prime} $$
Then, $0 \le m^{\prime}, n^{\prime \prime} \le N_{2}-1$, and $0 \le m^{\prime \prime}, n^{\prime} \le N_{1}-1$. Let’s express the exponent part of $(1)$ as follows.
$$ \begin{align*} e^{-i2\pi mn /N} &= e^{-i2\pi (m^{\prime}N_{1} + m^{\prime \prime})(n^{\prime}N_{2} + n^{\prime \prime})/(N_{1}N_{2})} \\ &= e^{-i2\pi (m^{\prime}n^{\prime}N_{1}N_{2} + m^{\prime}n^{\prime \prime}N_{1} + m^{\prime \prime}n^{\prime}N_{2} + m^{\prime \prime}n^{\prime \prime})/(N_{1}N_{2})} \\ &= e^{-i2\pi m^{\prime}n^{\prime}} e^{-i2\pi \big( (m^{\prime}n^{\prime \prime}/N_{2}) + (m^{\prime \prime}n^{\prime}/N_{1}) + (m^{\prime \prime}n^{\prime \prime}/N) \big)} \\ &= e^{-i2\pi \big( (m^{\prime}n^{\prime \prime}/N_{2}) + (m^{\prime \prime}n^{\prime}/N_{1}) + (m^{\prime \prime}n^{\prime \prime}/N) \big)} \\ \end{align*} $$
Substituting this into $(1)$ gives,
$$ \begin{align*} \hat{a}_{m} &= \sum_{n=0}^{N-1}e^{-i2\pi mn /N} a_{n} \\ &= \sum_{n^{\prime \prime}=0}^{N_{2}-1} \sum_{n^{\prime}=0}^{N_{1}-1} e^{-i2\pi \big( (m^{\prime}n^{\prime \prime}/N_{2}) + (m^{\prime \prime}n^{\prime}/N_{1}) + (m^{\prime \prime}n^{\prime \prime}/N) \big)}a_{n^{\prime}N_{2}+n^{\prime \prime}} \\ &= \sum_{n^{\prime \prime}=0}^{N_{2}-1} \left[ \sum_{n^{\prime}=0}^{N_{1}-1} e^{-i2\pi m^{\prime \prime}n^{\prime}/N_{1}}a_{n^{\prime}N_{2}+n^{\prime \prime}} \right] e^{-i2\pi [ (m^{\prime}n^{\prime \prime}/N_{2}) + (m^{\prime \prime}n^{\prime \prime}/N) ] } \end{align*} $$
According to the equation above, calculating $\left[ \sum_{n^{\prime}=0}^{N_{1}-1} \right]$ inside each bracket requires $N_{1}$ operations, and computing $\sum_{n^{\prime \prime}=0}^{N_{2}-1}$ outside each bracket requires $N_{2}$ operations. Therefore, to calculate $\hat{a}_{m}$, a total of $(N_{1} + N_{2})$ operations are needed. To obtain $\hat{\mathbf{a}}$, this must be repeated $N$ times, resulting in a total cost of $N(N_{1} + N_{2})$, which is a reduction from $N^{2}$.
Upon closer inspection of the brackets, it is apparent that if $N_{1}$ is again a composite number, the same logic can be applied. Generally, when the length of the data is a composite number $N = N_{1} N_{2} \cdots N_{k}$, the time complexity is reduced as follows.
$$ \mathcal{O}(N^{2}) \searrow \mathcal{O}\big( N(N_{1} + N_{2} + \cdots + N_{k}) \big) $$
Now, let’s assume $N$ is a power of $2$, $N = 2^{k}$. Then $\log_{2}N = k$, and from $N^{2} = 2^{k}$, it is reduced to $2^{k}(2k)$, hence the time complexity is reduced as follows.
$$ \mathcal{O}(N^{2}) \searrow \mathcal{O}(2N \log_{2}N) $$
Aside
This method was proposed in 1965 by Cooley and Tukey2, hence it is also called the Cooley-Tukey algorithm. However, they were not the first to invent it. Gauss also researched a similar algorithm but did not publish it properly, so this fact became known later3.
Gerald B. Folland, Fourier Analysis and Its Applications (1992), p252-253 ↩︎
J. W. Cooley and J. W. Tukey, An algorithm for the machine calculation of complex Fourier series, Mathematics of Computation 19 (1965), 297-301. ↩︎
M. T. Heideman, D. H. Johnson, and C. S. Burms, Gauss and the history of the fast Fourier transform, Archive for the History of the Exact Sciences 34 (1985), 264-277. ↩︎