<< Chapter < Page | Chapter >> Page > |
The mixed-radix approach utilizes clever subsampling of $x\left[n\right]$ and permutations of twiddle factor matrices to lower operation counts. The first step is to compute the prime factorization of the signal length $N$ . Prime factors of 2 and 3 are then combined to create as many factors of 4 and 6 as possible. The resulting prime factorization has the form $N={\prod}_{i=1}^{F}{f}_{i}$ . We can then perform a single step ${T}_{i}$ of the algorithm by computing $N/{f}_{i}$ FFTs of length ${f}_{i}$ .
Four unique matrix constructions are necessary to generalize a single step of the algorithm. The first is the identity matrix ${I}_{r}\in {\mathbb{R}}^{r\times r}$ . The second is the permutation matrix ${P}_{b}^{a}\in {\mathbb{R}}^{ab\times ab}$ , where
Note that ${P}_{b}^{a}$ swaps positions $ra+s$ and $sb+r$ in a vector. The third matrix to consider is a diagonal matrix of twiddle factors ${D}_{b}^{a}\in {\mathbb{R}}^{ab\times ab}$ , where
with ${\omega}_{ab}={e}^{-j2\pi /\left(ab\right)}$ . The fourth and final matrix construction to consider is the standard DFT matrix ${W}_{n}\in {\mathbb{R}}^{n\times n}$ .
The mixed-radix algorithm requires the interaction of operations on the subspaces ${\mathbb{R}}^{i},\phantom{\rule{0.166667em}{0ex}}\phantom{\rule{0.166667em}{0ex}}i=1,...,F$ , and so it is necessary to consider the Kronecker Product in calculations. The Kronecker Product is an operation defined as $\otimes :{\mathbb{R}}^{m\times n}\times {\mathbb{R}}^{p\times q}\to {\mathbb{R}}^{mp\times nq}$ . If $C=A\otimes B$ , then
Using these matrices, it possible to compose a single steps ${T}_{i}$ using the following equation [link] :
where
Note that ${T}_{i}\in {\mathbb{R}}^{N\times N}$ .
To implement the algorithm, each stage is applied iteratively to acquire $X$ . ${T}_{i}$ may be simplified into two separate steps using the definitions of each matrix to yield the following algorithm structure:
The DFT module in the above code utilizes the Winograd Short-Length Transforms to minimize operations when computing DFTs of length less than 6. For all other lengths, Rader's FFT is used.
Rader's FFT for prime lengths exploits results from number theory to express the DFT as a composite-length cyclic convolution [link] . Any prime number $p$ defines a multiplicative group modulo $p$ , denoted ${\mathbb{Z}}_{n}^{\times}=\left\{0,,,1,,,2,,,.,.,.,,,p,-,1\right\}$ . This group is cyclic, i.e., $\forall a\in {\mathbb{Z}}_{p}^{\times}\exists g\in {\mathbb{N}}_{p-2}^{0}:a={g}^{i}={g}^{-j},\phantom{\rule{0.166667em}{0ex}}\phantom{\rule{0.166667em}{0ex}}0\le i,j\le p-2$ , where $g$ is known as the primitive root modulo $\mathit{p}$ and ${\mathbb{N}}_{p-2}^{0}=\left\{0,,,1,,,2,,,.,.,.,,,p,-,2\right\}$ . In practice, there is no general formula for finding $g$ for $p$ , but in this implementation $N$ is known and therefore $g$ may be stored in a lookup table.
The general form of the DFT of length $p$ is given by
Since the twiddle factor, ${\omega}_{p}^{nk}$ , is naturally computed modulo $p$ and both $n$ and $k$ range from $1,2,...,p-1$ , we can rewrite the above formula using $g$ :
This formulation is of the form of a cyclic convolution of two sequences $\alpha $ and $\beta $ where ${\alpha}_{q}={h}_{{g}^{q}}$ and ${\beta}_{q}={\omega}_{p}^{{g}^{-q}}$ of length $p-1$ . This convolution is computed via the convolution theorem:
For speed, the sequence $\alpha $ is zero-padded between its zeroth and first index to a length $M$ which is a power of 2 such that $M\ge 2p-3$ and $\beta $ is cyclically repeated to be length $M$ . Since $N$ is known, it is possible to store the DFT of the sequence $\beta $ in a lookup table. The DFT of $\alpha $ is then computed using the standard radix-2 Cooley-Tukey algorithm. The two DFTs are multiplied pointwise and then the inverse DFT is again calculated using the standard Cooley-Tukey algorithm. The first $p$ elements are then the result of the cyclic convolution. The final result is attained after adding back the DC offset.
For small numerical examples demonstrating this implementation of Rader's FFT, see this excellent guide here .
Notification Switch
Would you like to follow the 'Using ffast to decrease computation time in digital multitone communication' conversation and receive update notifications?