<< Chapter < Page Chapter >> Page >
Two diagrams. Both are a chart showing a large circle labeled 8, then divided into two circles labeled 4, which are each divided into two circles labeled 2. The diagram on the left, labeled breadth-first, highlights the two circles labeled 4 in white. The diagram on the right, labeled depth-first, highlights one circle labeled 4 and one labeled 2.
Schematic of traditional breadth-first (left) vs. recursive depth-first (right) ordering for radix-2 FFT of size 8: thecomputations for each nested box are completed before doing anything else in the surrounding box. Breadth-first computation performs all butterfliesof a given size at once, while depth-first computation completes one subtransform entirely before moving on to the next (as in thealgorithm below).

One ordering distinction is between recursion and iteration. As expressed above, the Cooley-Tukey algorithm could be thought of asdefining a tree of smaller and smaller DFTs, as depicted in [link] ; for example, a textbook radix-2 algorithm would divide size n into two transforms of size n / 2 , which are divided into four transforms of size n / 4 , and so on until a base case is reached (in principle, size 1). This might naturally suggest a recursive implementation inwhich the tree is traversed “depth-first” as in [link] (right) and the algorithm of [link] —one size n / 2 transform is solved completely before processing the other one, and so on. However, most traditionalFFT implementations are non-recursive (with rare exceptions [link] ) and traverse the tree “breadth-first” [link] as in [link] (left)—in the radix-2 example, they would perform n (trivial) size-1 transforms, then n / 2 combinations into size-2 transforms, then n / 4 combinations into size-4 transforms, and so on, thus making log 2 n passes over the whole array. In contrast, as we discuss in "Discussion" , FFTW employs an explicitly recursive strategy that encompasses both depth-first and breadth-first styles, favoring the former since it has sometheoretical and practical advantages as discussed in "FFTs and the Memory Hierarchy" .

Y [ 0 , ... , n - 1 ] recfft 2 ( n , X , ι ) : IF n=1 THEN Y [ 0 ] X [ 0 ] ELSE Y [ 0 , ... , n / 2 - 1 ] recfft2 ( n / 2 , X , 2 ι ) Y [ n / 2 , ... , n - 1 ] recfft2 ( n / 2 , X + ι , 2 ι ) FOR k 1 = 0 TO ( n / 2 ) - 1 DO t Y [ k 1 ] Y [ k 1 ] t + ω n k 1 Y [ k 1 + n / 2 ] Y [ k 1 + n / 2 ] t - ω n k 1 Y [ k 1 + n / 2 ] END FOR END IF
A depth-first recursive radix-2 DIT Cooley-Tukey FFT to compute a DFT of a power-of-two size n = 2 m . The input is an array X of length n with stride ι (i.e., the inputs are X [ ι ] for = 0 , ... , n - 1 ) and the output is an array Y of length n (with stride 1), containing the DFT of X [Equation 1]. X + ι denotes the array beginning with X [ ι ] . This algorithm operates out-of-place, produces in-order output, and does not require aseparate bit-reversal stage.

A second ordering distinction lies in how the digit-reversal is performed. The classic approach is a single, separate digit-reversalpass following or preceding the arithmetic computations; this approach is so common and so deeply embedded into FFT lore that manypractitioners find it difficult to imagine an FFT without an explicit bit-reversal stage. Although this pass requires only O ( n ) time [link] , it can still be non-negligible, especially if the data is out-of-cache; moreover, it neglects the possibility that datareordering during the transform may improve memory locality. Perhaps the oldest alternative is the Stockham auto-sort FFT [link] , [link] , which transforms back and forth between two arrays with each butterfly, transposing one digit eachtime, and was popular to improve contiguity of access for vector computers [link] . Alternatively, an explicitly recursive style, as in FFTW, performs the digit-reversal implicitlyat the “leaves” of its computation when operating out-of-place (see section "Discussion" ). A simple example of this style, which computes in-order output using an out-of-place radix-2FFT without explicit bit-reversal, is shown in the algorithm of [link] [corresponding to [link] (right)]. To operate in-placewith O ( 1 ) scratch storage, one can interleave small matrix transpositions with thebutterflies [link] , [link] , [link] , [link] , and a related strategy in FFTW [link] is briefly described by "Discussion" .

Get Jobilize Job Search Mobile App in your pocket Now!

Get it on Google Play Download on the App Store Now




Source:  OpenStax, Fast fourier transforms. OpenStax CNX. Nov 18, 2012 Download for free at http://cnx.org/content/col10550/1.22
Google Play and the Google Play logo are trademarks of Google Inc.

Notification Switch

Would you like to follow the 'Fast fourier transforms' conversation and receive update notifications?

Ask