<< Chapter < Page Chapter >> Page >

Plans for higher vector ranks

These plans extract a vector loop to reduce a DFT problem to a problem of lower vector rank, which is then solved recursively. Anyof the vector loops of V could be extracted in this way, leading to a number of possible plans corresponding to different looporderings.

Formally, to solve dft ( N , V , I , O ) , where V = ( n , ι , o ) V 1 , FFTW generates a loop that, for all k such that 0 k < n , invokes a plan for dft ( N , V 1 , I + k · ι , O + k · o ) .

Indirect plans

Indirect plans transform a DFT problem that requires some data shuffling (or discontiguous operation) into a problem that requires noshuffling plus a rank-0 problem that performs the shuffling.

Formally, to solve dft ( N , V , I , O ) where N > 0 , FFTW generates a plan that first solves dft ( , N V , I , O ) , and then solves dft ( copy-o ( N ) , copy-o ( V ) , O , O ) . Here we define copy-o ( t ) to be the I/O tensor ( n , o , o ) ( n , ι , o ) t : that is, it replaces the input strides with the output strides. Thus, an indirect plan firstrearranges/copies the data to the output, then solves the problem in place.

Plans for prime sizes

As discussed in "Goals and Background of the FFTW Project" , it turns out to be surprisingly useful to be able to handle large prime n (or large prime factors). Rader plans implement the algorithm from [link] to compute one-dimensional DFTs of prime size in Θ ( n log n ) time. Bluestein plans implement Bluestein's “chirp-z” algorithm, which can also handle prime n in Θ ( n log n ) time [link] , [link] , [link] . Generic plans implement a naive Θ ( n 2 ) algorithm (useful for n 100 ).

Discussion

Although it may not be immediately apparent, the combination of the recursive rules in "The space of plans in FFTW" can produce a number of useful algorithms. To illustrate these compositions, we discuss three particular issues: depth- vs. breadth-first, loop reordering,and in-place transforms.

Two descriptions of size-30 DFT, one depth-first, and one breadth-first.
Two possible decompositions for a size-30 DFT, both for the arbitrary choice of DIT radices 3 then 2 then 5, and prime-sizecodelets. Items grouped by a " { " result from the plan for a single sub-problem. In the depth-first case, the vector rank wasreduced to zero as per "Plans for higher vector ranks" before decomposing sub-problems, and vice-versa in the breadth-first case.

As discussed previously in sections "Review of the Cooley-Tukey FFT" and "Understanding FFTs with an ideal cache" , the same Cooley-Tukey decomposition can be executed in either traditionalbreadth-first order or in recursive depth-first order, where the latter has some theoretical cache advantages. FFTW is explicitlyrecursive, and thus it can naturally employ a depth-first order. Because its sub-problems contain a vector loop that can be executed ina variety of orders, however, FFTW can also employ breadth-first traversal. In particular, a 1d algorithm resembling thetraditional breadth-first Cooley-Tukey would result from applying "Cooley-Tukey plans" to completely factorize the problem size before applying the loop rule "Plans for higher vector ranks" to reduce the vector ranks, whereas depth-first traversal would result fromapplying the loop rule before factorizing each subtransform. These two possibilities are illustrated by an example in [link] .

Get Jobilize Job Search Mobile App in your pocket Now!

Get it on Google Play Download on the App Store Now




Source:  OpenStax, Fast fourier transforms. OpenStax CNX. Nov 18, 2012 Download for free at http://cnx.org/content/col10550/1.22
Google Play and the Google Play logo are trademarks of Google Inc.

Notification Switch

Would you like to follow the 'Fast fourier transforms' conversation and receive update notifications?

Ask