Finally, we should mention that there are many FFTs entirely distinct from Cooley-Tukey. Three notable such algorithms are the prime-factor algorithm for $\gcd(n_1, n_2) = 1$ [link], along with Rader's [link] and Bluestein's [link], [link], [link] algorithms for prime $n$. FFTW implements the first two in its codelet generator for hard-coded $n$ (section "Generating Small FFT Kernels") and the latter two for general prime $n$ (sections "Plans for prime sizes" and "Goals and Background of the FFTW Project"). There is also the Winograd FFT [link], [link], [link], [link], which minimizes the number of multiplications at the expense of a large number of additions; this trade-off is not beneficial on current processors that have specialized hardware multipliers.
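To make the idea behind Bluestein's algorithm concrete, the following is a minimal sketch in Python (standard library only). It is not FFTW's implementation. It relies on the identity $jk = \left(j^2 + k^2 - (k-j)^2\right)/2$, which turns a DFT of arbitrary size $n$ into a convolution with a "chirp" sequence. That convolution is then computed with zero-padded power-of-two FFTs. The function names `fft_pow2`, `ifft_pow2`, `bluestein_dft`, and `naive_dft` are illustrative choices, not names taken from the text.

```python
import cmath


def fft_pow2(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_pow2(x[0::2])
    odd = fft_pow2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft_pow2(x):
    """Inverse FFT via conjugation: ifft(x) = conj(fft(conj(x))) / n."""
    n = len(x)
    y = fft_pow2([v.conjugate() for v in x])
    return [v.conjugate() / n for v in y]


def bluestein_dft(x):
    """DFT of arbitrary length n (prime sizes included) via a chirp convolution."""
    n = len(x)
    # Chirp w[k] = exp(-i*pi*k^2/n); k^2 is reduced mod 2n to limit rounding error.
    w = [cmath.exp(-1j * cmath.pi * ((k * k) % (2 * n)) / n) for k in range(n)]
    # Smallest power of two m >= 2n - 1, so the cyclic convolution does not wrap.
    m = 1
    while m < 2 * n - 1:
        m *= 2
    a = [x[k] * w[k] for k in range(n)] + [0j] * (m - n)
    b = [0j] * m
    b[0] = w[0].conjugate()
    for k in range(1, n):
        # Negative lags -k are stored at index m - k of the cyclic kernel.
        b[k] = b[m - k] = w[k].conjugate()
    fa, fb = fft_pow2(a), fft_pow2(b)
    conv = ifft_pow2([p * q for p, q in zip(fa, fb)])
    return [w[k] * conv[k] for k in range(n)]


def naive_dft(x):
    """O(n^2) reference DFT, used only to check the result."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]
```

For a prime length such as $n = 7$, `bluestein_dft(x)` should agree with `naive_dft(x)` up to floating-point rounding, while costing $O(n \log n)$ operations because all the heavy work happens in the power-of-two transforms.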
The FFTW project, begun in 1997 as a side project of the authors Frigo and Johnson while they were graduate students at MIT, has gone through several major revisions, and as of 2008 consists of more than 40,000 lines of code. It is difficult to measure the popularity of a free-software package, but (as of 2008) FFTW has been cited in over 500 academic papers, is used in hundreds of shipping free and proprietary software packages, and the authors have received over 10,000 emails from users of the software. Most of this chapter focuses on the performance of FFT implementations, but FFTW would probably not be where it is today if that were the only consideration in its design. One of the key factors in FFTW's success seems to have been its flexibility in addition to its performance. In fact, FFTW is probably the most flexible DFT library available:
Our design philosophy has been to first define the most general reasonable functionality, and then to obtain the highest possible performance without sacrificing this generality. In this section, we offer a few thoughts about why such flexibility has proved important, and how it came about that FFTW was designed in this way.
FFTW's generality is partly a consequence of the fact that the FFTW project was started in response to the needs of a real application for one of the authors (a spectral solver for Maxwell's equations [link]), which from the beginning had to run on heterogeneous hardware. Our initial application required multi-dimensional DFTs of three-component vector fields (magnetic fields in electromagnetism), and so right away this meant: (i) multi-dimensional FFTs; (ii) user-accessible loops of FFTs of discontiguous data; (iii) efficient support for non-power-of-two sizes (the factor of eight difference between $n\times n\times n$ and $2n\times 2n\times 2n$ was too much to tolerate); and (iv) saving a factor of two for the common real-input case was desirable. That is, the initial requirements already encompassed most of the features above, and nothing about this application is particularly unusual.
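The factor-of-two saving in requirement (iv) comes from the Hermitian symmetry of the DFT of real data: $X_{n-k} = \overline{X_k}$, so only outputs $0, \ldots, \lfloor n/2 \rfloor$ need to be computed and stored. The following is a minimal Python sketch of that symmetry using a direct $O(n^2)$ sum. It illustrates the redundancy only and is not a fast real-input algorithm; the helper names `dft_outputs`, `rdft_half`, and `expand_hermitian` are hypothetical.

```python
import cmath


def dft_outputs(x, ks):
    """Directly evaluate the DFT of x at the output indices in ks (O(n) each)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in ks
    ]


def rdft_half(x):
    """For real input x, compute only the n//2 + 1 non-redundant outputs."""
    n = len(x)
    return dft_outputs(x, range(n // 2 + 1))


def expand_hermitian(half, n):
    """Rebuild all n outputs from the non-redundant half via X[n-k] = conj(X[k])."""
    return [
        half[k] if k <= n // 2 else half[n - k].conjugate()
        for k in range(n)
    ]
```

For a real sequence of length 6, `rdft_half` returns 4 complex values, and `expand_hermitian` recovers the remaining 2 by conjugation. The same redundancy applies along one dimension of a multi-dimensional real-input transform, which is where the saving matters most for an application like the one described above.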