The use of FFT algorithms such as the radix-2 decimation-in-time or decimation-in-frequency methods results in tremendous savings in computations when computing the discrete Fourier transform. While most of the speed-up of FFTs comes from this, careful implementation can provide additional savings ranging from a few percent to several-fold increases in program speed.
The twiddle factor terms, $W_N^k = e^{-i\frac{2\pi k}{N}}$, that multiply the intermediate data in the FFT algorithms consist of cosines and sines that each take the equivalent of several multiplies to compute. However, at most $N$ unique twiddle factors can appear in any FFT or DFT algorithm. (For example, in the radix-2 decimation-in-time FFT, only the $\frac{N}{2}$ twiddle factors $W_N^k$, $k \in \{0, 1, 2, \dots, \frac{N}{2}-1\}$, are used.) These twiddle factors can be precomputed once, stored in an array in computer memory, and accessed in the FFT algorithm by table lookup. This simple technique yields very substantial savings and is almost always used in practice.
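The table-lookup idea can be illustrated with a minimal Python sketch. The function names `make_twiddles` and `fft_dit` are hypothetical, chosen here for illustration; the code assumes the input length is a power of two and uses a plain iterative radix-2 decimation-in-time structure. It is meant to show where the lookup replaces the sine and cosine evaluations, not to be a tuned implementation.

```python
import cmath


def make_twiddles(n):
    """Precompute W_N^k = exp(-i*2*pi*k/N) for k = 0 .. N/2 - 1."""
    return [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]


def fft_dit(x, twiddles=None):
    """Iterative radix-2 decimation-in-time FFT using a twiddle table.

    Assumes len(x) is a power of two. Pass a precomputed table to
    reuse it across many transforms of the same length.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if twiddles is None:
        twiddles = make_twiddles(n)

    a = list(x)

    # Reorder the input into bit-reversed index order.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]

    # Butterfly stages: sub-transform sizes 2, 4, ..., N.
    size = 2
    while size <= n:
        half = size // 2
        step = n // size  # W_size^k equals W_N^(k * step)
        for start in range(0, n, size):
            for k in range(half):
                w = twiddles[k * step]  # table lookup, no sin/cos here
                u = a[start + k]
                t = w * a[start + k + half]
                a[start + k] = u + t
                a[start + k + half] = u - t
        size *= 2
    return a
```

When many transforms of the same length are needed, the table would be built once with `make_twiddles(n)` and passed to every call, so the cost of the sines and cosines is paid only once. Every stage indexes the same $\frac{N}{2}$-entry table with a different stride, which is why no more than $\frac{N}{2}$ entries are required.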
On most computers, only a fraction of the total computation time of an FFT is spent performing the FFT butterfly computations; determining indices, loading and storing data, computing loop parameters, and other operations consume the majority of cycles. Careful programming that allows the compiler to generate efficient code can make a several-fold improvement in the run-time of an FFT. The best choice of radix in terms of program speed may depend more on characteristics of the hardware (such as the number of CPU registers) or compiler than on the exact number of computations. Very often the manufacturer's library codes are carefully crafted by experts who know intimately both the hardware and compiler architecture and how to get the most performance out of them, so use of well-written FFT libraries is generally recommended. Certain freely available programs and libraries are also very good. Perhaps the best current general-purpose library is the FFTW package; information can be found at (External Link). A paper by Frigo and Johnson describes many of the key issues in developing compiler-friendly code.
While compilers continue to improve, FFT programs written directly in the assembly language of a specific machine are often several times faster than the best compiled code. This is particularly true for DSP microprocessors, which have special instructions for accelerating FFTs that compilers don't use. (I have myself seen differences of up to 26 to 1 in favor of assembly!) Very often, FFTs in the manufacturer's or high-performance third-party libraries are hand-coded in assembly. For DSP microprocessors, the codes developed by Meyer, Schuessler, and Schwarz are perhaps the best ever developed; while the particular processors are now obsolete, the techniques remain equally relevant today. Most DSP processors provide special instructions and a hardware design favoring the radix-2 decimation-in-time algorithm, which is thus generally fastest on these machines.