As was done for the other decimation-in-frequency algorithms, the input index map is used and the calculations are done in place, resulting in the output being in bit-reversed order. It is the three statements following label 30 that do the special indexing required by the SRFFT. The last stage is length-2 and, therefore, inappropriate for the standard L-shaped butterfly, so it is calculated separately in the DO 60 loop. This program is considered a one-butterfly version. A second butterfly can be added just before statement 40 to remove the unnecessary multiplications by unity. A third butterfly can be added to reduce the number of real multiplications from four to two for the complex multiplication when W has equal real and imaginary parts. It is also possible to reduce the arithmetic for the two-butterfly case and to reduce the data transfers by directly programming a length-4 and length-8 butterfly to replace the last three stages. This is called a two-butterfly-plus version. Operation counts for the one, two, two-plus and three butterfly SRFFT programs are given in the next section. Some details can be found in [link].
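The saving offered by the third butterfly can be illustrated with a minimal sketch. It is written in Python rather than in the language of the program discussed above, and the function names `mul_general` and `mul_w8` are hypothetical, chosen only for this illustration. It assumes the twiddle factor is W = exp(-jπ/4), whose real and imaginary parts have equal magnitude, so the product with a + jb needs only two real multiplications.

```python
import math

# cos(pi/4) = sin(pi/4); the only constant the special butterfly needs.
C = math.sqrt(0.5)

def mul_general(a, b, wr, wi):
    """(a + jb)(wr + j*wi) using four real multiplications and two additions."""
    return a * wr - b * wi, a * wi + b * wr

def mul_w8(a, b):
    """(a + jb) * W for W = exp(-j*pi/4) = C*(1 - j).

    Since (a + jb)(1 - j) = (a + b) + j(b - a), only two real
    multiplications and two additions are required.
    """
    return C * (a + b), C * (b - a)

if __name__ == "__main__":
    # Both routines should agree for this particular twiddle factor.
    print(mul_general(3.0, -2.0, C, -C))
    print(mul_w8(3.0, -2.0))
```

The same idea applies to the other twiddle factors whose real and imaginary parts have equal magnitude; only the signs in the two sums change.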
The special case of an SRFFT for real data and symmetric data is discussed in [link]. An application of the decimation-in-time SRFFT to real data is given in [link]. Application to convolution is made in [link], to the discrete Hartley transform in [link], [link], to calculating the discrete cosine transform in [link], and could be made to calculating number theoretic transforms.
Johnson and Frigo [link] have reported an improvement in operation count that involves a scaling of the multiplying factors. The improvement is small, but until this result it was generally thought that the split-radix FFT was optimal for total floating-point operation count.
The evaluation of any FFT algorithm starts with a count of the real (or floating point) arithmetic. [link] gives the number of real multiplications and additions required to calculate a length-N FFT of complex data. Results of programs with one, two, three and five butterflies are given to show the improvement that can be expected from removing unnecessary multiplications and additions. Results of radices two, four, eight and sixteen for the Cooley-Tukey FFT as well as of the split-radix FFT are given to show the relative merits of the various structures. Comparisons of these data should be made with the table of counts for the PFA and WFTA programs in The Prime Factor and Winograd Fourier Transform Algorithms: Evaluation of the PFA and WFTA. All programs use the four-multiply-two-add complex multiply algorithm. A similar table can be developed for the three-multiply-three-add algorithm, but the relative results are the same.
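The two complex multiply algorithms mentioned above can be compared in a short sketch. This is an illustration in Python with hypothetical function names (`mul4`, `precompute`, `mul3`), not code from the programs being evaluated. It uses one common form of the three-multiply algorithm and assumes the twiddle factor c + jd is known in advance, so that d - c and c + d can be stored in a table; only under that assumption does the count drop to three multiplications and three additions per product.

```python
import cmath

def mul4(a, b, c, d):
    """(a + jb)(c + jd) with four real multiplications and two additions."""
    return a * c - b * d, a * d + b * c

def precompute(c, d):
    """Table entry for a fixed twiddle factor c + jd (computed once, not per product)."""
    return c, d - c, c + d

def mul3(a, b, table):
    """(a + jb)(c + jd) with three real multiplications and three additions."""
    c, d_minus_c, c_plus_d = table
    k1 = c * (a + b)        # multiplication 1, addition 1
    k2 = a * d_minus_c      # multiplication 2
    k3 = b * c_plus_d       # multiplication 3
    return k1 - k3, k1 + k2  # additions 2 and 3

if __name__ == "__main__":
    w = cmath.exp(-2j * cmath.pi * 3 / 32)  # an arbitrary twiddle factor
    table = precompute(w.real, w.imag)
    print(mul4(1.5, -0.25, w.real, w.imag))
    print(mul3(1.5, -0.25, table))
```

The three-multiply form trades one multiplication for one addition and a table three entries wide instead of two, which is why the choice between the two depends on the relative cost of multiplication, addition, and memory access on a given machine.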
From the table it is seen that a greater improvement is obtained going from radix-2 to radix-4 than from radix-4 to radix-8 or radix-16. This is partly because length-2 and length-4 butterflies have no multiplications while length-8, length-16, and higher butterflies do. It is also seen that going from one to two butterflies gives more improvement than going from two to higher values. From an operation count point of view and from practical experience, a three-butterfly radix-4 or a two-butterfly radix-8 FFT is a good compromise. The radix-8 and radix-16 programs become long, especially with multiple butterflies, and they give a limited choice of transform length unless combined with some length-2 and length-4 butterflies.
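The restriction on transform length can be made concrete with a small sketch. It assumes a pure radix-r program, that is, one with no length-2 or length-4 stages mixed in, so that the length must be a power of r. The function name `pure_radix_lengths` and the upper limit of 4096 are choices made only for this illustration.

```python
def pure_radix_lengths(radix, max_len):
    """Lengths N = radix**m (m >= 1) not exceeding max_len."""
    lengths = []
    n = radix
    while n <= max_len:
        lengths.append(n)
        n *= radix
    return lengths

if __name__ == "__main__":
    for radix in (2, 4, 8, 16):
        print(radix, pure_radix_lengths(radix, 4096))
```

Up to length 4096, this gives twelve admissible lengths for radix-2, six for radix-4, four for radix-8, and three for radix-16.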