
Now, suppose we combine length p and q modules with Good's prime factor algorithm (not using twiddles). The following scaling procedure will work:

  • Assume the input data has been appropriately loaded into a $p \times q$ data array.
  • Scale the non-DC outputs of the length p module and apply the modified module to all columns of the data array.
  • Now all the rows are scaled by $(p - 1)$ except the zeroth row, which corresponds to the DC outputs of the length p modules. Apply a normal length q module to the zeroth row. Modify the length q module to scale by $1/(p - 1)$ and apply the modified version to all the other rows. The DFT is now complete.
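The scaling procedure above can be checked numerically. The following is only a minimal sketch: it uses direct DFT sums in place of optimized modules, applies the scale factor as an explicit multiplication (in a real module the factor would be folded into the module's constants), and assumes one common form of Good's index maps. The names `dft`, `scaled_pfa_dft` and the scale parameter `c` are hypothetical and do not come from the report.

```python
import cmath

def dft(x):
    """Direct O(n^2) DFT, standing in for an optimized short-length module."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def scaled_pfa_dft(x, p, q, c):
    """Length p*q DFT by the prime factor algorithm (p, q coprime, no twiddles).

    The non-DC outputs of every length p transform are scaled by c (assumed
    nonzero), and the length q transforms of rows 1..p-1 undo it with 1/c.
    """
    N = p * q
    # Good's input map (assumed form): n = (q*n1 + p*n2) mod N
    a = [[x[(q * n1 + p * n2) % N] for n2 in range(q)] for n1 in range(p)]

    # Modified length p module applied to every column
    for n2 in range(q):
        col = dft([a[n1][n2] for n1 in range(p)])
        for k1 in range(p):
            a[k1][n2] = col[k1] * (c if k1 != 0 else 1.0)

    # Normal length q module on row 0, counter-scaled module on the other rows
    for k1 in range(p):
        counter = 1.0 if k1 == 0 else 1.0 / c
        a[k1] = [v * counter for v in dft(a[k1])]

    # Chinese remainder theorem output map: k = k1 (mod p), k = k2 (mod q)
    return [a[k % p][k % q] for k in range(N)]
```

For example, `scaled_pfa_dft(x, 3, 7, 2.0)` should agree with `dft(x)` for any length 21 input `x`, up to rounding error.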

As an example, consider the 3x7 DFT. In the length 3 module, scaling the non-DC outputs trades one multiply for one add. When the scaled DFT is constructed, the modified length 3 module is used 7 times, which saves 7 multiplies and costs 7 adds. But two rows must be counter-scaled by modified length 7 modules, which gives back two of those multiplies and brings the total multiply savings to 5 at a cost of 7 adds. This looks like a nice tradeoff. The total number of multiplies in a normal 3x7 PFA is 38.
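The tally can be written out explicitly. This is a sketch under two assumptions that are not stated in the text: that the normal length 3 and length 7 modules use 2 and 8 multiplies respectively (these values reproduce the quoted total of 38), and that each modified length 7 module needs exactly one extra multiply.

```latex
\begin{align*}
M_{\text{normal}} &= 7 \cdot 2 + 3 \cdot 8 = 38 \\
M_{\text{scaled}} &= 7 \cdot 1 + 1 \cdot 8 + 2 \cdot 9 = 33 \\
M_{\text{normal}} - M_{\text{scaled}} &= 5
\end{align*}
```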

These ideas can be expanded to multidimensional cases, although it quickly becomes difficult to keep track of which rows and columns need to be counter-scaled.

Figures: Length 5 DFT Algorithm R; Crossed Flow Graph; Equivalent Uncrossed Flow Graph; Length 5 DFT Algorithm A; Length 5 DFT Algorithm B.

Length 11 module: 168 adds / 40 mpys

  1. Use the index map $\bar{x}(n) = x(\langle 8^n \rangle \bmod 11)$ to convert the DFT into a length 10 convolution, plus a correction term for the DC components.
  2. Reduce the length 10 convolution modulo all the irreducible factors of $z^{10} - 1$:
    mod $z^5 - 1$: T1, T3, T2, T5, T4
    mod $z^5 + 1$: T6, -T8, -T7, -T10, T9
    from $z^5 - 1$ data:
      mod $z - 1$: T13
      mod $(z^5 - 1)/(z - 1)$: AM4, AM7, AM3, AM6 (after weighting)
    from $z^5 + 1$ data:
      mod $z + 1$: AM2 (after weighting)
      mod $(z^5 + 1)/(z + 1)$: S9, S11, S10, S12 (appears in)
  3. Patch up the DC terms by adding the $z - 1$ reduction result to X(I(1)) and store the result in AM0.
  4. The $z^5 - 1$ convolution proceeds in four steps. First, do the irreducible factor reductions, then reduce further with an iterated Toom-Cook procedure, weight all remaining variables, and apply the transpose of the complete reduction stage to the weighted results. The first Toom-Cook reduction uses the factors $z$, $1/z$ and $z + 1$ on the vectors AM4,AM3 and AM7,AM6, which generates the new vector AM4-AM7,AM3-AM6. Each of the original two vectors is then individually reduced using factors of $z$, $1/z$ and $z + 1$, while the new vector is reduced by $z$, $1/z$ and $z - 1$. This procedure generates nine variables: AM4,AM3,AM5; AM7,AM6,AM8; S7,S8,AM11. (The expressions for S6 and S8 contain the variables of interest).
  5. The nine variables from 4) are weighted along with T13.
  6. An exact transpose of the reduction algorithm is applied to the weighted variables (and AM0).
  7. The result S16,S15,S18,S17,S19 is the real part of the answer and is mapped back to the output using the map $\bar{x}(n) = x(\langle 8^{n+1} \rangle \bmod 11)$. This is an unusual map, but it is perfectly acceptable.
  8. As in the length 19 transform, the $z^5 + 1$ convolution is computed with a variation of the $z^5 - 1$ algorithm. First the inputs T6,-T8,-T7,-T10,T9 are alternately negated, then the $z^5 - 1$ algorithm is applied, and the outputs are alternately negated. The second stage of the Toom-Cook reductions uses the factors $z$, $1/z$ and $z + 1$ for all three length two vectors. Also, the DC patch is not used here.
  9. The result S21,S20,S23,S22,S24, representing the imaginary part of the answer, is mapped back to the output using the map $\bar{x}(n) = x(\langle 8^{n+1} \rangle \bmod 11)$.
  10. In both this algorithm and the length 13 DFT, plus and minus signs have been freely altered to force all constants to be positive. Also, many shortcut computations were used to save adds, obscuring in some places the logical flow of the algorithm.
  11. All coefficients were computed using the author's QR decomposition linear equation solver and are accurate to at least 14 places.
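Step 1 of the list above, the conversion of a prime length DFT into a cyclic convolution by an index map, can be illustrated in a few lines. This is a minimal sketch and not the report's optimized module. It assumes the textbook form of the conversion, in which outputs are read at indices $g^{-m} \bmod p$ rather than through the output map described in step 7, and it evaluates the length $p - 1$ convolution by a direct sum instead of polynomial reductions. The function names are hypothetical.

```python
import cmath

def dft(x):
    """Direct O(n^2) DFT used as a reference."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def prime_dft_by_convolution(x, p=11, g=8):
    """Length p DFT (p prime, g a primitive root mod p) via a cyclic convolution."""
    L = p - 1
    g_inv = pow(g, p - 2, p)                      # inverse of g mod p (Fermat)

    # Input index map: xb[n] = x[g^n mod p]
    xb = [x[pow(g, n, p)] for n in range(L)]
    # Permuted twiddle factors: h[m] = W^(g^(-m) mod p), W = exp(-2*pi*i/p)
    h = [cmath.exp(-2j * cmath.pi * pow(g_inv, m, p) / p) for m in range(L)]

    # Length p-1 cyclic convolution, computed directly for clarity
    y = [sum(xb[n] * h[(m - n) % L] for n in range(L)) for m in range(L)]

    X = [0j] * p
    X[0] = sum(x)                                 # DC output: plain sum of inputs
    for m in range(L):
        X[pow(g_inv, m, p)] = x[0] + y[m]         # correction term adds x[0]
    return X
```

With `p=11, g=8` this matches the index map used for the length 11 module, and with `p=13, g=2` the one used for the length 13 module. In both cases the result should agree with `dft(x)` up to rounding error.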

Length 13 module: 188 adds / 40 mpys

  1. Use the index map $\bar{x}(n) = x(\langle 2^n \rangle \bmod 13)$ to convert the DFT into a length 12 convolution, plus a correction term for the DC components.
  2. Reduce the length 12 convolution modulo all the irreducible factors of $z^{12} - 1$:
    mod $z^6 + 1$: A7, A8, A9, A10, A11, A12
    mod $z^6 - 1$: A1, A2, A3, A4, A5, A6
    from $z^6 - 1$ data:
      mod $z^2 - 1$: A14, A13
      mod $z^2 - z + 1$: A23, A22
      mod $z^2 + z + 1$: A25, A24
    from $z^2 - 1$ data:
      mod $z - 1$: A15
      mod $z + 1$: implicit (A13 - A14)
    from $z^6 + 1$ data:
      mod $z^2 + 1$: A17, A16
      mod $z^4 - z^2 + 1$: A27, A26, -A31, -A30
  3. Patch up the DC terms by adding the $z - 1$ reduction result to X(I(1)) and store the result in AM0.
  4. The $z^2 - z + 1$ and $z^2 + z + 1$ convolutions are reduced using Toom-Cook factors of $z$, $1/z$ and $z + 1$ in one case and $z$, $1/z$ and $z - 1$ in the other case, and then all the reduced quantities are weighted by constants, generating new variables:
    from $z^2 - z + 1$:
      $z$: AM7
      $1/z$: AM6
      $z - 1$: AM8
    from $z^2 + z + 1$:
      $z$: AM10
      $1/z$: AM9
      $z + 1$: AM11
  5. The original mod $z + 1$ reduction quantity is weighted and passed, along with AM0 and the above six variables, to a reconstruction procedure which first combines the $z - 1$ and $z^2 + z + 1$ data to compute the convolution mod $z^3 - 1$ (CC4,CC5,CC6), and then combines the $z + 1$ and $z^2 - z + 1$ data to compute the convolution mod $z^3 + 1$ (CC1,CC2,CC3). These two vectors are combined to compute the complete $z^6 - 1$ output, which appears in permuted form in CC15 through CC20.
  6. The $z^2 + 1$ vector is decomposed with Toom-Cook factors of $z$, $1/z$ and $z + 1$, yielding A17,A16 and the implicit term (A16+A17).
  7. The $z^4 - z^2 + 1$ vector is decomposed with a double iterated Toom-Cook scheme. First the vector is broken into two length two pieces: A27,A26 and A31,A30. Then the vectors are reduced by the factors of $z$, $1/z$ and $z + 1$ operating on whole vectors to produce a set of three length two vectors: A27,A26; A31,A30; and A29,A28 = (A27+A31),(A26+A30). These vectors are not calculated in a straightforward manner. Each length two vector is further reduced, in the second iteration, by the factors $z$, $1/z$ and $z + 1$ to create three new implicit variables (A27+A26), (A31+A30) and (A29+A28).
  8. The nine variables from [link] and the three variables from [link] are weighted by constants and the mod $z^6 + 1$ reconstruction proceeds in an ad-hoc fashion which closely resembles a transposed tensor method, but has some differences. The add count for the reconstruction would have been the same if the transposed tensor method had been applied. The $z^6 + 1$ result appears in permuted form in variables CC21 through CC26.
  9. The final result is reconstructed from the $z^6 - 1$ and $z^6 + 1$ vectors. The DC term, x(i(1)), is set equal to AM0.
  10. All coefficients were computed using the author's QR decomposition linear equation solver and are accurate to at least 14 places.
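The first level of the reduction in step 2, and the final recombination in step 9, rest on the factorization $z^{12} - 1 = (z^6 - 1)(z^6 + 1)$. The sketch below illustrates only that outermost split: it reduces both operands modulo $z^6 - 1$ and $z^6 + 1$, multiplies in each residue ring, and recombines. The further reductions, Toom-Cook steps and weighting constants of the actual module are not modeled, and all function names are hypothetical.

```python
def polymul_mod(a, b, sign):
    """Multiply equal-length polynomials a and b modulo z^n - sign.

    sign = +1 gives a cyclic convolution (z^n = 1),
    sign = -1 gives a negacyclic convolution (z^n = -1).
    """
    n = len(a)
    out = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                out[k] += a[i] * b[j]
            else:
                out[k - n] += sign * a[i] * b[j]   # wrap around using z^n = sign
    return out

def split(a):
    """Reduce a length 2h polynomial modulo z^h - 1 and modulo z^h + 1."""
    h = len(a) // 2
    minus = [a[i] + a[i + h] for i in range(h)]    # residue mod z^h - 1
    plus = [a[i] - a[i + h] for i in range(h)]     # residue mod z^h + 1
    return minus, plus

def cyclic_via_split(x, h):
    """Even-length cyclic convolution rebuilt from two half-length products."""
    x_m, x_p = split(x)
    h_m, h_p = split(h)
    y_m = polymul_mod(x_m, h_m, +1)                # product mod z^half - 1
    y_p = polymul_mod(x_p, h_p, -1)                # product mod z^half + 1
    half = len(y_m)
    low = [(y_m[i] + y_p[i]) / 2 for i in range(half)]
    high = [(y_m[i] - y_p[i]) / 2 for i in range(half)]
    return low + high
```

For two length 12 inputs, `cyclic_via_split(x, h)` should equal the direct cyclic convolution `polymul_mod(x, h, +1)`. The same split applies to the length 10 convolution of the length 11 module, with $z^5 - 1$ and $z^5 + 1$.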

Source:  OpenStax, Large dft modules: 11, 13, 16, 17, 19, and 25. revised ece technical report 8105. OpenStax CNX. Sep 14, 2009 Download for free at http://cnx.org/content/col10569/1.7