<< Chapter < Page
  Speech synthesis   Page 1 / 1
Chapter >> Page >
This is an overview of the technique used to recompse an audio signal from one half of the DFT as well as shifting the harmonics in the signal.

Reconstructing the first half of the dft

With the first harmonic in hand (if, of course, it exists) the program is ready to manipulate the signal chunk by building a new DFT from scratch but based upon the original. The pitch you hear is the position of the fundamental frequency – the first harmonic. So the new DFT must take the frequencies at and around the original first harmonic and copy them, without alteration, to a spot further down the spectrum. Further, in fact, by exactly the desired pitch shift. The frequency of the second harmonic needs to be twice as large as the first so that the new voice sounds like it came from a real person, so the second harmonic and its neighboring frequencies are shifted twice as far down the spectrum as the first group. This is repeated with every harmonic in a similar way until half of the new DFT is full.

Reconstructing the second half of the dft

To reduce computational complexity, there is no need to perform this same task starting at the end of the original DFT working our way towards the middle. We know that the resulting time-domain signal we produce must be comprised of real numbers since people are going to actually listen to it, so we can exploit the symmetry properties of the DFT. That is, the DFT of a real-valued signal follows rule that the real part of the samples in the first half is a mirror image of the real part of the samples in the second half. Similarly, the imaginary part of the samples in the second half is a flipped (negative) mirror image of the imaginary part of the samples in the second half. This line simple for loop constructs the entire second half of the new DFT without any further analysis or computation. Here is an example of the magnitude of the spectrum for a chunk of signal before and after pitch manipulation. Notice how the harmonics are not merely shifted over, but spread out as well.

Dft of signal sample

Discrete Fourier Transform showing the spectra for one 512 sample chunk of the speech signal before manipulation by the Pitch Synthesizer.
Discrete Fourier Transform of the original DFT spectra for one 512 sample chunk of the speech signal after manipulation by the Pitch Synthesizer.

Get Jobilize Job Search Mobile App in your pocket Now!

Get it on Google Play Download on the App Store Now




Source:  OpenStax, Speech synthesis. OpenStax CNX. Dec 18, 2004 Download for free at http://cnx.org/content/col10253/1.7
Google Play and the Google Play logo are trademarks of Google Inc.

Notification Switch

Would you like to follow the 'Speech synthesis' conversation and receive update notifications?

Ask