# 0.1 Formants and phonetics

Taking the FFT (Fast Fourier Transform) of each voice sample yields its frequency spectrum, from which the main formants can be extracted. A formant is one of the four highest peaks in a spectrum sample. It is the location of these formants along the frequency axis that defines a vowel sound. The four main peaks lie between 300 and 4400 Hz, the band where the strongest formants of human speech occur. For the purposes of this project, the group will extract the frequency values of only the first two peaks, since they provide the most information about which vowel sound is being made. Since all vowels follow consistent and recognizable patterns in these two formants, the shifts that an accent introduces can be recorded with a high degree of accuracy. Figure 1 shows this pattern between the vowel sounds and formant frequencies.
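As a rough illustration of the extraction step described above, the following Python sketch computes a magnitude spectrum with NumPy's FFT and returns the two lowest-frequency peaks among the four strongest peaks in the 300 to 4400 Hz band. The function names, the Hann window, and the synthetic test tone are assumptions made for this example, not the project's actual code. Real speech would normally need some spectral smoothing before peak picking, because pitch harmonics tend to dominate the raw spectrum.

```python
import numpy as np


def magnitude_spectrum(samples, sample_rate):
    """Return (frequencies in Hz, magnitudes) for a real-valued signal."""
    window = np.hanning(len(samples))  # assumption: Hann window to reduce leakage
    magnitudes = np.abs(np.fft.rfft(samples * window))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    return freqs, magnitudes


def first_two_peaks(freqs, magnitudes, f_min=300.0, f_max=4400.0, n_peaks=4):
    """Find the n_peaks strongest local maxima in [f_min, f_max] and
    return the two lowest in frequency as (f1, f2)."""
    band = (freqs >= f_min) & (freqs <= f_max)
    f, m = freqs[band], magnitudes[band]

    # A local maximum is a bin larger than both of its neighbours.
    is_peak = (m[1:-1] > m[:-2]) & (m[1:-1] > m[2:])
    peak_idx = np.flatnonzero(is_peak) + 1
    if len(peak_idx) < 2:
        raise ValueError("fewer than two peaks found in the band")

    # Keep the strongest peaks, then order them by frequency.
    strongest = peak_idx[np.argsort(m[peak_idx])[::-1][:n_peaks]]
    f1, f2 = np.sort(f[strongest])[:2]
    return float(f1), float(f2)


# Synthetic stand-in for a voice sample: four sinusoids with made-up
# frequencies and amplitudes (not measured data).
sample_rate = 16000
t = np.arange(8000) / sample_rate
signal = (1.0 * np.sin(2 * np.pi * 700 * t)
          + 0.8 * np.sin(2 * np.pi * 1200 * t)
          + 0.5 * np.sin(2 * np.pi * 2600 * t)
          + 0.3 * np.sin(2 * np.pi * 3500 * t))

freqs, magnitudes = magnitude_spectrum(signal, sample_rate)
f1, f2 = first_two_peaks(freqs, magnitudes)
print(f"F1 ~ {f1:.0f} Hz, F2 ~ {f2:.0f} Hz")
```

Selecting the strongest peaks first and only then sorting by frequency mirrors the text: the four highest peaks are treated as formants, and the first two of those are kept.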

The first formant (F1) depends on whether a vowel sound is more open or closed, so on the chart, F1 varies along the y-axis. F1 increases in frequency as the vowel becomes more open and decreases to its minimum as the vowel sound closes. The second formant (F2), however, follows the x-axis: it varies depending on whether a sound is made at the front or the back of the vocal cavity. F2 increases in frequency the farther forward a vowel is and decreases to its minimum as a vowel moves to the back. Therefore, each vowel sound has characteristic values for its first two formants. In theory, this means that the same frequency values for the first two formants should hold across many speakers, as long as they are making the same vowel sound.
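Because each vowel occupies its own region of the F1/F2 plane, a measured formant pair can in principle be matched to the closest known vowel. The sketch below shows one simple way to do that with a nearest-neighbour lookup. The reference table and the `nearest_vowel` helper are hypothetical: the frequencies are illustrative placeholders, roughly in the range commonly reported for adult male speakers, and are not measurements from this project.

```python
import math

# Illustrative (F1, F2) pairs in Hz. These are placeholder values chosen
# to match the open/closed and front/back pattern, not project data.
REFERENCE_FORMANTS = {
    "i": (270.0, 2290.0),  # closed, front: low F1, high F2
    "a": (730.0, 1090.0),  # open: high F1
    "u": (300.0, 870.0),   # closed, back: low F1, low F2
}


def nearest_vowel(f1, f2, reference=REFERENCE_FORMANTS):
    """Return the vowel label whose (F1, F2) pair is closest in Hz."""
    return min(reference, key=lambda v: math.dist((f1, f2), reference[v]))


print(nearest_vowel(280.0, 2250.0))
print(nearest_vowel(700.0, 1100.0))
```

Plain Euclidean distance in Hz is used here only for simplicity. Since F2 spans a wider range than F1, a real comparison across speakers might scale or normalize the two axes first.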

## Sample spectrograms

