Demonstrates the results of our vowel detection method

Our project produced largely successful results. We achieved flawless output for a variety of two-syllable words that, taken together, contained all of our database vowels. We were also successful with some three- and four-syllable words.

Result table
Input              Output
Biblioteca         CiCiCoCeCaC
Loteria            CoCeuCiCaC
Mexico             CeCiCoC
Santiago           CaCiCaCoC
Santa Fe           CaCa CeC
Cabo               CaCoC
Dime               CiCeC
Tito               CiCoC
Papi               CaCiC
Arturo             CaCuCoC
Alejandro          CaoCeCaCoC
Dame una camisa    CaCe CiuCa CaCiCaC
Me gusta Rich B    Ce CuCa CiCiC

C represents a string of one or more non-vowels, and a, e, i, o, and u are the vowels actually detected. Also, "Me gusta Rich B" had to be parsed together.
  • 'Biblioteca' and 'Santiago' demonstrate superfluous consonant placement between vowels.
  • 'Una' illustrates difficulty in vowel detection because the second formant in the vowel sound was not present.
  • 'Loteria' and 'Alejandro' demonstrate the errors caused by 'R' and 'L' respectively.
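
The output notation described above can be illustrated with a minimal sketch. It assumes the detector produces one label per analysis frame (a vowel letter, or anything else for a non-vowel); the function name `collapse_labels` and the per-frame representation are hypothetical and are not taken from the project's code.

```python
from itertools import groupby

VOWELS = set("aeiou")

def collapse_labels(frame_labels):
    """Collapse per-frame labels into the notation used in the result table.

    Hypothetical helper: each run of identical labels becomes one character,
    so any number of consecutive non-vowel frames is written as a single 'C'.
    """
    # Map anything that is not one of the five database vowels to 'C'.
    normalized = [lab if lab in VOWELS else "C" for lab in frame_labels]
    # groupby yields one key per run of equal consecutive labels.
    return "".join(key for key, _ in groupby(normalized))

# Made-up frame labels shaped like the 'Cabo' row of the table.
frames = ["C", "C", "a", "a", "a", "C", "o", "o", "C", "C", "C"]
print(collapse_labels(frames))  # prints CaCoC
```

Because runs are merged, this representation cannot distinguish one long vowel from the same vowel repeated with no gap between; that limitation belongs to the sketch, not necessarily to the original program.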

Problems

  • A relatively minor problem was the placement of a consonant at the beginning and end of every word, regardless of whether the word actually begins or ends with a consonant or a vowel. A good example is the word Arturo, which begins and ends with a vowel sound, yet our program returns a consonant at both the beginning and the end. This is caused by the dead space inherent at the start and end of each file, due to the delay between the start of recording and the start of the speech sample (and similarly at the end). The simplest fix would have been to manually crop the files so that no dead space remained.
  • Occasionally our vocal tract model did not sufficiently emphasize the second formant in 'I': it was not far enough in frequency from the third formant for a separate peak to appear at the frequency we associated with the second. As a result, the third formant was sometimes detected as the second. We never resolved this problem, and it caused confusion between I's and U's in our filter. A possible correction would be to apply a differentiator to adjacent frequency values of our frequency response. Where the difference levels off or goes negative at a sufficiently high magnitude, we could mark that point as a formant peak. In the image below, one can see that there is likely a peak around 1950 Hz, but because there is no distinct peak, our detection program passed over it.
    Example of loss of 2nd formant in Vowel 'I'
  • L's and R's were occasionally detected as vowels. This is because their pronunciation differs little from that of vowels: they rely primarily on resonant frequencies of the vocal tract rather than on restriction of airflow, as other consonants do. Below are some frequency responses of the vocal tract as L's and R's were being pronounced. As you can see, they are highly similar to the frequency response of the vocal tract when vowels are being produced. Without drastically changing the focus of our project, the only way to address this would be to use more intricate threshold values.
    Consonant 'L' Vocal Model
    Consonant 'R' Vocal Model
  • Often, in a direct transition from vowel to vowel with no consonant between, a consonant was returned between the two vowels. This can be seen below in the three images showing the transition from the second 'I' in Biblioteca to the 'O'. The first image is the 'I', the third is the 'O', and the second is the transition between them. The transitional frequency response is not sufficiently similar to either the 'I' or the 'O', so it gets classified as a consonant. Currently, anything that does not match one of our five vowels is classified as a consonant. One possible way around this would be to add a transitional character to our database, in this case an 'IO' database character. Alternatively, we could implement direct consonant recognition (as a broad class, not specific consonants) and then classify sounds that match neither the vowels in our database nor the consonant class as unknowns, rather than pooling them with consonants.
    Second 'I' in Biblioteca
    Transition between vowels
    'O' in Biblioteca
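
The manual cropping suggested in the first problem could also be automated. The following is a rough sketch under stated assumptions: the samples are a plain list of floats, dead space is defined as frames whose mean-square energy falls below a threshold, and the name `trim_dead_space` together with the default `frame_len` and `threshold` values are hypothetical choices, not values from the project.

```python
def trim_dead_space(samples, frame_len=4, threshold=0.01):
    """Return samples with leading and trailing low-energy frames removed.

    frame_len and threshold are illustrative assumptions; real recordings
    would need much longer frames and a threshold tuned to the noise floor.
    """
    # Mean-square energy of each complete, non-overlapping frame.
    n_frames = len(samples) // frame_len
    energies = [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]
    # Indices of frames loud enough to count as speech.
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return []  # nothing above the threshold: the whole file is dead space
    start = active[0] * frame_len
    end = (active[-1] + 1) * frame_len
    return samples[start:end]

# Made-up signal: silence, a short burst, then silence again.
signal = [0.0] * 8 + [0.5, -0.5, 0.4, -0.4] + [0.0] * 8
print(trim_dead_space(signal))  # prints [0.5, -0.5, 0.4, -0.4]
```

Any samples after the last complete frame are ignored by this sketch, so a short tail of the file is always treated as dead space.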
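
The differentiator idea proposed for the missing second formant might look roughly like the sketch below. It assumes the frequency response is available as a list of magnitudes, one per frequency bin; the function name `find_peaks_and_shoulders` and the parameters `min_mag` and `flat_tol` are hypothetical, and the rule shown is only one possible reading of "levels off or goes negative".

```python
def find_peaks_and_shoulders(mag, min_mag, flat_tol):
    """Return bin indices where a rising response peaks or levels off.

    mag      -- magnitude of the frequency response, one value per bin
    min_mag  -- ignore bins below this magnitude (assumed threshold)
    flat_tol -- slopes at or below this count as "levelled off" (assumed)
    """
    # First difference between adjacent bins: a discrete differentiator.
    diff = [mag[k + 1] - mag[k] for k in range(len(mag) - 1)]
    hits = []
    for k in range(1, len(diff)):
        if mag[k] < min_mag:
            continue
        rising_before = diff[k - 1] > flat_tol  # response was climbing into bin k
        levels_off = diff[k] <= flat_tol        # slope after bin k is flat or negative
        if rising_before and levels_off:
            hits.append(k)
    return hits

# Made-up response: a shoulder at bin 2 on the way up to a true peak at bin 6.
response = [1, 3, 5, 5.1, 5.2, 7, 9, 7, 5]
print(find_peaks_and_shoulders(response, min_mag=4, flat_tol=0.5))  # prints [2, 6]
```

A plain local-maximum search would report only bin 6 here; the shoulder at bin 2 is the kind of unexpressed peak described for the vowel 'I'. The sketch only catches shoulders on a rising slope, and converting bin indices to Hz depends on the analysis settings, which are not given in the text.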

Source:  OpenStax, Ece 301 projects fall 2003. OpenStax CNX. Jan 22, 2004 Download for free at http://cnx.org/content/col10223/1.5