0.4 5 - voice recognition: graphs

1. This time-domain signal shows clear segmentation between different numbers.

2. This shows a normalized spectrogram of the number “1”. Formants are visible but not clear.

3. The same spectrogram after enhancement by a non-linear filter that emphasizes the difference between peak values and background. The formants are much more distinct.

4. This shows the filtered spectrum on the Mel scale, a logarithmic frequency scale that models human hearing. A corner frequency of 700 Hz was used.

5. Weighted scatter plot with contours of the maximum likelihood GMM overlaid, showing the formants.

6. The GMM generated by the Mel-scale signal differs greatly from the linear-frequency version.
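
The Mel-scale mapping mentioned in caption 4 can be sketched as follows. This is a minimal illustration, not the project's own code: the 700 Hz corner frequency comes from the caption, while the 2595 scaling constant and the function names hz_to_mel and mel_to_hz are assumptions based on the commonly used form of the Mel formula.

```python
import math

CORNER_HZ = 700.0   # corner frequency stated in caption 4
MEL_SCALE = 2595.0  # assumed: conventional constant that maps 1000 Hz to about 1000 mel


def hz_to_mel(f_hz, corner_hz=CORNER_HZ):
    """Map a linear frequency in Hz to the Mel scale (hypothetical helper)."""
    return MEL_SCALE * math.log10(1.0 + f_hz / corner_hz)


def mel_to_hz(mel, corner_hz=CORNER_HZ):
    """Inverse mapping, from Mel back to Hz (hypothetical helper)."""
    return corner_hz * (10.0 ** (mel / MEL_SCALE) - 1.0)


if __name__ == "__main__":
    # Equal steps in Hz shrink on the Mel axis as frequency rises.
    for f in (0, 500, 1000, 2000, 4000):
        print(f, "Hz ->", round(hz_to_mel(f), 1), "mel")
```

Below the corner frequency the mapping is close to linear, and above it the mapping becomes roughly logarithmic. Higher formants are therefore squeezed together on the Mel axis, which is one plausible reason the GMM in caption 6 differs so much from the linear-frequency version.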

OpenStax, Elec 301 project: voice recognition. OpenStax CNX. Dec 19, 2011 Download for free at http://cnx.org/content/col11396/1.3