1. This time-domain signal shows clear segmentation between different numbers.
2. This shows a normalized spectrogram of the number “1”. Formants are visible but not clear.
3. The same spectrogram, enhanced by a non-linear filter that emphasizes the difference between peak values and background. Formants are much more distinct.
4. This shows the filtered spectrum on the Mel scale, a logarithmic frequency scale that models human hearing. A corner frequency of 700 Hz is used.
5. Weighted scatter plot with contours of the maximum likelihood GMM overlaid, showing the formants.
6. The GMM fitted to the Mel-scale signal differs greatly from the linear-frequency version.
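The Mel-scale mapping mentioned in caption 4 can be sketched as follows. This is a minimal illustration, not the project's own code: the captions state only the 700 Hz corner frequency, so the scaling constant 2595 (from the commonly used formula m = 2595 log10(1 + f/700)) and the function names `hz_to_mel` and `mel_to_hz` are assumptions.

```python
import math

CORNER_HZ = 700.0   # corner frequency stated in the caption
SCALE = 2595.0      # assumed constant from the common Mel formula


def hz_to_mel(freq_hz: float) -> float:
    """Map a linear frequency in Hz to the Mel scale."""
    return SCALE * math.log10(1.0 + freq_hz / CORNER_HZ)


def mel_to_hz(mel: float) -> float:
    """Inverse mapping: Mel value back to linear frequency in Hz."""
    return CORNER_HZ * (10.0 ** (mel / SCALE) - 1.0)


if __name__ == "__main__":
    # Below the corner frequency the mapping is close to linear;
    # above it, equal Mel steps cover increasingly wide Hz ranges.
    for f in (100, 700, 1000, 4000, 8000):
        print(f"{f:5d} Hz -> {hz_to_mel(f):8.1f} mel")
```

Because the mapping compresses high frequencies, the same spectral peaks land at different relative positions on the Mel axis, which is consistent with the GMM in caption 6 differing from the linear-frequency one.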
OpenStax, Elec 301 project: voice recognition. OpenStax CNX. Dec 19, 2011 Download for free at http://cnx.org/content/col11396/1.3