User Manual for Speech Analysis Project

III – User Manual

The software for this project was written in Matlab. The main file is voiceRecognition.m. The prototype of the voiceRecognition function is as follows:

trueMatch = voiceRecognition( username, pin, thresh, candidateName )

It is assumed that each user has a username. In the files packaged with this report, the two users present in the database are ‘Nicholas’ and ‘Andrew’. (Andrew is Nicholas’ roommate and was kind enough to allow himself to be recorded for the purposes of this project.) The username parameter of voiceRecognition is a 1D character array containing the characters of the username.

The pin parameter represents the user’s PIN. It is a four-element 1D array containing integer values from 0 to 9.

thresh is an optional parameter. There is a threshold associated with the final matching algorithm (discussed later). By default, this threshold is set to 0.48, but it can be changed by passing a double value between 0 and 1 into this parameter.

candidateName is an optional parameter. The candidate is the person whose recording will be compared to the username’s database. In practice, the candidate name and the username would always be the same. However, for testing purposes, it was convenient to be able to specify a different candidate – e.g. we would specify the username as ‘Nicholas’ and the candidate as ‘Andrew’ to verify that the software did indeed detect an impostor.
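The parameter rules above can be sketched as a small validation helper. This is only an illustration under stated assumptions: resolveVoiceArgs is a hypothetical name and is not part of the packaged code, the way voiceRecognition.m actually handles its optional arguments is not shown in this manual, and treating the threshold bounds 0 and 1 as inclusive is an assumption.

```matlab
function [thresh, candidateName] = resolveVoiceArgs(username, pin, varargin)
% resolveVoiceArgs  Hypothetical helper (not part of the packaged code).
% Checks the two required arguments and fills in defaults for the two
% optional ones, following the parameter descriptions in this manual.

    % username: a 1D character array
    assert(ischar(username) && isrow(username), ...
        'username must be a 1D character array');

    % pin: a four-element 1D array of integers, each from 0 to 9
    assert(isnumeric(pin) && isvector(pin) && numel(pin) == 4 && ...
        all(pin == round(pin)) && all(pin >= 0 & pin <= 9), ...
        'pin must contain four integers from 0 to 9');

    % Defaults stated in the manual: threshold 0.48, candidate = username
    thresh = 0.48;
    candidateName = username;

    % Optional third argument: a double between 0 and 1
    % (inclusive bounds are an assumption)
    if numel(varargin) >= 1 && ~isempty(varargin{1})
        thresh = varargin{1};
        assert(isa(thresh, 'double') && isscalar(thresh) && ...
            thresh >= 0 && thresh <= 1, ...
            'thresh must be a double between 0 and 1');
    end

    % Optional fourth argument: candidate name, also a 1D character array
    if numel(varargin) >= 2 && ~isempty(varargin{2})
        candidateName = varargin{2};
        assert(ischar(candidateName) && isrow(candidateName), ...
            'candidateName must be a 1D character array');
    end
end
```

With this sketch, resolveVoiceArgs('Nicholas', [0,1,2,3]) keeps the default threshold and uses ‘Nicholas’ as the candidate, while resolveVoiceArgs('Nicholas', [0,1,2,3], 0.48, 'Andrew') corresponds to the impostor test described above.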

(3.1) shows an example of a voice recognition function call.

voiceRecognition( 'Nicholas' , [0,1,2,3] ); (3.1)

Directory structure

The voiceRecognition software assumes a specific directory structure. There is a directory called ‘recordings’ in the same location as the Matlab working directory. Within the recordings directory are two directories: ‘current’ and ‘person’. The person directory contains the database of users who have previously entered their data into the system – i.e. it is the database of valid users. The current directory contains recordings of candidates.

Within the person directory, each user is granted his/her own directory. For example, there is a directory present named ‘Nicholas’. Within ‘Nicholas’ there should be seven wav recordings of each distinct digit in the PIN. For example, if the PIN were 1-8-1-9, then there should be seven recordings each of “one”, “eight”, and “nine”. The recordings must be labeled num 1.wav through num 7.wav.

Within the current directory, there again should be a directory for each candidate that would be presented to the software. In our case, we present no other candidates than those included in the user database. Within this directory, there must be a recording for each digit in the PIN. In the above example, the files one.wav, eight.wav, and nine.wav should be present.
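As a rough illustration of the candidate layout, the following sketch builds the list of wav files expected under the current directory for a given candidate and PIN. The function name candidateFiles is hypothetical, and the spelled-out file names for digits other than one, eight, and nine (for example zero.wav) are assumed by analogy with the example above.

```matlab
function files = candidateFiles(candidateName, pin)
% candidateFiles  Hypothetical helper (not part of the packaged code).
% Returns the wav file paths expected in recordings/current/<candidate>,
% one per distinct PIN digit, relative to the Matlab working directory.

    % Spelled-out digit names; only one, eight and nine are confirmed by
    % the manual, the remaining names are assumed to follow the same pattern.
    words = {'zero', 'one', 'two', 'three', 'four', ...
             'five', 'six', 'seven', 'eight', 'nine'};

    % A repeated digit needs only one recording, so keep each digit once,
    % in order of first appearance.
    digits = unique(pin, 'stable');

    files = cell(1, numel(digits));
    for k = 1:numel(digits)
        % Digit d maps to words{d + 1} because Matlab indexing starts at 1
        files{k} = fullfile('recordings', 'current', candidateName, ...
                            [words{digits(k) + 1} '.wav']);
    end
end
```

For the PIN 1-8-1-9 and the candidate ‘Nicholas’, this yields three paths ending in one.wav, eight.wav, and nine.wav. Each path could then be checked with exist(files{k}, 'file') == 2 before running the comparison.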

Assumptions

There can be significant variability in the frequencies a person produces when speaking a word. The software assumes that, when submitting a phrase for comparison, the person attempts to speak each word at the same pitch and speed as in the previously recorded samples.

We now go on to describe the algorithms implemented in the software.

Source:  OpenStax, Analysis of speech signal spectrums using the l2 norm. OpenStax CNX. Dec 12, 2009 Download for free at http://cnx.org/content/col11143/1.2