
Independent components analysis

Our next topic is Independent Components Analysis (ICA). Similar to PCA, this will find a new basis in which to represent our data. However, the goal is very different.

As a motivating example, consider the “cocktail party problem.” Here, n speakers are speaking simultaneously at a party, and any microphone placed in the room records only an overlapping combination of the n speakers' voices. But let's say we have n different microphones placed in the room, and because each microphone is a different distance from each of the speakers, it records a different combination of the speakers' voices. Using these microphone recordings, can we separate out the original n speakers' speech signals?

To formalize this problem, we imagine that there is some data s ∈ R^n that is generated via n independent sources. What we observe is

x = As,

where A is an unknown square matrix called the mixing matrix. Repeated observations give us a dataset {x^(i); i = 1, ..., m}, and our goal is to recover the sources s^(i) that generated our data (x^(i) = A s^(i)).

In our cocktail party problem, s^(i) is an n-dimensional vector, and s_j^(i) is the sound that speaker j was uttering at time i. Also, x^(i) is an n-dimensional vector, and x_j^(i) is the acoustic reading recorded by microphone j at time i.

Let W = A^{-1} be the unmixing matrix. Our goal is to find W, so that given our microphone recordings x^(i), we can recover the sources by computing s^(i) = W x^(i). For notational convenience, we also let w_i^T denote the i-th row of W, so that

W = ⎡ — w_1^T — ⎤
    ⎢     ⋮     ⎥
    ⎣ — w_n^T — ⎦ .

Thus, w_i ∈ R^n, and the j-th source can be recovered by computing s_j^(i) = w_j^T x^(i).
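In the idealized case where A happens to be known, this recovery step is just a matrix inverse. The following is a minimal NumPy sketch of the mixing model; the Laplace source distribution, the dimensions, and the random seed are illustrative assumptions, and real ICA must of course estimate W from the x^(i)'s alone:

```python
import numpy as np

rng = np.random.default_rng(0)

n, m = 3, 5                    # n sources/microphones, m time samples
S = rng.laplace(size=(n, m))   # rows are independent (non-Gaussian) sources s_j
A = rng.normal(size=(n, n))    # square mixing matrix (unknown in practice)

X = A @ S                      # each column is one observation x^(i) = A s^(i)

W = np.linalg.inv(A)           # unmixing matrix, known here only for illustration
S_rec = W @ X                  # recover s^(i) = W x^(i)

print(np.allclose(S_rec, S))   # exact recovery when A is known
```

With A known, recovery is exact up to floating-point error; the whole difficulty of ICA lies in finding W without access to A.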

ICA ambiguities

To what degree can W = A^{-1} be recovered? If we have no prior knowledge about the sources and the mixing matrix, it is not hard to see that there are some inherent ambiguities in A that are impossible to recover, given only the x^(i)'s.

Specifically, let P be any n-by-n permutation matrix. This means that each row and each column of P has exactly one “1.” Here are some examples of permutation matrices:

P = ⎡ 0 1 0 ⎤ ;   P = ⎡ 0 1 ⎤ ;   P = ⎡ 1 0 ⎤ .
    ⎢ 1 0 0 ⎥        ⎣ 1 0 ⎦        ⎣ 0 1 ⎦
    ⎣ 0 0 1 ⎦

If z is a vector, then Pz is another vector containing a permuted version of z's coordinates. Given only the x^(i)'s, there will be no way to distinguish between W and PW. Specifically, the permutation of the original sources is ambiguous, which should be no surprise. Fortunately, this does not matter for most applications.
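The permutation ambiguity is easy to check numerically. In this NumPy sketch (with an assumed known A purely for illustration), PW unmixes the data just as well as W, returning the same sources with their rows permuted:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 3, 4
S = rng.laplace(size=(n, m))   # independent sources
A = rng.normal(size=(n, n))    # mixing matrix
X = A @ S                      # observations

W = np.linalg.inv(A)
P = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 0, 1]])      # a permutation matrix

S_perm = (P @ W) @ X           # unmix with PW instead of W

# PW is just as valid an unmixing matrix: it yields the rows of S
# in permuted order, indistinguishable from W's output without
# prior knowledge of which source was which.
print(np.allclose(S_perm, P @ S))
```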

Further, there is no way to recover the correct scaling of the w_i's. For instance, if A were replaced with 2A, and every s^(i) were replaced with (0.5)s^(i), then our observed x^(i) = 2A · (0.5)s^(i) would still be the same. More broadly, if a single column of A were scaled by a factor of α, and the corresponding source were scaled by a factor of 1/α, then there is again no way, given only the x^(i)'s, to determine that this had happened. Thus, we cannot recover the “correct” scaling of the sources. However, for the applications that we are concerned with—including the cocktail party problem—this ambiguity also does not matter. Specifically, scaling a speaker's speech signal s_j^(i) by some positive factor α affects only the volume of that speaker's speech. Also, sign changes do not matter: s_j^(i) and -s_j^(i) sound identical when played on a speaker. Thus, if the w_i found by an algorithm is scaled by any non-zero real number, the corresponding recovered source s_i = w_i^T x will be scaled by the same factor; but this usually does not matter. (These comments also apply to ICA for the brain/MEG data that we talked about in class.)
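The scaling ambiguity can be verified the same way. In this sketch (illustrative dimensions and distributions, as before), scaling one column of A by α and the matching source row by 1/α leaves every observation x^(i) unchanged, so no algorithm seeing only the x^(i)'s can detect it:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 3, 4
S = rng.laplace(size=(n, m))   # independent sources
A = rng.normal(size=(n, n))    # mixing matrix

# Scale one column of A by alpha and the matching source by 1/alpha:
alpha = 2.0
A2 = A.copy(); A2[:, 0] *= alpha
S2 = S.copy(); S2[0, :] /= alpha

# The observations are identical, so the scaling is unrecoverable.
print(np.allclose(A @ S, A2 @ S2))
```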





Source:  OpenStax, Machine learning. OpenStax CNX. Oct 14, 2013 Download for free at http://cnx.org/content/col11500/1.4
