
Factor analysis

When we have data $x^{(i)} \in \mathbb{R}^n$ that comes from a mixture of several Gaussians, the EM algorithm can be applied to fit a mixture model. In this setting, we usually imagine problems where we have sufficient data to be able to discern the multiple-Gaussian structure in the data. For instance, this would be the case if our training set size $m$ was significantly larger than the dimension $n$ of the data.

Now, consider a setting in which $n \gg m$. In such a problem, it might be difficult to model the data even with a single Gaussian, much less a mixture of Gaussians. Specifically, since the $m$ data points span only a low-dimensional subspace of $\mathbb{R}^n$, if we model the data as Gaussian, and estimate the mean and covariance using the usual maximum likelihood estimators,

$$\mu = \frac{1}{m}\sum_{i=1}^{m} x^{(i)}, \qquad \Sigma = \frac{1}{m}\sum_{i=1}^{m} \left(x^{(i)} - \mu\right)\left(x^{(i)} - \mu\right)^T,$$

we would find that the matrix $\Sigma$ is singular. This means that $\Sigma^{-1}$ does not exist, and $1/|\Sigma|^{1/2} = 1/0$. But both of these terms are needed in computing the usual density of a multivariate Gaussian distribution. Another way of stating this difficulty is that maximum likelihood estimates of the parameters result in a Gaussian that places all of its probability in the affine space spanned by the data (this is the set of points $x$ satisfying $x = \sum_{i=1}^m \alpha_i x^{(i)}$, for some $\alpha_i$'s such that $\sum_{i=1}^m \alpha_i = 1$), and this corresponds to a singular covariance matrix.
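To see this concretely, here is a minimal NumPy sketch (the synthetic data, dimensions, and variable names are our own illustration, not part of the original notes) that computes the maximum likelihood estimates above with $n > m$ and checks that the resulting $\Sigma$ is indeed singular:

```python
import numpy as np

# Hypothetical setup: m = 5 data points in n = 20 dimensions (n >> m).
rng = np.random.default_rng(0)
m, n = 5, 20
X = rng.normal(size=(m, n))           # rows are the x^{(i)}

# Usual maximum likelihood estimates.
mu = X.mean(axis=0)                   # mu = (1/m) sum_i x^{(i)}
Sigma = (X - mu).T @ (X - mu) / m     # Sigma = (1/m) sum_i (x - mu)(x - mu)^T

# The centered rows sum to zero, so rank(Sigma) <= m - 1 < n:
# Sigma is singular, its determinant is 0, and Sigma^{-1} does not exist.
print(np.linalg.matrix_rank(Sigma))   # at most m - 1 = 4
print(np.linalg.det(Sigma))           # 0 (up to floating-point error)
```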

More generally, unless $m$ exceeds $n$ by some reasonable amount, the maximum likelihood estimates of the mean and covariance may be quite poor. Nonetheless, we would still like to be able to fit a reasonable Gaussian model to the data, and perhaps capture some interesting covariance structure in the data. How can we do this?

In the next section, we begin by reviewing two possible restrictions on $\Sigma$, ones that allow us to fit $\Sigma$ with small amounts of data but neither of which will give a satisfactory solution to our problem. We next discuss some properties of Gaussians that will be needed later; specifically, how to find marginal and conditional distributions of Gaussians. Finally, we present the factor analysis model, and EM for it.

Restrictions of $\Sigma$

If we do not have sufficient data to fit a full covariance matrix, we may place some restrictions on the space of matrices $\Sigma$ that we will consider. For instance, we may choose to fit a covariance matrix $\Sigma$ that is diagonal. In this setting, the reader may easily verify that the maximum likelihood estimate of the covariance matrix is given by the diagonal matrix $\Sigma$ satisfying

$$\Sigma_{jj} = \frac{1}{m}\sum_{i=1}^{m} \left(x_j^{(i)} - \mu_j\right)^2.$$

Thus, $\Sigma_{jj}$ is just the empirical estimate of the variance of the $j$-th coordinate of the data.
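For illustration, a short sketch of this diagonal estimate (again on made-up data of our own choosing) might look like the following; note that only $n$ parameters are being fit, so the estimate is well defined even when $m < n$:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 20
X = rng.normal(size=(m, n))
mu = X.mean(axis=0)

# Diagonal restriction: Sigma_jj = (1/m) sum_i (x_j^{(i)} - mu_j)^2,
# i.e. the per-coordinate empirical variance, placed on the diagonal.
Sigma_diag = np.diag(((X - mu) ** 2).mean(axis=0))

# Invertible as long as no coordinate is constant across the data.
print(np.linalg.matrix_rank(Sigma_diag))   # n = 20
```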

Recall that the contours of a Gaussian density are ellipses. A diagonal Σ corresponds to a Gaussian where the major axes of these ellipses are axis-aligned.

Sometimes, we may place a further restriction on the covariance matrix: not only must it be diagonal, but its diagonal entries must all be equal. In this setting, we have $\Sigma = \sigma^2 I$, where $\sigma^2$ is the parameter under our control. The maximum likelihood estimate of $\sigma^2$ can be found to be:

$$\sigma^2 = \frac{1}{mn}\sum_{j=1}^{n}\sum_{i=1}^{m} \left(x_j^{(i)} - \mu_j\right)^2.$$

This model corresponds to using Gaussians whose densities have contours that are circles (in 2 dimensions; or spheres/hyperspheres in higher dimensions).
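The spherical estimate is a single pooled variance over all $mn$ coordinate values. A minimal sketch, under the same made-up setup as above:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 20
X = rng.normal(size=(m, n))
mu = X.mean(axis=0)

# Spherical restriction: sigma^2 = (1/(mn)) sum_j sum_i (x_j^{(i)} - mu_j)^2,
# the squared deviations averaged over both data points and coordinates.
sigma2 = ((X - mu) ** 2).mean()      # mean over all i and j at once
Sigma = sigma2 * np.eye(n)           # Sigma = sigma^2 * I
```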

Source:  OpenStax, Machine learning. OpenStax CNX. Oct 14, 2013 Download for free at http://cnx.org/content/col11500/1.4