
MachineLearning-Lecture13

Instructor (Andrew Ng): Okay, good morning. For those of you actually online, sorry; starting a couple of minutes late. We're having trouble with the lights just now, so we're all sitting in the dark and they just came on. So welcome back, and what I want to do today is continue our discussion of the EM Algorithm. In particular, I want to take the EM formulation that we derived in the previous lecture and apply it to the mixture of Gaussians model, apply it to a different model, a mixture of naive Bayes model, and then the latter part of today's lecture will be on the factor analysis algorithm, which will also use EM. And as part of that, we'll actually take a brief digression to talk a little bit about sort of useful properties of Gaussian distributions.

So just to recap where we are. In the previous lecture, I started to talk about unsupervised learning, which is machine-learning problems where you're given an unlabeled training set comprising m examples here, right? And so the fact that there are no labels is what makes this unsupervised learning. So one problem that I talked about last time was: what if you're given a data set that looks like this, and you want to model the density p(x) from which you think the data had been drawn? And so with a data set like this, maybe you think it was generated by a mixture of two Gaussians, and so I started to talk about an algorithm for fitting a mixture of Gaussians model, all right? And so we said that we would model the density of x, p(x), as the sum over z of p(x | z) times p(z), where this latent random variable, meaning this hidden random variable z, indicates which of the two Gaussian distributions each of your data points came from. And so we have, you know, z was multinomial with parameter phi, and x conditioned on coming from the j-th Gaussian was given by a Gaussian with mean mu_j and covariance sigma_j, all right?
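Written out in one place, the mixture of Gaussians model just described is (with k = 2 components in the example above):

    p(x^{(i)}) = \sum_{z^{(i)}=1}^{k} p(x^{(i)} \mid z^{(i)}) \, p(z^{(i)}),
    \qquad
    z^{(i)} \sim \mathrm{Multinomial}(\phi),
    \qquad
    x^{(i)} \mid z^{(i)} = j \;\sim\; \mathcal{N}(\mu_j, \Sigma_j).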

So, like I said at the beginning of the previous lecture, I first talked about a very specific algorithm that I sort of pulled out of the air for fitting the parameters of this model, phi, mu and sigma, but then in the second half of the previous lecture I talked about what's called the EM Algorithm, in which our goal is maximum likelihood estimation of the parameters. So we want to maximize, in terms of theta, you know, the sort of usual log likelihood, parameterized by theta. And because we have a latent random variable z, this is really maximizing, in terms of theta, the sum over i of the log of the sum over z^(i) of p(x^(i), z^(i)), parameterized by theta. Okay? So using Jensen's inequality, last time we worked out the EM Algorithm, in which in the E step we would choose these probability distributions Q_i to be the posterior on z given x, parameterized by theta, and in the M step we would set theta to be the value that maximizes this. Okay? So these are the steps we worked out last time, and the cartoon that I drew was that you have this log likelihood function l(theta) that's often hard to maximize, and what the E step does is choose these probability distributions Q_i. And in the cartoon, I drew what that corresponded to was finding a lower bound for the log likelihood; the horizontal axis there is theta. And then in the M step you maximize that lower bound, right? So maybe you were here previously, and so you jump to the new point, the new maximum of this lower bound. Okay? And so this little curve here, right? This lower bound function here, that's really the right-hand side of that argmax. Okay? So this whole thing is in the argmax. If you view this thing as a function of theta, this function of theta is a lower bound for the log likelihood of theta, and so in the M step we maximize this lower bound, and that corresponds to jumping to this new maximum of the lower bound.
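Written out, the two steps recapped above are:

    E step:  Q_i(z^{(i)}) := p(z^{(i)} \mid x^{(i)}; \theta)

    M step:  \theta := \arg\max_{\theta} \sum_i \sum_{z^{(i)}} Q_i(z^{(i)})
             \log \frac{p(x^{(i)}, z^{(i)}; \theta)}{Q_i(z^{(i)})}

As a concrete illustration of these two steps for the mixture of Gaussians, here is a minimal numerical sketch. It is not from the lecture itself; it assumes numpy and scipy are available, and the function and variable names (em_mixture_of_gaussians, n_iters, etc.) are just illustrative.

    import numpy as np
    from scipy.stats import multivariate_normal

    def em_mixture_of_gaussians(X, k=2, n_iters=100):
        # X: (m, n) array of unlabeled examples; k: number of Gaussians.
        m, n = X.shape
        phi = np.full(k, 1.0 / k)                # mixing proportions p(z = j)
        mu = X[np.random.choice(m, k, replace=False)].astype(float)  # init means from data
        Sigma = np.array([np.cov(X, rowvar=False) + 1e-6 * np.eye(n) for _ in range(k)])

        for _ in range(n_iters):
            # E step: w[i, j] = Q_i(z^(i) = j) = p(z^(i) = j | x^(i); phi, mu, Sigma)
            w = np.zeros((m, k))
            for j in range(k):
                w[:, j] = phi[j] * multivariate_normal.pdf(X, mean=mu[j], cov=Sigma[j])
            w /= w.sum(axis=1, keepdims=True)

            # M step: re-estimate phi, mu, Sigma to maximize the lower bound.
            for j in range(k):
                wj = w[:, j]
                phi[j] = wj.mean()
                mu[j] = wj @ X / wj.sum()
                diff = X - mu[j]
                Sigma[j] = (wj[:, None] * diff).T @ diff / wj.sum()

        return phi, mu, Sigma

For the two-Gaussian data set in the example above, this would be called as em_mixture_of_gaussians(X, k=2); the E step computes the posterior "responsibilities" and the M step re-fits phi, mu and sigma using them as soft labels.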

Source:  OpenStax, Machine learning. OpenStax CNX. Oct 14, 2013 Download for free at http://cnx.org/content/col11500/1.4