Okay. Any questions? Yeah?
Student: I think here you try to measure the likelihood of your choice of theta by the fraction of error, but I think what you measure depends on the family of theta too. For example, if you have a lot of parameters [inaudible] or fitting in?
Instructor (Andrew Ng): Yeah, yeah. I mean, you’re asking about overfitting, whether this is a good model. The things you’re mentioning are maybe deeper questions about learning algorithms that we’ll come back to later, so I don’t really want to get into that right now. Any more questions? Okay.
So this endows linear regression with a probabilistic interpretation. I’m actually going to use this probabilistic interpretation in order to derive our next learning algorithm, which will be our first classification algorithm. Okay? So you’ll recall that I said that regression problems are where the variable Y that you’re trying to predict is continuous valued. Now I’m actually gonna talk about our first classification problem, where the value Y you’re trying to predict will be discrete valued. It can take on only a small number of discrete values, and in this case I’ll talk about binary classification, where Y takes on only two values, right? So you come up with classification problems if you’re trying to do, say, a medical diagnosis and trying to decide, based on some features, whether the patient has a disease or does not have a disease. Or in the housing example, maybe you’re trying to decide: will this house sell in the next six months or not? And the answer is either yes or no. It’ll either be sold in the next six months or it won’t be. Other standard examples: if you want to build a spam filter, is this e-mail spam or not? It’s yes or no. Or, you know, some of my colleagues work on predicting whether a computer system will crash. So you have a learning algorithm to predict: will this computing cluster crash over the next 24 hours? And, again, it’s a yes-or-no answer.
So there’s X, there’s Y. And in a classification problem Y takes on two values, zero and one. That’s binary classification. So what can you do? Well, one thing you could do is take linear regression, as we’ve described it so far, and apply it to this problem, right? So, you know, given this data set you can fit a straight line to it. Maybe you get that straight line, right? But this data set I’ve drawn is an amazingly easy classification problem. It’s pretty obvious to all of us what the relationship between X and Y is: you just look at a value around here, and to the right Y is one, to the left Y is zero. So you apply linear regression to this data set and you get a reasonable fit, and you can then take your linear regression hypothesis, this straight line, and threshold it at 0.5. If you do that, you’ll certainly get the right answer. You predict that if X is to the right of, sort of, the mid-point here then Y is one, and if X is to the left of that mid-point then Y is zero.
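The threshold-at-0.5 idea described here can be sketched in a few lines of code. This is a minimal illustration, not from the lecture itself: the data set is a hypothetical 1-D example where points below x = 5 have label 0 and points above have label 1, and the straight line is fit by ordinary least squares.

```python
import numpy as np

# Hypothetical, easily separable 1-D data set: label 0 on the left, 1 on the right.
X = np.array([1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Fit the linear hypothesis h(x) = theta0 + theta1 * x by least squares.
A = np.column_stack([np.ones_like(X), X])   # design matrix with intercept column
theta, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict(x):
    """Threshold the linear hypothesis at 0.5 to get a 0/1 prediction."""
    h = theta[0] + theta[1] * x
    return 1 if h >= 0.5 else 0

print([predict(x) for x in X])   # recovers the labels on this easy data set
```

On a clean data set like this, the thresholded line classifies every point correctly, which is exactly the point being made; the lecture goes on to show why this approach breaks down in general.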