$$\max_{\hat{\gamma},\,w,\,b}\ \frac{\hat{\gamma}}{\|w\|} \quad \text{s.t.}\quad y^{(i)}(w^T x^{(i)} + b) \ge \hat{\gamma},\quad i = 1,\ldots,m$$

Here, we're going to maximize $\hat{\gamma}/\|w\|$, subject to the functional margins all being at least $\hat{\gamma}$. Since the geometric and functional margins are related by $\gamma = \hat{\gamma}/\|w\|$, this will give us the answer we want. Moreover, we've gotten rid of the constraint $\|w\| = 1$ that we didn't like. The downside is that we now have a nasty (again, non-convex) objective function $\hat{\gamma}/\|w\|$; and we still don't have any off-the-shelf software that can solve this form of an optimization problem.

Let's keep going. Recall our earlier discussion that we can add an arbitrary scaling constraint on w and b without changing anything. This is the key idea we'll use now. We will introduce the scaling constraint that the functional margin of w , b with respect to the training set must be 1:

$$\hat{\gamma} = 1.$$

Since multiplying $w$ and $b$ by some constant results in the functional margin being multiplied by that same constant, this is indeed a scaling constraint, and can be satisfied by rescaling $w, b$. Plugging this into our problem above, and noting that maximizing $\hat{\gamma}/\|w\| = 1/\|w\|$ is the same thing as minimizing $\|w\|^2$, we now have the following optimization problem:

$$\min_{\gamma,\,w,\,b}\ \frac{1}{2}\|w\|^2 \quad \text{s.t.}\quad y^{(i)}(w^T x^{(i)} + b) \ge 1,\quad i = 1,\ldots,m$$
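To make the rescaling step concrete (a short check added here, not in the original text): if $(w, b)$ has functional margin $\hat{\gamma} = \min_i y^{(i)}(w^T x^{(i)} + b) > 0$, then dividing both parameters by $\hat{\gamma}$ gives

$$\min_i\, y^{(i)}\!\left(\left(\tfrac{w}{\hat{\gamma}}\right)^T x^{(i)} + \tfrac{b}{\hat{\gamma}}\right) = \frac{\hat{\gamma}}{\hat{\gamma}} = 1,$$

while the geometric margin $\hat{\gamma}/\|w\| = 1/\|w/\hat{\gamma}\|$ is unchanged, so imposing $\hat{\gamma} = 1$ loses no generality.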

We've now transformed the problem into a form that can be efficiently solved. The above is an optimization problem with a convex quadratic objective and only linear constraints. Its solution gives us the optimal margin classifier. This optimization problem can be solved using commercial quadratic programming (QP) code. (You may be familiar with linear programming, which solves optimization problems with linear objectives and linear constraints; QP software, which allows convex quadratic objectives with linear constraints, is also widely available.)
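As a concrete illustration (not part of the original notes), here is a minimal sketch of handing this QP to an off-the-shelf solver. It assumes the open-source cvxopt package and a small made-up, linearly separable 2-D dataset; the optimization variable is $z = [w; b]$, and the quadratic term penalizes only $w$:

```python
import numpy as np
from cvxopt import matrix, solvers

# Hypothetical toy data: 3 positive and 3 negative points in 2-D, linearly separable.
X = np.array([[2.0, 2.0], [2.5, 1.0], [3.0, 3.0],
              [0.0, 0.0], [1.0, 0.5], [0.5, 1.5]])
y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
m, d = X.shape

# Decision variable z = [w_1, ..., w_d, b].
# Objective (1/2) z^T P z with P = diag(1, ..., 1, 0): this is (1/2)||w||^2, ignoring b.
P = matrix(np.diag([1.0] * d + [0.0]))
q = matrix(np.zeros(d + 1))

# Constraints y^(i) (w^T x^(i) + b) >= 1, rewritten in the solver's form G z <= h:
#   -y^(i) [x^(i); 1]^T z <= -1   for each i.
G = matrix(-y[:, None] * np.hstack([X, np.ones((m, 1))]))
h = matrix(-np.ones(m))

solvers.options["show_progress"] = False
sol = solvers.qp(P, q, G, h)
z = np.array(sol["x"]).ravel()
w, b = z[:d], z[d]
print("w =", w, "b =", b)
print("geometric margin =", 1.0 / np.linalg.norm(w))
```

Generic QP code like this does solve the problem, but as the text notes next, the dual formulation will lead to a more specialized and typically much faster algorithm.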

While we could call the problem solved here, what we will instead do is make a digression to talk about Lagrange duality. This will lead us to our optimization problem's dual form, which will play a key role in allowing us to use kernels to get optimal margin classifiers to work efficiently in very high dimensional spaces. The dual form will also allow us to derive an efficient algorithm for solving the above optimization problem that will typically do much better than generic QP software.

Lagrange duality

Let's temporarily put aside SVMs and maximum margin classifiers, and talk about solving constrained optimization problems.

Consider a problem of the following form:

$$\min_{w}\ f(w) \quad \text{s.t.}\quad h_i(w) = 0,\quad i = 1,\ldots,l.$$

Some of you may recall how the method of Lagrange multipliers can be used to solve it. (Don't worry if you haven't seen it before.) In this method, we define the Lagrangian to be

$$\mathcal{L}(w, \beta) = f(w) + \sum_{i=1}^{l} \beta_i h_i(w)$$

Here, the $\beta_i$'s are called the Lagrange multipliers. We would then find and set $\mathcal{L}$'s partial derivatives to zero:

$$\frac{\partial \mathcal{L}}{\partial w_i} = 0; \qquad \frac{\partial \mathcal{L}}{\partial \beta_i} = 0,$$

and solve for w and β .
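As a tiny worked example (added here for illustration, not part of the original notes), consider minimizing $f(w) = w_1^2 + w_2^2$ subject to the single constraint $h_1(w) = w_1 + w_2 - 1 = 0$:

$$\mathcal{L}(w, \beta) = w_1^2 + w_2^2 + \beta_1 (w_1 + w_2 - 1)$$

$$\frac{\partial \mathcal{L}}{\partial w_1} = 2w_1 + \beta_1 = 0, \qquad \frac{\partial \mathcal{L}}{\partial w_2} = 2w_2 + \beta_1 = 0, \qquad \frac{\partial \mathcal{L}}{\partial \beta_1} = w_1 + w_2 - 1 = 0,$$

which gives $w_1 = w_2 = 1/2$ and $\beta_1 = -1$: the closest point to the origin on the line $w_1 + w_2 = 1$.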

In this section, we will generalize this to constrained optimization problems in which we may have inequality as well as equality constraints. Due to time constraints, we won't really be able to do the theory of Lagrange duality justice in this class. (Readers interested in learning more about this topic are encouraged to read, e.g., R. T. Rockafellar (1970), Convex Analysis, Princeton University Press.) But we will give the main ideas and results, which we will then apply to our optimal margin classifier's optimization problem.





Source:  OpenStax, Machine learning. OpenStax CNX. Oct 14, 2013 Download for free at http://cnx.org/content/col11500/1.4