
Thus, if we want to update some subset of the $\alpha_i$'s, we must update at least two of them simultaneously in order to keep satisfying the constraints. This motivates the SMO algorithm, which simply does the following:

Repeat till convergence {
  1. Select some pair $\alpha_i$ and $\alpha_j$ to update next (using a heuristic that tries to pick the two that will allow us to make the biggest progress towards the global maximum).
  2. Reoptimize $W(\alpha)$ with respect to $\alpha_i$ and $\alpha_j$, while holding all the other $\alpha_k$'s ($k \neq i, j$) fixed.
}

To test for convergence of this algorithm, we can check whether the KKT conditions ([link]) are satisfied to within some tol. Here, tol is the convergence tolerance parameter, and is typically set to around 0.01 to 0.001. (See the paper and pseudocode for details.)
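As a concrete illustration, here is a minimal sketch of such a convergence test, assuming the KKT dual-complementarity conditions referenced above take the usual form ($\alpha_i = 0 \Rightarrow y^{(i)} f(x^{(i)}) \geq 1$; $\alpha_i = C \Rightarrow y^{(i)} f(x^{(i)}) \leq 1$; $0 < \alpha_i < C \Rightarrow y^{(i)} f(x^{(i)}) = 1$). The function name and the array of decision values f_x are illustrative, not from Platt's pseudocode:

```python
import numpy as np

def kkt_violators(alpha, y, f_x, C, tol=1e-3):
    """Return indices i whose alpha_i violates the KKT conditions by more than tol.

    alpha : current dual variables, shape (m,)
    y     : labels in {-1, +1}, shape (m,)
    f_x   : current decision values f(x^{(i)}), shape (m,)
    An empty result means SMO has converged to within tol.
    """
    margins = y * f_x
    # alpha_i < C requires y_i f(x_i) >= 1; alpha_i > 0 requires y_i f(x_i) <= 1.
    violated = ((alpha < C) & (margins < 1 - tol)) | \
               ((alpha > 0) & (margins > 1 + tol))
    return np.flatnonzero(violated)
```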

The key reason that SMO is an efficient algorithm is that the update to $\alpha_i$, $\alpha_j$ can be computed very efficiently. Let's now briefly sketch the main ideas for deriving the efficient update.

Let's say we currently have some setting of the $\alpha_i$'s that satisfies the constraints in [link], and suppose we've decided to hold $\alpha_3, \ldots, \alpha_m$ fixed, and want to reoptimize $W(\alpha_1, \alpha_2, \ldots, \alpha_m)$ with respect to $\alpha_1$ and $\alpha_2$ (subject to the constraints). From the final equation in [link], we require that

$$\alpha_1 y^{(1)} + \alpha_2 y^{(2)} = -\sum_{i=3}^{m} \alpha_i y^{(i)}.$$

Since the right hand side is fixed (as we've fixed $\alpha_3, \ldots, \alpha_m$), we can just let it be denoted by some constant $\zeta$:

$$\alpha_1 y^{(1)} + \alpha_2 y^{(2)} = \zeta.$$

We can thus picture the constraints on $\alpha_1$ and $\alpha_2$ as follows:

[Figure: the box $[0, C] \times [0, C]$, with the line $\alpha_1 y^{(1)} + \alpha_2 y^{(2)} = \zeta$ drawn through it.]

From the constraints [link], we know that $\alpha_1$ and $\alpha_2$ must lie within the box $[0, C] \times [0, C]$ shown. Also plotted is the line $\alpha_1 y^{(1)} + \alpha_2 y^{(2)} = \zeta$, on which we know $\alpha_1$ and $\alpha_2$ must lie. Note also that, from these constraints, we know $L \leq \alpha_2 \leq H$; otherwise, $(\alpha_1, \alpha_2)$ can't simultaneously satisfy both the box and the straight line constraint. In this example, $L = 0$. But depending on what the line $\alpha_1 y^{(1)} + \alpha_2 y^{(2)} = \zeta$ looks like, this won't always be the case; more generally, there will be some lower bound $L$ and some upper bound $H$ on the permissible values for $\alpha_2$ that ensure that $\alpha_1, \alpha_2$ lie within the box $[0, C] \times [0, C]$.
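As a sketch of how $L$ and $H$ can be computed, here is the standard case analysis from Platt's paper, split on whether the two labels agree (which determines the slope of the line). The function name is ours:

```python
def compute_L_H(a1, a2, y1, y2, C):
    """Endpoints [L, H] of the feasible segment for alpha_2.

    The line alpha_1 y1 + alpha_2 y2 = zeta, intersected with the box
    [0, C] x [0, C], restricts alpha_2 to the interval [L, H].
    """
    if y1 != y2:
        # alpha_1 - alpha_2 is constant along the line.
        L = max(0.0, a2 - a1)
        H = min(C, C + a2 - a1)
    else:
        # alpha_1 + alpha_2 is constant along the line.
        L = max(0.0, a1 + a2 - C)
        H = min(C, a1 + a2)
    return L, H
```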

Using Equation [link], we can also write $\alpha_1$ as a function of $\alpha_2$:

$$\alpha_1 = (\zeta - \alpha_2 y^{(2)}) y^{(1)}.$$

(Check this derivation yourself: multiply both sides of Equation [link] by $y^{(1)}$; we again used the fact that $y^{(1)} \in \{-1, 1\}$, so that $(y^{(1)})^2 = 1$.) Hence, the objective $W(\alpha)$ can be written

$$W(\alpha_1, \alpha_2, \ldots, \alpha_m) = W((\zeta - \alpha_2 y^{(2)}) y^{(1)}, \alpha_2, \ldots, \alpha_m).$$

Treating $\alpha_3, \ldots, \alpha_m$ as constants, you should be able to verify that this is just some quadratic function in $\alpha_2$. I.e., this can also be expressed in the form $a\alpha_2^2 + b\alpha_2 + c$ for some appropriate $a$, $b$, and $c$. If we ignore the "box" constraints [link] (or, equivalently, that $L \leq \alpha_2 \leq H$), then we can easily maximize this quadratic function by setting its derivative to zero and solving. We'll let $\alpha_2^{\text{new,unclipped}}$ denote the resulting value of $\alpha_2$. You should also be able to convince yourself that if we had instead wanted to maximize $W$ with respect to $\alpha_2$ but subject to the box constraint, then we can find the resulting optimal value simply by taking $\alpha_2^{\text{new,unclipped}}$ and "clipping" it to lie in the $[L, H]$ interval, to get

$$\alpha_2^{\text{new}} = \begin{cases} H & \text{if } \alpha_2^{\text{new,unclipped}} > H \\ \alpha_2^{\text{new,unclipped}} & \text{if } L \leq \alpha_2^{\text{new,unclipped}} \leq H \\ L & \text{if } \alpha_2^{\text{new,unclipped}} < L \end{cases}$$
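For concreteness, here is a sketch of how $\alpha_2^{\text{new,unclipped}}$ is obtained. Setting the derivative of the quadratic to zero gives the closed form derived in Platt's paper, $\alpha_2 + y^{(2)}(E_1 - E_2)/\eta$, where $E_i = f(x^{(i)}) - y^{(i)}$ is the current prediction error on the $i$-th chosen example and $\eta = K_{11} + K_{22} - 2K_{12}$ is built from kernel values on the two chosen examples. The function name below is ours:

```python
def alpha2_unclipped(a2, y2, E1, E2, K11, K12, K22):
    """Unconstrained maximizer of the quadratic a*alpha_2^2 + b*alpha_2 + c."""
    eta = K11 + K22 - 2.0 * K12  # (negated) second derivative of W along the line
    if eta <= 0:
        # Degenerate case; Platt's paper handles it by evaluating the
        # objective at the endpoints L and H instead.
        raise ValueError("eta <= 0; see Platt's paper for this case")
    return a2 + y2 * (E1 - E2) / eta
```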

Finally, having found $\alpha_2^{\text{new}}$, we can use Equation [link] to go back and find the optimal value of $\alpha_1^{\text{new}}$.
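Putting the last two steps together in code (a sketch; the function name is ours): we clip $\alpha_2^{\text{new,unclipped}}$ to $[L, H]$, then recover $\alpha_1^{\text{new}}$ from the line constraint, since $\alpha_1 y^{(1)} + \alpha_2 y^{(2)}$ must keep equaling $\zeta$:

```python
import numpy as np

def finish_update(a1, a2, a2_unclipped, y1, y2, L, H):
    """Clip alpha_2 to [L, H], then recover alpha_1 from the line constraint."""
    a2_new = float(np.clip(a2_unclipped, L, H))
    # alpha_1_new = alpha_1 + y1*y2*(alpha_2 - alpha_2_new) keeps
    # alpha_1 y1 + alpha_2 y2 = zeta, as multiplying Equation [link]
    # through by y1 (so that (y1)^2 = 1) confirms.
    a1_new = a1 + y1 * y2 * (a2 - a2_new)
    return a1_new, a2_new
```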

There are a couple more details that are quite easy but that we'll leave you to read about yourself in Platt's paper: one is the choice of the heuristics used to select the next $\alpha_i$, $\alpha_j$ to update; the other is how to update $b$ as the SMO algorithm is run.

Source: OpenStax, Machine learning. OpenStax CNX. Oct 14, 2013. Download for free at http://cnx.org/content/col11500/1.4