<< Chapter < Page Chapter >> Page >
A brief discussion of PCA.

Principal component analysis

PCA is essentially just SVD. The only difference is that we usually center the data first using some grand mean before doing SVD. There are three perspectives of views for PCA. Each of them gives different insight on what PCA does.

Low-rank approximation

min Z 1 2 | | X Z | | F 2 s u b j e c t t o r a n k ( Z ) K

where Frobenius norm is a matrix version of sums of squared. This gives the interpretation of dimension reduction. Solution to the problem is: Z = i = 1 K U k d k V K T

We do lose some information when doing dimension reduction, but the majority of variance is explained in the lower-rank matrix (The eigenvalues give us information about how significant the eigenvector is. So we put the eigenvalues in the order of the magnitude of the eigenvectors, and discard the smallest several since the contribution of components along that particular eigenvector is less significant comparing that with a large eigenvalue). PCA guarantees the best rank-K approximation to X. The tuning parameter K can be either chosen by cross-validation or AIC/BIC. This property is useful for data visualization when the data is high dimensional.

Matrix factorization

minimize U , D , V { 1 2 X - U D V T F 2 } s u b j e c t t o U T U = I , V T V = I , D d i a g +

This gives the interpretation of pattern recognition. The first column of U gives the first major pattern in sample (row) space while the first column of V gives the first major pattern in feature space. This property is also useful in recommender systems (a lot of the popular algorithms in collaborative filtering like SVD++, bias-SVD etc. are based upon this “projection-to-find-major-pattern” idea).


max V K T X T X V K s u b j e c t t o V K T V K = 1 , V K T V j = 0

X T X here behaves like covariates for multivariate Gaussian. This is essentially an eigenvalue problem of covariance: X T X   =   V D 2 V T and X X T   =   U D 2 U T . Interpretation here is that we are maximizing the covariates in column and row space.

_PCA (Figure Credit: https://onlinecourses.science.psu.edu/stat857/node/35)

The intuition behind pca

The intuition behind PCA is as follows: The First PC (Principal Component) finds the linear combinations of variables that correspond to the direction with maximal sample variance (the major pattern of the dataset, the most spread out direction). Succeeding PCs then goes on to find direction that gives highest variance under the constraint of it being orthogonal (uncorrelated) to preceding ones. Geometrically, what we are doing is basically a coordinate transformation – the newly formed axes correspond to the newly constructed linear combination of variables. The number of the newly formed coordinate axes (variables) is usually much lower than the number of axes (variables) in the original dataset, but it’s still explaining most of the variance present in the data.

Another interesting insight

Another interesting insight on PCA is provided by considering its relationship to Ridge Regression (L2 penalty). The result given by Ridge Regression can be written like this:

Y ^ = X β ^ r = j = 1 p u j d j 2 d j 2 + λ u j T y

The term in the middle here, d j 2 d j 2 + λ , shrinks the singular values. For those major patterns with large singular values, lambda has little effect for shrinking; but for those with small singular values, lambda has huge effect to shrink them towards zero (not exactly zero, unlike lasso - L1 penalty, which does feature selection). This non-uniform shrinkage thus has a grouping effect. This is why Ridge Regression is often used when features are strongly correlated (it only captures orthogonal major pattern). PCA is really easy to implement - feed the data matrix(n*p) to the SVD command in Matlab, extract the PC loading(V) and PC score(U) vector and we will get the major pattern we want.

Questions & Answers

show that the set of all natural number form semi group under the composition of addition
Nikhil Reply
what is the meaning
explain and give four Example hyperbolic function
Lukman Reply
⅗ ⅔½
The denominator of a certain fraction is 9 more than the numerator. If 6 is added to both terms of the fraction, the value of the fraction becomes 2/3. Find the original fraction. 2. The sum of the least and greatest of 3 consecutive integers is 60. What are the valu
1. x + 6 2 -------------- = _ x + 9 + 6 3 x + 6 3 ----------- x -- (cross multiply) x + 15 2 3(x + 6) = 2(x + 15) 3x + 18 = 2x + 30 (-2x from both) x + 18 = 30 (-18 from both) x = 12 Test: 12 + 6 18 2 -------------- = --- = --- 12 + 9 + 6 27 3
2. (x) + (x + 2) = 60 2x + 2 = 60 2x = 58 x = 29 29, 30, & 31
on number 2 question How did you got 2x +2
combine like terms. x + x + 2 is same as 2x + 2
Q2 x+(x+2)+(x+4)=60 3x+6=60 3x+6-6=60-6 3x=54 3x/3=54/3 x=18 :. The numbers are 18,20 and 22
Mark and Don are planning to sell each of their marble collections at a garage sale. If Don has 1 more than 3 times the number of marbles Mark has, how many does each boy have to sell if the total number of marbles is 113?
mariel Reply
Mark = x,. Don = 3x + 1 x + 3x + 1 = 113 4x = 112, x = 28 Mark = 28, Don = 85, 28 + 85 = 113
how do I set up the problem?
Harshika Reply
what is a solution set?
find the subring of gaussian integers?
hello, I am happy to help!
Shirley Reply
please can go further on polynomials quadratic
hi mam
I need quadratic equation link to Alpa Beta
Abdullahi Reply
find the value of 2x=32
Felix Reply
divide by 2 on each side of the equal sign to solve for x
Want to review on complex number 1.What are complex number 2.How to solve complex number problems.
yes i wantt to review
use the y -intercept and slope to sketch the graph of the equation y=6x
Only Reply
how do we prove the quadratic formular
Seidu Reply
please help me prove quadratic formula
hello, if you have a question about Algebra 2. I may be able to help. I am an Algebra 2 Teacher
Shirley Reply
thank you help me with how to prove the quadratic equation
may God blessed u for that. Please I want u to help me in sets.
what is math number
Tric Reply
x-2y+3z=-3 2x-y+z=7 -x+3y-z=6
Sidiki Reply
can you teacch how to solve that🙏
Solve for the first variable in one of the equations, then substitute the result into the other equation. Point For: (6111,4111,−411)(6111,4111,-411) Equation Form: x=6111,y=4111,z=−411x=6111,y=4111,z=-411
x=61/11 y=41/11 z=−4/11 x=61/11 y=41/11 z=-4/11
Need help solving this problem (2/7)^-2
Simone Reply
what is the coefficient of -4×
Mehri Reply
A soccer field is a rectangle 130 meters wide and 110 meters long. The coach asks players to run from one corner to the other corner diagonally across. What is that distance, to the nearest tenths place.
Kimberly Reply
Jeannette has $5 and $10 bills in her wallet. The number of fives is three more than six times the number of tens. Let t represent the number of tens. Write an expression for the number of fives.
August Reply
What is the expressiin for seven less than four times the number of nickels
Leonardo Reply
How do i figure this problem out.
how do you translate this in Algebraic Expressions
linda Reply
why surface tension is zero at critical temperature
I think if critical temperature denote high temperature then a liquid stats boils that time the water stats to evaporate so some moles of h2o to up and due to high temp the bonding break they have low density so it can be a reason
Need to simplify the expresin. 3/7 (x+y)-1/7 (x-1)=
Crystal Reply
. After 3 months on a diet, Lisa had lost 12% of her original weight. She lost 21 pounds. What was Lisa's original weight?
Chris Reply
how did you get the value of 2000N.What calculations are needed to arrive at it
Smarajit Reply
Privacy Information Security Software Version 1.1a
Got questions? Join the online conversation and get instant answers!
Jobilize.com Reply

Get Jobilize Job Search Mobile App in your pocket Now!

Get it on Google Play Download on the App Store Now

Source:  OpenStax, Elec 301 projects fall 2013. OpenStax CNX. Sep 14, 2014 Download for free at http://legacy.cnx.org/content/col11709/1.1
Google Play and the Google Play logo are trademarks of Google Inc.

Notification Switch

Would you like to follow the 'Elec 301 projects fall 2013' conversation and receive update notifications?