
Principal Component Analysis (PCA)

Lin Yu

2023-03-20

1 / 20

Background

## [1] "correlation between X1 and X2: 0.9"

2 / 20

Intuition

Fitting a p-dimensional ellipsoid to the data by stretching and rotating the coordinate system.

3 / 20

Background

## [1] "correlation between X1 and X2: 0.9"

4 / 20

Which is better?

5 / 20

Goal

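Maximize the total projection length $|d_1| + |d_2| + |d_3| + \dots + |d_n|$; in practice PCA maximizes the sum of squares $d_1^2 + d_2^2 + d_3^2 + \dots + d_n^2$, which is easier to work with.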

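Question: how do we represent d?

$$d_1 = \begin{pmatrix} a_1 & b_1 \end{pmatrix} \cdot \begin{pmatrix} e_{11} \\ e_{12} \end{pmatrix} = a_1 e_{11} + b_1 e_{12}$$

where $a_1$ and $b_1$ are the coordinates of the measured data point, and
$e_{11}$ and $e_{12}$ are the elements of the unit basis vector $e_1$.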
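A minimal sketch of this dot product in Python, using only the standard library. The point (3, 4) and the direction (1, 1) are made-up values for illustration, and `projection_length` is a hypothetical helper name, not something from the slides.

```python
import math

def projection_length(point, direction):
    """Signed length d of `point` projected onto `direction`.

    `direction` is normalized first, so it does not need to be a unit vector.
    """
    norm = math.hypot(*direction)
    unit = [c / norm for c in direction]
    return sum(p * u for p, u in zip(point, unit))

# Hypothetical data point (a1, b1) and candidate basis direction e1.
a1, b1 = 3.0, 4.0
e1 = (1.0, 1.0)

d1 = projection_length((a1, b1), e1)
print(d1)  # a1*e11 + b1*e12 with e1 scaled to unit length
```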

6 / 20

working example

7 / 20

(cont'd)

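Suppose we have two data points. The goal can be written as:

$$d_1^2 + d_2^2 = (a_1 e_{11} + b_1 e_{12})^2 + (a_2 e_{11} + b_2 e_{12})^2$$

$$= \begin{pmatrix} e_{11} & e_{12} \end{pmatrix} \begin{pmatrix} a_1^2 + a_2^2 & a_1 b_1 + a_2 b_2 \\ a_1 b_1 + a_2 b_2 & b_1^2 + b_2^2 \end{pmatrix} \begin{pmatrix} e_{11} \\ e_{12} \end{pmatrix} = e^T U \Sigma U^T e = (e^T U) \Sigma (U^T e) = n^T \Sigma n$$

(the factorization $U \Sigma U^T$ is known as the SVD)

$$= n^T \begin{pmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{pmatrix} n = \begin{pmatrix} n_1 & n_2 \end{pmatrix} \begin{pmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{pmatrix} \begin{pmatrix} n_1 \\ n_2 \end{pmatrix}$$

$$= n_1^2 \sigma_1 + n_2^2 \sigma_2$$

where $n_1^2 + n_2^2 = 1$ (because $e$ is a unit vector and $U$ is orthogonal) and $\sigma_1 > \sigma_2$. The sum $d_1^2 + d_2^2$ is therefore maximal when $n_1 = 1$ and $n_2 = 0$.

Since $U^T e = n$, we have $e = U n$ with $n = (1, 0)^T$, so $e$ is the first column of $U$.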
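The result can be checked numerically. The sketch below uses two made-up data points, builds the 2x2 matrix from the derivation, computes its leading eigenvalue and eigenvector with the closed form for symmetric 2x2 matrices, and compares against a brute-force scan over unit directions. All names and values are illustrative assumptions.

```python
import math

# Hypothetical data points (a_i, b_i).
points = [(2.0, 1.0), (1.0, 2.0)]

def sum_sq_proj(e, pts):
    """d_1^2 + d_2^2 + ... for a unit direction e."""
    return sum((a * e[0] + b * e[1]) ** 2 for a, b in pts)

# Entries of the symmetric matrix in the quadratic form.
m11 = sum(a * a for a, _ in points)
m12 = sum(a * b for a, b in points)
m22 = sum(b * b for _, b in points)

# Closed form for a symmetric 2x2 matrix: largest eigenvalue and its direction.
sigma1 = (m11 + m22) / 2 + math.hypot((m11 - m22) / 2, m12)
theta = 0.5 * math.atan2(2.0 * m12, m11 - m22)
e_best = (math.cos(theta), math.sin(theta))

# Brute force: try unit vectors at 0.1 degree steps over half a turn.
brute = max(
    sum_sq_proj((math.cos(t), math.sin(t)), points)
    for t in (k * math.pi / 1800 for k in range(1800))
)

print(sigma1, sum_sq_proj(e_best, points), brute)
```

The three printed numbers should agree up to rounding: the best achievable sum of squared projections is the largest eigenvalue, reached along its eigenvector.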

8 / 20

$$d_1^2 + d_2^2 = e^T U \Sigma U^T e = (e^T U) \Sigma (U^T e) = n^T \Sigma n$$

9 / 20

Steps:

  1. Center the values of each variable in the dataset on 0 (by subtracting the mean of the variable's observed values from each of those values)

  2. Compute the covariance matrix of the data and calculate the eigenvalues and corresponding eigenvectors of this covariance matrix

  3. Normalize each of the orthogonal eigenvectors to turn them into unit vectors
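The steps above can be sketched for two variables in plain Python. This is an illustrative sketch, not the code behind the slides: the data are made up, `pca_2d` is a hypothetical name, and the eigen-decomposition uses the closed form that only applies to symmetric 2x2 matrices. For more variables, a linear algebra library would be the usual choice.

```python
import math

def pca_2d(xs, ys):
    """PCA for two variables; returns (eigenvalues, axes, scores)."""
    n = len(xs)

    # Step 1: center each variable on 0.
    mx, my = sum(xs) / n, sum(ys) / n
    cx = [x - mx for x in xs]
    cy = [y - my for y in ys]

    # Step 2: sample covariance matrix [[sxx, sxy], [sxy, syy]] ...
    sxx = sum(x * x for x in cx) / (n - 1)
    syy = sum(y * y for y in cy) / (n - 1)
    sxy = sum(x * y for x, y in zip(cx, cy)) / (n - 1)

    # ... and its eigenvalues (closed form for a symmetric 2x2 matrix).
    mean = (sxx + syy) / 2
    radius = math.hypot((sxx - syy) / 2, sxy)
    eigvals = (mean + radius, mean - radius)

    # Step 3: unit eigenvectors. (cos, sin) pairs already have length 1.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    pc1 = (math.cos(theta), math.sin(theta))
    pc2 = (-math.sin(theta), math.cos(theta))

    # Project the centered data onto the principal axes.
    scores = [(x * pc1[0] + y * pc1[1], x * pc2[0] + y * pc2[1])
              for x, y in zip(cx, cy)]
    return eigvals, (pc1, pc2), scores

# Hypothetical, strongly correlated data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.0]

eigvals, axes, scores = pca_2d(xs, ys)
print(eigvals)
print(axes)
```

The variance of the scores along the first axis equals the first eigenvalue, which is why the eigenvalues are read as "variance explained" by each PC.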

10 / 20

Visualization

11 / 20

The scree plot

(helps choose how many PCs to retain)
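A scree plot displays the eigenvalues (or the share of variance each PC explains) in decreasing order. As a rough sketch of the numbers behind such a plot, the snippet below turns a made-up list of eigenvalues into explained and cumulative proportions. The eigenvalues, the 90% threshold, and the helper `components_for` are assumptions for illustration; the threshold rule is one common heuristic, not the only way to read a scree plot.

```python
from itertools import accumulate

# Hypothetical eigenvalues of a covariance matrix, sorted in decreasing order.
eigenvalues = [2.5, 1.0, 0.3, 0.2]

total = sum(eigenvalues)
explained = [ev / total for ev in eigenvalues]   # heights in the scree plot
cumulative = list(accumulate(explained))         # running share of variance

def components_for(threshold):
    """Smallest number of PCs whose cumulative share reaches `threshold`."""
    for k, share in enumerate(cumulative, start=1):
        if share >= threshold:
            return k
    return len(cumulative)

k = components_for(0.90)
print(explained)
print(cumulative)
print(k)
```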
12 / 20

The profile plot

  • shows the correlations between each PC and the original variables
  • becomes very crowded and hard to read when there are many variables
13 / 20

The score plot

projection of data onto the span of PC1 and PC2

14 / 20

The loadings plot

shows the relationship between the PCs and the original variables

15 / 20

The biplot

overlays a score plot and a loadings plot in a single graph

  • The cosine of the angle between a vector and an axis indicates the importance of the contribution of the corresponding variable to the principal component.

  • The cosine of the angle between pairs of vectors indicates the correlation between the corresponding variables. Highly correlated variables point in similar directions.

  • Points that are close to each other in the biplot represent observations with similar values.

16 / 20

Application

17 / 20

1. Image compression

## (156, 194, 3)

## (156, 194, 3)

18 / 20

2. Face recognition

Download pcaface.pdf

19 / 20
