Principal component analysis is a quantitatively rigorous method for achieving this simplification. The method generates a new set of variables, called *principal components*. Each principal component is a linear combination of the original variables. All the principal components are orthogonal to each other, so there is no redundant information. The principal components as a whole form an orthogonal basis for the space of the data.

# Principal Component Analysis: Introduction

PCA is an orthogonal linear transformation that maps the data to a new coordinate system such that the greatest variance of any projection of the data comes to lie on the first coordinate, the second greatest variance on the second coordinate, and so on. The Eigenfaces method, an application of PCA to face images, finds the minimum-mean-squared-error linear subspace that maps the original N-dimensional data space into an M-dimensional feature space. With M << N, Eigenfaces achieves dimensionality reduction by keeping the M eigenvectors of the covariance matrix corresponding to the largest eigenvalues. The resulting basis vectors are the ones that maximize the total variance of the projected data (i.e., the set of basis vectors that best describe the data). Usually the mean x̄ is subtracted from the data first, so that PCA is equivalent to the Karhunen-Loève Transform (KLT).

The **first principal** component is a single axis in space. When you project each observation on that axis, the resulting values form a new variable. And the variance of this variable is the maximum among all possible choices of the first axis.

The **second principal** component is another axis in space, perpendicular to the first. Projecting the observations on this axis generates another new variable. The variance of this variable is the maximum among all possible choices of this second axis.

The full set of principal components is as large as the original set of variables. But it is commonplace for the sum of the variances of the first few principal components to exceed 80% of the total variance of the original data. By examining plots of these few new variables, researchers often develop a deeper understanding of the driving forces that generated the original data.
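As a concrete sketch of these ideas, the snippet below (Python with NumPy; the two-variable dataset is invented purely for illustration) finds the principal components as eigenvectors of the covariance matrix and reports the fraction of total variance each one captures:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
# Two correlated variables (made-up data, for illustration only)
data = np.column_stack([x, 0.8 * x + 0.3 * rng.normal(size=200)])

centered = data - data.mean(axis=0)       # remove the mean
cov = np.cov(centered, rowvar=False)      # covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)    # eigendecomposition (symmetric matrix)
order = np.argsort(eigvals)[::-1]         # largest variance first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()       # fraction of variance per component
scores = centered @ eigvecs               # the new, uncorrelated variables
```

Because the eigenvectors of a symmetric matrix are orthogonal, the projected variables in `scores` are uncorrelated, matching the "no redundant information" property above.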

### Principal Component Analysis: Numerical Example

A numerical example may clarify the mechanics of principal component analysis.

**Sample data set:** Let us analyze the following 3-variate dataset with 10 observations. Each observation consists of 3 measurements on a wafer: thickness, horizontal displacement, and vertical displacement.
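The measurements themselves are not reproduced above; for concreteness, the array below is the 10-observation dataset from the NIST/SEMATECH handbook example cited in [3] (columns: thickness, horizontal displacement, vertical displacement):

```python
import numpy as np

# 10 wafers x 3 measurements, from the NIST handbook example [3]
X = np.array([[7, 4, 3],
              [4, 1, 8],
              [6, 3, 5],
              [8, 6, 1],
              [8, 5, 7],
              [7, 2, 9],
              [5, 3, 3],
              [9, 5, 8],
              [7, 4, 5],
              [8, 2, 2]], dtype=float)
```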

### Compute the correlation matrix:
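A sketch of this step in Python with NumPy, assuming the 10×3 wafer dataset from the NIST example cited in [3]:

```python
import numpy as np

X = np.array([[7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
              [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2]], float)

R = np.corrcoef(X, rowvar=False)   # 3x3 correlation matrix
# Rounded to two decimals: r12 ≈ 0.67, r13 ≈ -0.10, r23 ≈ -0.29
```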

### Solve for the roots of R:

**Notice that:**

- Each eigenvalue satisfies \( |R-λI|= 0 \)
- The sum of the eigenvalues \( = 3 = p \), which is equal to the trace of \( R \) (i.e., the sum of the main diagonal elements)
- The determinant of \( R \) is the product of the eigenvalues
- The product is \( λ_1 × λ_2 × λ_3 = 0.499 \)
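A sketch of this step, again assuming the NIST dataset [3]; the eigenvalues are the roots of the characteristic equation \( |R-λI|=0 \):

```python
import numpy as np

X = np.array([[7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
              [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2]], float)
R = np.corrcoef(X, rowvar=False)

eigvals = np.linalg.eigvalsh(R)[::-1]   # roots of |R - lambda*I| = 0, descending
# eigvals ≈ [1.769, 0.927, 0.304]

assert np.isclose(eigvals.sum(), 3.0)                # sum = trace(R) = p = 3
assert np.isclose(eigvals.prod(), np.linalg.det(R))  # product = det(R) ≈ 0.499
```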

**Compute the first column of the V matrix:** Substituting the first eigenvalue of 1.769 and \( R \) into the eigenvector equation \( (R - λ_1 I)v_1 = 0 \), we obtain

This is the matrix expression for three homogeneous equations with three unknowns, and it yields the first column of \( V \): (0.64, 0.69, −0.34) (again, a computerized solution is indispensable).

**Compute the remaining columns of the V matrix:**

Notice that if you multiply \( V \) by its transpose, the result is the identity matrix: \( V'V = I \).
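Both steps can be sketched at once, assuming the NIST dataset [3]: the columns of \( V \) are the eigenvectors of \( R \) (the sign of each eigenvector is arbitrary, so it is fixed here so that the first entry of each column is positive), and orthonormality gives \( V'V = I \):

```python
import numpy as np

X = np.array([[7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
              [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2]], float)
R = np.corrcoef(X, rowvar=False)

eigvals, V = np.linalg.eigh(R)              # eigenvectors as columns of V
order = np.argsort(eigvals)[::-1]           # largest eigenvalue first
eigvals, V = eigvals[order], V[:, order]
V *= np.sign(V[0])                          # fix the arbitrary column signs

# First column ≈ (0.64, 0.69, -0.34)
assert np.allclose(V.T @ V, np.eye(3))      # V'V = I
```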

**Compute the \( L^{1/2} \) matrix:**

Now form the matrix \( L^{1/2} \), which is a diagonal matrix whose elements are the square roots of the eigenvalues of \( R \). Then obtain \( S \), the factor structure, using \( S = VL^{1/2} \).

So, for example, 0.91 is the correlation between the second variable and the first principal component.
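A sketch of this step, recomputing \( V \) from the NIST dataset [3]:

```python
import numpy as np

X = np.array([[7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
              [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2]], float)
R = np.corrcoef(X, rowvar=False)
eigvals, V = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]
V *= np.sign(V[0])                   # fix the arbitrary column signs

L_half = np.diag(np.sqrt(eigvals))   # L^{1/2}: square roots of the eigenvalues
S = V @ L_half                       # factor structure S = V L^{1/2}
# S[1, 0] ≈ 0.91: correlation between variable 2 and principal component 1
```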

**Compute the communality:** Next, compute the communality using the first two eigenvalues only.

The communality consists of the diagonal elements of \( SS' \), computed with only the first two columns of \( S \); each diagonal element reports how much of that variable's variability the first two principal components explain.

This means that the first two principal components "explain" 86.62 % of the variance of the first variable, 84.20 % of the second variable, and 98.76 % of the third.
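This step can be sketched as follows, assuming the NIST dataset [3]; the communality of each variable is the sum of its squared loadings on the first two components, i.e. the diagonal of \( SS' \) with only the first two columns of \( S \) retained:

```python
import numpy as np

X = np.array([[7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
              [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2]], float)
R = np.corrcoef(X, rowvar=False)
eigvals, V = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]
S = V @ np.diag(np.sqrt(eigvals))    # factor structure

S2 = S[:, :2]                        # keep the first two components only
communality = np.diag(S2 @ S2.T)     # ≈ [0.8662, 0.8420, 0.9876]
```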

**Compute the coefficient matrix:** The coefficient matrix, \( B \), is formed using the reciprocals of the diagonals of \( L^{1/2} \).

**Compute the principal factors:** Finally, we can compute the factor scores from \( ZB \), where \( Z \) is \( X \) converted to standard-score form. The columns of \( ZB \) are the principal factors.
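These two steps can be sketched together, assuming the NIST dataset [3] and taking \( B = V L^{-1/2} \) (i.e., \( V \) scaled by the reciprocals of the diagonals of \( L^{1/2} \)):

```python
import numpy as np

X = np.array([[7, 4, 3], [4, 1, 8], [6, 3, 5], [8, 6, 1], [8, 5, 7],
              [7, 2, 9], [5, 3, 3], [9, 5, 8], [7, 4, 5], [8, 2, 2]], float)
R = np.corrcoef(X, rowvar=False)
eigvals, V = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]

B = V @ np.diag(1.0 / np.sqrt(eigvals))            # coefficient matrix
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standard scores
F = Z @ B                                          # principal factors (10 x 3)
# Each principal factor has unit variance and the factors are uncorrelated.
```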

**Principal factors control chart:** These factors can be plotted against the observation indices, which could be times. If time is used, the resulting plot is an example of a principal factors control chart.

**References**

[1] Matthew Turk and Alex Pentland, “Eigenfaces for Recognition”, Journal of Cognitive Neuroscience, Vol. 3, No. 1, pp. 71–86, January 1991.

[2] “Principal Component Analysis (PCA)”, available online at: https://in.mathworks.com/help/stats/principal-component-analysis-pca.html?requestedDomain=www.mathworks.com

[3] “Numerical Example”, available online at: http://www.itl.nist.gov/div898/handbook/pmc/section5/pmc552.htm

[4] Lindsay I. Smith, “A Tutorial on Principal Components Analysis”, 2002.
