Principal components

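Principal components are linear combinations of the original variables, constructed to be uncorrelated with one another and ordered so that the first component has the largest variance.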


The variance of each principal component shows how much of the original variation in the dataset that component explains.

When the data is standardized, a principal component with a variance of 1 accounts for as much variation as one of the original variables. In that case, the sum of the variances of all components equals the number of original variables.
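
The relationship between component variances and the number of variables can be checked numerically. The following is a minimal sketch in Python with NumPy, assuming the component variances are taken as the eigenvalues of the correlation matrix (which corresponds to standardized data). The data and the names X, R, variances, and proportion are made up for illustration.

```python
import numpy as np

# Hypothetical data: 200 observations of 4 variables (illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[:, 1] += 0.8 * X[:, 0]          # make two of the variables correlated

# Correlation matrix of the original variables; using it is equivalent
# to running the analysis on standardized data
R = np.corrcoef(X, rowvar=False)

# The eigenvalues of R are the variances of the principal components
variances = np.linalg.eigvalsh(R)[::-1]   # reorder from largest to smallest

# Share of the total variation explained by each component
proportion = variances / variances.sum()
total_variance = variances.sum()          # equals the number of variables

print(variances)
print(proportion)
print(total_variance)
```

Because the diagonal of a correlation matrix is all ones, the total always equals the number of variables, here 4, regardless of the data.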


Coefficients are the weights applied to the original variables in the linear combination that defines a principal component. The coefficients for each principal component can sometimes reveal the structure of the data. Absolute values near zero indicate that a variable contributes little to the component, whereas larger absolute values indicate variables that contribute more.

Often, when the data is centered and standardized, the coefficients are normalized so that the sum of the squares of the coefficients of a component is equal to the variance of that component. Under this normalization, each coefficient can be interpreted as the correlation between the original variable and the principal component, and the coefficients are often called loadings (a term borrowed from factor analysis).
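
The loading normalization described above can be illustrated with a short sketch in Python with NumPy. It assumes that unit-length coefficient vectors are the eigenvectors of the correlation matrix and that loadings are obtained by scaling each eigenvector by the square root of its eigenvalue. The data and all variable names are hypothetical.

```python
import numpy as np

# Hypothetical data: 300 observations of 3 variables (illustration only)
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))
X[:, 2] += X[:, 0] - 0.5 * X[:, 1]

# Center and standardize each variable
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(Z, rowvar=False)

# Eigen-decomposition, reordered so the largest variance comes first
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Loadings: each unit-length coefficient vector (column) scaled by the
# square root of the variance of its component
loadings = eigvecs * np.sqrt(eigvals)

# Sum of squared loadings per component equals the component variance
sum_sq = (loadings ** 2).sum(axis=0)

# Correlation between each standardized variable and each component
scores = Z @ eigvecs
corr = np.array([[np.corrcoef(Z[:, i], scores[:, j])[0, 1]
                  for j in range(3)] for i in range(3)])

print(np.allclose(sum_sq, eigvals))   # sums of squares match the variances
print(np.allclose(corr, loadings))    # loadings match the correlations
```

The sign of each eigenvector is arbitrary, so the loadings of a component may all be flipped in sign between software packages without changing the interpretation.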


Scores are new variables whose values are obtained by evaluating the linear combination of the original variables for each observation. The scores are normalized so that their sum of squares equals the variance of the principal component.
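
As a rough illustration, scores can be computed by multiplying the standardized data by the coefficient vectors. The sketch below, in Python with NumPy, assumes unit-length coefficients taken from the eigenvectors of the correlation matrix. Under that assumption, the sample variance of each score column (its sum of squares divided by n - 1) equals the variance of the component. Software that scales scores differently will differ from this sketch by a constant factor per component. The data and names are hypothetical.

```python
import numpy as np

# Hypothetical data: 150 observations of 3 variables (illustration only)
rng = np.random.default_rng(2)
X = rng.normal(size=(150, 3))
X[:, 1] += 0.6 * X[:, 0]

# Center and standardize, then decompose the correlation matrix
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Scores: one row per observation, one column per principal component
scores = Z @ eigvecs

# Sample variance of each score column and the full score covariance
score_var = scores.var(axis=0, ddof=1)
score_cov = np.cov(scores, rowvar=False)

print(np.allclose(score_var, eigvals))            # variances match
print(np.allclose(score_cov, np.diag(eigvals)))   # components uncorrelated
```

The diagonal covariance matrix confirms that the scores of different components are uncorrelated, which is why they can be analyzed or plotted as separate variables.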