Principal component analysis (PCA)

Principal component analysis (PCA) reduces the dimensionality of a dataset with a large number of interrelated variables while retaining as much of the variation in the dataset as possible.

PCA is a mathematical technique that reduces dimensionality by creating a new set of variables called principal components. The first principal component is a linear combination of the original variables and explains as much variation as possible in the original data. Each subsequent component explains as much of the remaining variation as possible under the condition that it is uncorrelated with the previous components.
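
As an illustration of the construction described above, the following is a minimal sketch in Python with NumPy. It assumes the components are obtained from the singular value decomposition of the mean-centered data matrix, which is one common way to compute them. The function name `pca` and the synthetic data are assumptions made for this example and do not come from any particular library.

```python
import numpy as np


def pca(X, n_components):
    """Illustrative helper (not a library API).

    X has shape (n_samples, n_features). Returns the component scores,
    the component directions, and the share of variance each one explains.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                       # center each variable
    # Rows of Vt are orthonormal directions, ordered by decreasing variance.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    variances = s**2 / (X.shape[0] - 1)           # variance along each direction
    ratio = variances / variances.sum()           # share of total variance
    components = Vt[:n_components]
    scores = Xc @ components.T                    # data expressed in the new variables
    return scores, components, ratio[:n_components]


# Synthetic data: three interrelated variables driven by one hidden signal.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
X = np.hstack([latent, 2 * latent, -latent]) + 0.1 * rng.normal(size=(200, 3))

scores, components, ratio = pca(X, n_components=2)
print("share of variance explained:", ratio)
print("correlation between the two components:", np.corrcoef(scores.T)[0, 1])
```

Because the three synthetic variables are built from a single shared signal, the first component should account for most of the variance, and the scores on the two components should be uncorrelated up to rounding error.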

The first few principal components often provide a simpler picture of the data than the full set of original variables. It is sometimes desirable to name and interpret the principal components, a process called reification, although this should not be confused with the purpose of factor analysis, which instead models the observed variables in terms of underlying latent factors.
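
One way to approach interpretation is to inspect the weights that a component places on each original variable. The sketch below assumes the components are taken as eigenvectors of the sample covariance matrix. The variable names (`height`, `weight`, `age`) and the data are hypothetical and generated only for this example.

```python
import numpy as np

# Hypothetical variable names and synthetic data, for illustration only.
names = ["height", "weight", "age"]
rng = np.random.default_rng(1)
size = rng.normal(size=(300, 1))                  # shared signal behind height and weight
height = size + 0.2 * rng.normal(size=(300, 1))
weight = size + 0.2 * rng.normal(size=(300, 1))
age = rng.normal(size=(300, 1))                   # generated independently of size
X = np.hstack([height, weight, age])

# Eigen-decomposition of the covariance matrix. eigh returns eigenvalues in
# ascending order, so reorder to put the largest-variance component first.
cov = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Weights of the first component on each original variable.
first = eigvecs[:, 0]
for name, w in zip(names, first):
    print(f"{name:>6}: {w:+.3f}")
```

In this synthetic setting the first component should weight the height and weight variables heavily, with the same sign, and give the age variable a weight near zero, so it might reasonably be named something like "body size". The overall sign of a component is arbitrary, and any such name is a judgment made by the analyst rather than a result of the method.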