Correlation / PCA tutorial
Learn how to visualize the relationships between variables and the similarities between observations.
To illustrate the concepts, we will use data from a New York magazine article that examines the most livable neighborhoods in New York. In the original article, written by Nate Silver, neighborhoods were scored on 12 factors. The scores for each factor were then combined into an overall score and ranking for each neighborhood. For more information, see the New York magazine story “The Most Livable Neighborhoods in New York” and the Junk Charts story “The scatter plot matrix: a great tool”.
If you prefer, you can watch a video of this tutorial.
In this tutorial you will perform the following tasks:
- Understanding the relationship between variables
- Reducing the dimensionality of the data
- Understanding the relationship between variables (revisited)
- Understanding the similarities between observations
- Grouping the observations
- Adding additional variables
- Adding additional observations
- Publishing the plot
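For readers who want to see the underlying computations outside the tutorial's own software, the first two tasks can be sketched in Python with NumPy. This is only a minimal illustration under stated assumptions: the data are synthetic stand-ins (not the neighborhood scores from the article), and the helper names `standardize` and `pca` are hypothetical, not part of any tool used in this tutorial.

```python
import numpy as np

def standardize(X):
    """Center each column and scale it to unit sample variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def pca(X, n_components=2):
    """PCA on standardized data via singular value decomposition.

    Returns the observation scores, the variable loadings, and the
    share of total variance explained by each retained component.
    """
    Z = standardize(X)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    scores = U[:, :n_components] * s[:n_components]  # observation coordinates
    loadings = Vt[:n_components].T                   # variable directions
    return scores, loadings, explained[:n_components]

# Synthetic data (an assumption for illustration): 50 observations,
# 4 variables, built so that variables 0/1 are positively correlated
# and variables 2/3 are negatively correlated.
rng = np.random.default_rng(0)
base = rng.normal(size=(50, 2))
X = np.column_stack([
    base[:, 0],
    base[:, 0] + 0.3 * rng.normal(size=50),
    base[:, 1],
    -base[:, 1] + 0.3 * rng.normal(size=50),
])

# Relationship between variables: pairwise correlation matrix.
corr = np.corrcoef(X, rowvar=False)

# Dimensionality reduction: keep the first two principal components.
scores, loadings, explained = pca(X, n_components=2)

print(np.round(corr, 2))
print(np.round(explained, 3))
```

Plotting the two columns of `scores` against each other gives a view of the similarities between observations, while the rows of `loadings` show how each variable contributes to the two components.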