Analyse-it

Understanding the similarities between observations

Until now, we have been interested in understanding the relationships between the variables, but often the interest is in the similarity between neighborhoods or groups of neighborhoods. Whilst it is possible to label and color the points on the scatter plots by neighborhood, the plots are hard to interpret when each neighborhood appears on 60 or more of them. It is easier to first reduce the dimensionality of the data using principal components, and then use a biplot, which plots information on the observations and the variables simultaneously.

The classical biplot, popularized by Gabriel, represents the variables as vectors and the observations as points. A more recent innovation, developed by Gower & Hand, represents the variables as calibrated axes, so that the observations, plotted as points, can be projected onto the axes to read off approximate values. The monograph Understanding Biplots by Gower, Gardner-Lubbe, and le Roux is an excellent book for learning more about biplots.

  1. Open the file tutorials\New York Neighborhoods.xlsx.
  2. Click a cell in the dataset.
  3. On the Analyse-it ribbon tab, in the Statistical Analyses group, click Multivariate, click Biplot, and then click PCA Biplot.
    The analysis task pane opens.
  4. In the Y variables list box, select Affordability, Transit, Shopping & Services, Crime, Food, Schools, Diversity, Creative, Housing Quality, Green Space, Wellness, Nightlife.
  5. Select Label points.
  6. Click Calculate.
    The results are calculated and the analysis report opens.

The biplot shows the two-dimensional approximation to the original multidimensional space. It represents 70% of the original variation in the data. Each point on the biplot represents a neighborhood and each axis represents a variable.

PCA biplot
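
The percentage of variation a two-dimensional biplot represents can be computed from the singular values of the centered data. The following is a minimal Python sketch of that calculation, not Analyse-it's own implementation. It assumes the variables are only centered (whether Analyse-it also standardizes them is not stated here), the data are synthetic stand-ins rather than the tutorial dataset, and the function name `explained_by_first_two` is hypothetical.

```python
import numpy as np

def explained_by_first_two(X):
    """Fraction of total variance captured by the first two principal components."""
    Xc = X - X.mean(axis=0)                    # center each variable (assumption: no scaling)
    s = np.linalg.svd(Xc, compute_uv=False)    # singular values of the centered data
    var = s ** 2                               # proportional to each component's variance
    return var[:2].sum() / var.sum()

# Synthetic stand-in data: 50 observations on 12 variables,
# driven by 2 latent factors plus noise (not the tutorial dataset).
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 2))
loadings = rng.normal(size=(2, 12))
X = latent @ loadings + 0.5 * rng.normal(size=(50, 12))

ratio = explained_by_first_two(X)
print(f"First two components represent {ratio:.0%} of the variation")
```

The closer this fraction is to 100%, the more faithfully the two-dimensional plot reflects the original multidimensional data.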

The distance between points represents their similarity: points close to each other are neighborhoods with similar profiles, and points far apart are neighborhoods with dissimilar profiles.
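
To illustrate why distances on the plot reflect similarity, the sketch below compares pairwise distances in the two-dimensional principal component space with distances in the full variable space. It is an illustration under stated assumptions: the data and the labels `N0` to `N19` are synthetic, the variables are centered but not scaled, and the helper names `pca_scores` and `pairwise_distances` are hypothetical.

```python
import numpy as np

def pca_scores(X, k=2):
    """Coordinates of each observation on the first k principal components."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def pairwise_distances(A):
    """Euclidean distance between every pair of rows of A."""
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Synthetic scores for 20 hypothetical neighborhoods on 12 variables.
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 12)) * 10 + 60
names = [f"N{i}" for i in range(20)]

d_plot = pairwise_distances(pca_scores(X))   # distances as seen on the biplot
d_full = pairwise_distances(X)               # distances in the original space

# Find the two observations that sit closest together on the plot.
masked = d_plot + np.diag(np.full(len(X), np.inf))   # ignore self-distances
i, j = np.unravel_index(np.argmin(masked), masked.shape)
print(names[i], names[j], round(d_plot[i, j], 2), round(d_full[i, j], 2))
```

Because the plot is an orthogonal projection, a distance on the plot never exceeds the corresponding distance in the full space. Points that are far apart on the plot are therefore genuinely dissimilar, while points that appear close may still differ in directions the plot does not show.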

Any point on the plot can be projected orthogonally onto the axes to show the approximate value of that variable. For example, Bedford Park (center right of the plot) scores around 90 on affordability, 65 on housing quality, and 70 on food. The true values were 89, 60, and 62 respectively, so the approximation is fairly accurate for these variables and this neighborhood.
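
Reading a value off a calibrated axis amounts to an orthogonal projection followed by inverting the axis calibration. The sketch below shows that arithmetic on synthetic data. It assumes centered, unscaled variables and the usual placement of the marker for a value mu at (mu - mean) * v / (v · v) along the axis direction v. The names `fit_biplot` and `read_off` are hypothetical, and the numbers it prints are unrelated to the Bedford Park values.

```python
import numpy as np

def fit_biplot(X, k=2):
    """Return variable means, axis directions V (p x k) and point coordinates Z (n x k)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T          # row j is the direction of the axis for variable j
    Z = Xc @ V            # row i is the plotted position of observation i
    return mean, V, Z

def read_off(z, v, mean_j):
    """Project point z orthogonally onto the axis with direction v and read its value."""
    foot = (z @ v) / (v @ v) * v    # foot of the perpendicular from z to the axis
    # The marker for value mu sits at (mu - mean_j) * v / (v @ v); invert that mapping.
    return mean_j + foot @ v

# Synthetic data: 30 observations on 5 variables with a mostly 2-D structure.
rng = np.random.default_rng(2)
latent = rng.normal(size=(30, 2))
X = 60 + latent @ rng.normal(size=(2, 5)) * 10 + rng.normal(size=(30, 5)) * 3

mean, V, Z = fit_biplot(X)
i, j = 0, 3                                  # observation 0, variable 3
approx = read_off(Z[i], V[j], mean[j])
print("read from axis:", round(approx, 1), "true value:", round(X[i, j], 1))
```

The value read from the axis equals the corresponding entry of the rank-two reconstruction of the data, which is why it is close to, but not exactly, the true value.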

Next topic: Grouping the observations

Version 6.15
Published 18-Apr-2023

© 2026 Analyse-it® Software, Ltd. All rights reserved.

Statistical analysis and method validation software for Microsoft Excel.
