Scatter plot

A scatter plot shows the association between two variables. A scatter plot matrix shows all pairwise scatter plots for many variables.

If the variables tend to increase and decrease together, the association is positive. If one variable tends to increase as the other decreases, the association is negative. If there is no pattern, the association is zero.
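The direction of an association can also be read from the sign of a correlation coefficient. The following is a minimal sketch in plain Python, not part of any particular package. The function name pearson_r and the small data sets are assumptions made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical illustration data, not taken from any real study.
x = [1, 2, 3, 4, 5, 6]
rising = [2, 4, 5, 4, 6, 7]        # tends to increase with x
falling = [7, 6, 4, 5, 4, 2]       # tends to decrease as x increases
patternless = [3, 5, 2, 2, 5, 3]   # no upward or downward trend

r_rising = pearson_r(x, rising)
r_falling = pearson_r(x, falling)
r_patternless = pearson_r(x, patternless)

print(f"positive association: r = {r_rising:.2f}")
print(f"negative association: r = {r_falling:.2f}")
print(f"no association:       r = {r_patternless:.2f}")
```

A coefficient above zero corresponds to a positive association, below zero to a negative association, and close to zero to no linear pattern.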

When a straight line describes the relationship between the variables, the association is linear. When a nonlinear function that consistently increases or consistently decreases describes the relationship, the association is monotonic. Other relationships may be both nonlinear and non-monotonic, for example a curve that rises and then falls.

The type of relationship determines the statistical measures and tests of association that are appropriate.
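As one illustration of why the type of relationship matters, the sketch below compares the product-moment (Pearson) coefficient with a rank-based (Spearman) coefficient on a monotonic but nonlinear relationship. The helper names and the data are assumptions for illustration, and the simple ranking function assumes there are no tied values.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank from 1 (smallest) to n. Assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(xs, ys):
    """Rank correlation: the Pearson coefficient computed on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

# Hypothetical data: y always increases with x, but along a curve.
x = list(range(1, 11))
y = [math.exp(v) for v in x]

r_linear = pearson_r(x, y)
rho_monotonic = spearman_rho(x, y)

print(f"Pearson r    = {r_linear:.3f}")
print(f"Spearman rho = {rho_monotonic:.3f}")
```

Because y always increases with x, the rank-based coefficient reports a perfect monotonic association, while the product-moment coefficient is noticeably lower because the points do not lie on a straight line.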

If the association is linear, a bivariate normal density ellipse summarizes the correlation between the variables. The narrower the ellipse, the stronger the correlation. The wider and rounder the ellipse, the weaker the correlation. If the association is nonlinear, it is often worth trying to transform the data to make the relationship linear, because more statistics are available for analyzing linear relationships and they are easier to interpret.
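The sketch below illustrates both points under stated assumptions. For two standardized variables with correlation r, the axes of the density ellipse are proportional to the square roots of 1 + |r| and 1 - |r|, so the ratio of the minor axis to the major axis shrinks toward zero as the correlation strengthens. The data, the choice of a log transformation, and the function names are hypothetical and chosen only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ellipse_axis_ratio(r):
    """Minor-to-major axis ratio of the density ellipse.

    Assumes both variables are standardized: 1.0 is a circle (uncorrelated),
    values near 0 are a very narrow ellipse (strongly correlated).
    """
    return math.sqrt((1 - abs(r)) / (1 + abs(r)))

# Hypothetical data: roughly exponential growth with some scatter.
x = list(range(1, 11))
scatter = [1.1, 0.9, 1.05, 0.95, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0]
y = [2 * math.exp(0.5 * xi) * s for xi, s in zip(x, scatter)]

# Taking logs turns exponential growth into a straight-line trend.
log_y = [math.log(v) for v in y]

r_raw = pearson_r(x, y)
r_log = pearson_r(x, log_y)

print(f"raw data:        r = {r_raw:.3f}, axis ratio = {ellipse_axis_ratio(r_raw):.3f}")
print(f"log-transformed: r = {r_log:.3f}, axis ratio = {ellipse_axis_ratio(r_log):.3f}")
```

After the transformation the correlation is closer to 1 and the axis ratio is smaller, which corresponds to a narrower ellipse. A log transformation is only one option. Which transformation, if any, linearizes a relationship depends on the data.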

An observation that appears detached from the bulk of observations may be an outlier requiring further investigation. An observation's value on each variable may be perfectly reasonable on its own, yet the observation can still appear as an outlier when the two values are plotted together on a scatter plot. Outliers can badly affect the product-moment correlation coefficient, whereas rank-based correlation coefficients, such as Spearman's, are more robust to them.
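The effect of a single outlier can be sketched as follows. The data are hypothetical, the added point is deliberately extreme, and the rank-based coefficient is computed with a simple ranking helper that assumes no tied values.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank from 1 (smallest) to n. Assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(xs, ys):
    """Rank correlation: the Pearson coefficient computed on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

# Hypothetical data with a strong positive linear association.
x_clean = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y_clean = [1.2, 1.9, 3.1, 4.0, 5.2, 5.8, 7.1, 8.0, 9.1, 9.9]

# The same data with one extreme observation added.
x_outlier = x_clean + [11]
y_outlier = y_clean + [-30.0]

pearson_clean = pearson_r(x_clean, y_clean)
pearson_with_outlier = pearson_r(x_outlier, y_outlier)
spearman_clean = spearman_rho(x_clean, y_clean)
spearman_with_outlier = spearman_rho(x_outlier, y_outlier)

print(f"Pearson:  {pearson_clean:.3f} -> {pearson_with_outlier:.3f}")
print(f"Spearman: {spearman_clean:.3f} -> {spearman_with_outlier:.3f}")
```

With this data the single added point is enough to turn the product-moment coefficient negative, while the rank-based coefficient falls to 0.5 but still indicates a positive association.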