# Pearson correlation in Microsoft Excel

Pearson correlation is a test to determine the degree of correlation (association) between two variables.

The requirements of the test are:

- Two variables measured on a continuous scale.
- Variables are from a population with a bivariate normal distribution.

## Arranging the dataset

Data in existing Excel worksheets can be used and should be arranged in a List dataset layout. The dataset must contain two continuous scale variables.

When entering new data we recommend using New Dataset to create a new **2 variables** dataset ready for data entry.

**Using t**he test

**Using t**

To start the test:

- Excel 2007:

Select any cell in the range containing the dataset to analyse, then click**Correlation**on the**Analyse-it**tab, then click**Pearson**. - Click
**Variable X**and**Variable Y**and select the variables. - Click
**Alternative hypothesis**and select the alternative hypothesis to test. - Enter
**Confidence interval**to calculate around the Pearson*r*statistic. The level should be entered as a percentage between 50 and 100, without the % sign. - Click
**OK**to run the test.

Excel 97, 2000, 2002 & 2003:

Select any cell in the range containing the dataset to analyse, then click **Analyse **on the **Analyse-it **toolbar, click **Correlation** then click **Pearson**.

r ≠ 0 to test if the variables are correlated. |

r > 0 to test if the variables are positively correlated, where observations of the variables tend to increase together. |

r < 0 to test if the variables are negatively correlated, where observations of one variable tend to increase as observations in the other variable decrease. |

The report shows the number of observations analysed, and, if applicable, how many missing cases were pairwise deleted.

The Pearson r correlation statistic and confidence interval are shown.

** METHOD ** The confidence interval is calculated using the Fisher's Normal transformation (see [1] or [2]).

The hypothesis test is shown. The *p*-value is the probability of rejecting the null hypothesis, that the variables are independent, when it is in fact true. A significant p-value implies that the two variables are correlated.

** METHOD ** The p-value is calculated using the *t* approximation (see [1]).

The scatter plot (see below) shows a visual assessment of the strength of association.

## Further reading & references

- Handbook of Parametric and Nonparametric Statistical Procedures (3rd edition)

David J. Sheskin, ISBN 1-58488-440-1 2003; 945. - Statistics with Confidence (2nd edition)

Gardner M.J., Altman D.G. ISBN 0-7279-1375-1 2000; 89-90.

- Welcome
- Getting started
- What's new in this version
- Installing Analyse-it
- Starting Analyse-it
- Defining Datasets
- Setting Variable properties
- Running a statistical test
- Working with analysis reports
- Analyse-it Standard edition
- Describe
- Compare groups
- Compare pairs
- Correlation
- Pearson correlation
- Spearman correlation
- Kendall correlation
- Agreement
- Regression
- Analyse-it Method Evaluation edition
- Citing Analyse-it
- Contact us
- About us

Version 2.30

Published 9-Jun-2009