# Chi-square test

The Pearson Chi-square test is a non-parametric test for a difference in proportions between two or more independent samples. It can also be used to test whether two categorical variables are associated.

The requirements of the test are:

- Two or more independent samples measured on a nominal or ordinal scale, *or* two variables measured on a nominal or ordinal scale.
- No more than 20% of the expected counts in the contingency table are < 5.

## Arranging the dataset

Data in existing Excel worksheets can be used and should be arranged in a List dataset layout or Table dataset layout containing two nominal or ordinal scale variables. If only a summary of the number of subjects for each combination of groups is available (contingency table) then a 2-way table dataset containing counts can be used.

When entering new data we recommend using New Dataset to create a new **2 variables (categorical)** dataset or **R x C contingency table** ready for data entry.
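The relationship between the list layout and the 2-way table of counts can be sketched in a few lines of Python. This is only an illustration of the cross-classification, not part of Analyse-it; the data and the helper name `cross_tabulate` are hypothetical.

```python
from collections import Counter

# Hypothetical list-layout data: one row per subject, two categorical variables.
rows = [
    ("Treatment", "Improved"),
    ("Treatment", "Improved"),
    ("Treatment", "Not improved"),
    ("Control", "Improved"),
    ("Control", "Not improved"),
    ("Control", "Not improved"),
]


def cross_tabulate(rows):
    """Count the subjects for each combination of the two factors."""
    counts = Counter(rows)
    # Keep the levels of each factor in first-seen order.
    levels_a = list(dict.fromkeys(a for a, _ in rows))
    levels_b = list(dict.fromkeys(b for _, b in rows))
    table = [[counts[(a, b)] for b in levels_b] for a in levels_a]
    return levels_a, levels_b, table


levels_a, levels_b, table = cross_tabulate(rows)
print(levels_b)
for level, counts_row in zip(levels_a, table):
    print(level, counts_row)
```

Either form carries the same information for the test: the list layout holds one row per subject, while the 2-way table holds only the count for each combination.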

## Using the test

To start the test:

- Excel 2007: Select any cell in the range containing the dataset to analyse, then click **Compare Groups** on the **Analyse-it** tab, then click **Pearson Chi-square**.

  Excel 97, 2000, 2002 & 2003: Select any cell in the range containing the dataset to analyse, then click **Analyse** on the **Analyse-it** toolbar, click **Compare Groups**, then click **Pearson Chi-square**.
- Click **Factor A** and **Factor B** and select the variables to compare.
- Click **OK** to run the test.

The report shows the number of observations analysed, and, if applicable, how many missing values were excluded.

The number of observations cross-classified by the two factors is shown as a contingency table. Beneath each count, in brackets, is the expected count. No more than 20% of the expected cell counts should be < 5; otherwise, groups should be combined to increase the expected counts.
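As a rough illustration of the expected counts and the 20% rule, the sketch below computes each expected count as (row total × column total) / grand total and reports the fraction of cells below 5. The table values and function names are hypothetical, and this is not Analyse-it's own code.

```python
def expected_counts(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def fraction_small_expected(expected, threshold=5):
    """Fraction of cells whose expected count is below the threshold."""
    cells = [e for row in expected for e in row]
    return sum(e < threshold for e in cells) / len(cells)


# Hypothetical 2 x 3 table; the third column is sparse.
observed = [[12, 8, 3],
            [10, 15, 2]]
expected = expected_counts(observed)
small = fraction_small_expected(expected)
print(f"{small:.0%} of expected counts are < 5")

# Combining the second and third columns raises the expected counts.
combined = [[row[0], row[1] + row[2]] for row in observed]
small_combined = fraction_small_expected(expected_counts(combined))
print(f"{small_combined:.0%} of expected counts are < 5 after combining")
```

In this hypothetical table two of the six expected counts fall below 5, which exceeds the 20% limit; after merging the sparse column into its neighbour, none do.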

The Pearson *X²* statistic and hypothesis test are shown. The *p*-value is the probability of obtaining an *X²* statistic at least as large as the one observed when the null hypothesis, that the samples have the same proportions or that the variables are independent, is in fact true. A significant *p*-value implies that at least two samples have different proportions or that there is an association between the variables.
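For readers who want to see the arithmetic, the following is a minimal sketch of the Pearson statistic, X² = Σ (observed − expected)² / expected, with (rows − 1) × (columns − 1) degrees of freedom. The table is hypothetical, and the helper `chi_square_sf` is an assumption of this sketch: it evaluates the upper tail of the chi-square distribution using closed forms that hold for integer degrees of freedom, so that only the Python standard library is needed. It is not Analyse-it's implementation.

```python
import math


def pearson_chi_square(table):
    """Return the Pearson X² statistic and its degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


def chi_square_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        return math.exp(-half) * sum(
            half ** k / math.factorial(k) for k in range(df // 2)
        )
    # Odd df: erfc(sqrt(x/2)) plus a finite correction sum.
    m = (df - 1) // 2
    tail = math.erfc(math.sqrt(half))
    tail += math.exp(-half) * sum(
        half ** (k - 0.5) / math.gamma(k + 0.5) for k in range(1, m + 1)
    )
    return tail


# Hypothetical 2 x 2 table of counts.
observed = [[30, 10],
            [20, 20]]
statistic, df = pearson_chi_square(observed)
p_value = chi_square_sf(statistic, df)
print(f"X² = {statistic:.3f}, df = {df}, p = {p_value:.4f}")
```

For this table every expected count is 25 or 15, giving X² of about 5.33 on 1 degree of freedom and a *p*-value of roughly 0.02.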

**METHOD** Yates' correction for continuity is **not** applied when the contingency table is 2 x 2 (see [2] page 195).
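To show what the correction would change, the sketch below computes the statistic for a hypothetical 2 x 2 table both ways. Yates' correction subtracts 0.5 from each |observed − expected| before squaring; the uncorrected value corresponds to the behaviour described in the note above. The function name and the `yates` parameter are assumptions of this sketch, not Analyse-it options.

```python
def chi_square_statistic(table, yates=False):
    """Pearson X² statistic, optionally with Yates' continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                # Shrink each deviation by 0.5, never below zero.
                diff = max(diff - 0.5, 0.0)
            statistic += diff ** 2 / expected
    return statistic


# Hypothetical 2 x 2 table of counts.
observed = [[30, 10],
            [20, 20]]
uncorrected = chi_square_statistic(observed)
corrected = chi_square_statistic(observed, yates=True)
print(f"Uncorrected X² = {uncorrected:.3f}")
print(f"Yates-corrected X² = {corrected:.3f}")
```

The corrected statistic is always the smaller of the two, so results from software that applies the correction by default will differ from the uncorrected value for 2 x 2 tables.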

## Further reading & references

1. Handbook of Parametric and Nonparametric Statistical Procedures (3rd edition). David J. Sheskin, ISBN 1-58488-440-1 2003; 219.
2. Practical Non-parametric Statistics (3rd edition). Conover W.J. ISBN 0-471-16068-7 1999; 180-215.
