You are viewing documentation for the old version 2.30 of Analyse-it. If you are using version 3.00 or later we recommend you go to the Chi-square test.

Chi-square test

The Pearson Chi-square test is a non-parametric test for a difference in proportions between two or more independent samples. It can also be used to test whether two categorical variables are associated.

The requirements of the test are:

  • Two or more independent samples measured on a nominal or ordinal scale.
    or
    Two variables measured on a nominal or ordinal scale.
  • No more than 20% of the expected counts in the contingency table are < 5.


Arranging the dataset

Data in existing Excel worksheets can be used and should be arranged in a List dataset layout or Table dataset layout containing two nominal or ordinal scale variables. If only a summary of the number of subjects for each combination of groups is available (contingency table) then a 2-way table dataset containing counts can be used.
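The relationship between the list layout and the 2-way table of counts can be illustrated outside Excel. The following is a minimal Python sketch, not part of Analyse-it; the data values and the function name cross_tabulate are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical list-layout data: one (Factor A, Factor B) pair per subject.
observations = [
    ("Treatment", "Improved"),
    ("Treatment", "Improved"),
    ("Treatment", "Not improved"),
    ("Control", "Improved"),
    ("Control", "Not improved"),
    ("Control", "Not improved"),
]


def cross_tabulate(pairs):
    """Count the subjects for each combination of the two factors."""
    counts = Counter(pairs)
    # Keep the levels of each factor in order of first appearance.
    rows = list(dict.fromkeys(a for a, _ in pairs))
    cols = list(dict.fromkeys(b for _, b in pairs))
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table


rows, cols, table = cross_tabulate(observations)
print(cols)
for label, row in zip(rows, table):
    print(label, row)
```

Each row of the resulting table corresponds to a level of the first variable and each column to a level of the second, which is the same summary a 2-way table dataset holds directly.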

When entering new data we recommend using New Dataset to create a new 2 variables (categorical) dataset or R x C contingency table ready for data entry.

Using the test

To start the test:

  1. Excel 2007:
    Select any cell in the range containing the dataset to analyse, then click Compare Groups on the Analyse-it tab, then click Pearson Chi-square.
  2. Excel 97, 2000, 2002 & 2003:
    Select any cell in the range containing the dataset to analyse, then click Analyse on the Analyse-it toolbar, click Compare Groups then click Pearson Chi-square.

  3. Click Factor A and Factor B and select the variables to compare.
  4. Click OK to run the test.

The report shows the number of observations analysed, and, if applicable, how many missing values were excluded.

The number of observations cross-classified by the two factors is shown as a contingency table. Beneath each count, in brackets, is the expected count. No more than 20% of the expected cell counts should be < 5; otherwise, groups should be combined to increase the expected counts.
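As an illustration of how expected counts arise, the sketch below computes them as row total x column total / grand total, checks the 20% rule, and then combines two groups. It is a minimal Python sketch under stated assumptions, not Analyse-it's own code; the table values and function names are hypothetical.

```python
def expected_counts(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def share_of_small_expected(table, threshold=5):
    """Fraction of cells whose expected count is below the threshold."""
    cells = [e for row in expected_counts(table) for e in row]
    return sum(e < threshold for e in cells) / len(cells)


# Hypothetical 3 x 2 table of observed counts.
observed = [[12, 8], [9, 11], [2, 3]]
expected = expected_counts(observed)

# Print each count with its expected count in brackets.
for obs_row, exp_row in zip(observed, expected):
    print("  ".join(f"{o} ({e:.1f})" for o, e in zip(obs_row, exp_row)))

rule_met = share_of_small_expected(observed) <= 0.20

# Combine the last two groups (rows) and check the rule again.
combined = [observed[0], [a + b for a, b in zip(observed[1], observed[2])]]
rule_met_after = share_of_small_expected(combined) <= 0.20
print(rule_met, rule_met_after)
```

In this made-up table, two of the six expected counts fall below 5, so the rule is not met until the two smallest groups are merged. Whether merging groups is meaningful depends on the data and is a judgement for the analyst.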

The Pearson statistic and hypothesis test are shown. The p-value is the probability of observing a statistic at least as extreme as the one calculated when the null hypothesis, that the samples have the same proportions or that the variables are independent, is in fact true. A significant p-value implies that at least two samples have different proportions or that there is an association between the variables.
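For readers who want to see the arithmetic behind the statistic, the following minimal Python sketch computes the Pearson statistic as the sum of (observed - expected)^2 / expected over all cells, with (rows - 1) x (columns - 1) degrees of freedom, and obtains the p-value from the upper tail of the chi-square distribution. It is an illustration only and is not taken from Analyse-it; the function names and the example table are hypothetical.

```python
import math


def pearson_chi_square(table):
    """Return the Pearson statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


def chi_square_p_value(statistic, df):
    """Upper-tail probability of a chi-square distribution with integer df >= 1."""
    half = statistic / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum of (x/2)^i / i! for i = 0 .. df/2 - 1.
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: complementary error function plus a finite series.
    p = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * statistic / math.pi) * math.exp(-half)
    for i in range(1, (df - 1) // 2 + 1):
        p += term
        term *= statistic / (2 * i + 1)
    return p


# Hypothetical 2 x 2 table: two samples, two outcome categories.
statistic, df = pearson_chi_square([[30, 20], [20, 30]])
p_value = chi_square_p_value(statistic, df)
print(f"chi-square = {statistic:.2f}, df = {df}, p = {p_value:.4f}")
```

The closed-form tail probability is used here only to keep the sketch free of external libraries; a statistics package would normally supply it.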

METHOD  Yates' correction for continuity is not applied when the contingency table is 2 x 2 (see [2] page 195).
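To show what leaving out the correction means in practice, the sketch below compares the uncorrected Pearson statistic with the Yates-corrected one, which subtracts 0.5 from each |observed - expected| before squaring. This is a minimal Python illustration on a hypothetical 2 x 2 table, not Analyse-it's implementation; the function name is an assumption.

```python
def chi_square_2x2(table, yates=False):
    """Pearson statistic for a 2 x 2 table, optionally with Yates' correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                # Yates' correction: shrink each deviation by 0.5 (not below 0).
                diff = max(diff - 0.5, 0.0)
            statistic += diff ** 2 / expected
    return statistic


table = [[30, 20], [20, 30]]  # hypothetical counts
uncorrected = chi_square_2x2(table)            # no continuity correction
corrected = chi_square_2x2(table, yates=True)  # shown for comparison only
print(f"uncorrected = {uncorrected:.2f}, Yates-corrected = {corrected:.2f}")
```

The corrected statistic is never larger than the uncorrected one, so results computed with the correction in other software can be expected to look somewhat less significant than the uncorrected statistic reported here.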

Further reading & references

  1. Handbook of Parametric and Nonparametric Statistical Procedures (3rd edition)
    Sheskin D.J. ISBN 1-58488-440-1 2003; 219.
  2. Practical Non-parametric Statistics (3rd edition)
    Conover W.J. ISBN 0-471-16068-7 1999; 180-215.
