# Contingency table

A contingency table, also known as a cross-classification table, describes the relationships between two or more categorical variables.

A table cross-classifying two variables is called a 2-way contingency table and forms a rectangular table with rows for the R categories of the X variable and columns for the C categories of the Y variable. Each intersection is called a cell and represents one possible combination of X and Y outcomes. Each cell contains the frequency with which that combination occurs. A contingency table having R rows and C columns is called an R x C table.
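As a rough illustration of how the cell frequencies arise, the sketch below tallies a small set of raw (X, Y) observations into an R x C table. The observations, the category labels, and the variable names are made up for the example and are not taken from any real data set.

```python
from collections import Counter

# Hypothetical raw observations: one (X, Y) pair per subject.
observations = [
    ("Low", "Yes"), ("Low", "No"), ("Low", "No"),
    ("High", "Yes"), ("High", "Yes"), ("High", "No"),
]

row_levels = ["Low", "High"]  # the R categories of X
col_levels = ["Yes", "No"]    # the C categories of Y

# Count how often each (X, Y) combination occurs.
counts = Counter(observations)

# Build the R x C table of frequencies; a combination that never occurs gets 0.
table = [[counts[(r, c)] for c in col_levels] for r in row_levels]

for label, row in zip(row_levels, table):
    print(label, row)
```

Fixing the order of the row and column levels up front keeps the layout stable even when some combinations never appear in the data.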

A variable having only two categories is called a binary variable. When both variables are binary, the resulting contingency table is a 2 x 2 table, also commonly known as a four-fold table because it has four cells.

| Alcohol consumption | Smoke: Yes | Smoke: No | Total |
| --- | --- | --- | --- |
| Low | 10 | 80 | 90 |
| High | 50 | 40 | 90 |
| Total | 60 | 120 | 180 |

A contingency table can summarize three probability distributions – joint, marginal, and conditional.

- The joint distribution describes the proportion of subjects jointly classified by a category of X and a category of Y. The cell counts divided by the grand total provide the joint distribution. The joint proportions sum to 1.
- The marginal distributions describe the distribution of the X (row) or Y (column) variable alone. The row totals and the column totals, each divided by the grand total, provide the marginal distributions. Each marginal distribution sums to 1.
- The conditional distributions describe the distribution of one variable given the levels of the other variable. The cell counts divided by their row totals or column totals provide the conditional distributions. Each conditional distribution (one per row or per column) sums to 1.
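The three distributions can be sketched with the counts from the alcohol consumption and smoking table above. This is a minimal illustration in plain Python; the variable names are chosen for the example only.

```python
# Cell counts from the alcohol consumption x smoking table.
# Rows: alcohol consumption (Low, High); columns: smoke (Yes, No).
counts = [[10, 80],
          [50, 40]]

n = sum(sum(row) for row in counts)              # grand total: 180
row_totals = [sum(row) for row in counts]        # [90, 90]
col_totals = [sum(col) for col in zip(*counts)]  # [60, 120]

# Joint distribution: each cell divided by the grand total.
joint = [[c / n for c in row] for row in counts]

# Marginal distributions: row or column totals divided by the grand total.
row_marginal = [t / n for t in row_totals]
col_marginal = [t / n for t in col_totals]

# Conditional distribution of Y given X: each cell divided by its row total.
cond_y_given_x = [[c / t for c in row] for row, t in zip(counts, row_totals)]

# Conditional distribution of X given Y: each cell divided by its column total.
cond_x_given_y = [[c / t for c, t in zip(row, col_totals)] for row in counts]

print("joint:", joint)
print("row marginal:", row_marginal)
print("column marginal:", col_marginal)
print("Y given X:", cond_y_given_x)
print("X given Y:", cond_x_given_y)
```

Each row of `cond_y_given_x` sums to 1, as does each column of `cond_x_given_y`. For example, the proportion of smokers among subjects with low alcohol consumption is 10/90, against 50/90 among those with high consumption.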

When both variables are random, you can describe the data using the joint distribution, the conditional distribution of Y given X, or the conditional distribution of X given Y.

When one variable is an explanatory variable (X, fixed) and the other is a response variable (Y, random), the notion of a joint distribution is meaningless, and you should describe the data using the conditional distribution of Y given X. Likewise, if Y is fixed and X is random, you should describe the data using the conditional distribution of X given Y.

When the variables are matched pairs or repeated measurements on the same sampling unit, the table is square (R = C), with the same categories on both the rows and the columns. For these tables, the cells may exhibit a symmetric pattern about the main diagonal of the table, or the two marginal distributions may differ in some systematic way.

| Before | After 6 months: Approve | After 6 months: Disapprove | Total |
| --- | --- | --- | --- |
| Approve | 794 | 150 | 944 |
| Disapprove | 86 | 570 | 656 |
| Total | 880 | 720 | 1600 |
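For the before and after approval table, the sketch below compares the two marginal distributions and looks at the off-diagonal cells, which hold the subjects whose response changed. It is a descriptive illustration only, not a formal test, and the variable names are chosen for the example.

```python
# Cell counts from the before / after 6 months approval table.
# Rows: before (Approve, Disapprove); columns: after 6 months (Approve, Disapprove).
counts = [[794, 150],
          [86, 570]]

n = sum(sum(row) for row in counts)  # 1600 matched pairs
size = len(counts)                   # square table: R = C

# Marginal distributions of the same question at the two time points.
before_marginal = [sum(row) / n for row in counts]
after_marginal = [sum(col) / n for col in zip(*counts)]

# Off-diagonal cells: subjects whose response changed.
approve_to_disapprove = counts[0][1]  # 150
disapprove_to_approve = counts[1][0]  # 86

# The table is symmetric if every cell equals its mirror image across the diagonal.
is_symmetric = all(
    counts[i][j] == counts[j][i] for i in range(size) for j in range(size)
)

# Change in the proportion approving between the two time points.
change_in_approval = after_marginal[0] - before_marginal[0]

print("before:", before_marginal)
print("after:", after_marginal)
print("symmetric:", is_symmetric)
print("change in approval:", change_in_approval)
```

The change in the approval proportion depends only on the off-diagonal cells: it equals (86 - 150) / 1600. The diagonal cells, where the response did not change, contribute equally to both marginal distributions.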

Published 8-Jan-2017

Version 4.90