# Summary statistics, Box/Dot/Mean plots

This procedure is available in both the Analyse-it Standard and the Analyse-it Method Evaluation editions.

Summary statistics present a statistical and visual overview of independent samples. A side-by-side combined dot, box, mean, percentile and SD plot gives a visual summary, while statistics such as the mean, median, standard deviation, percentiles, quartiles, skewness and kurtosis summarise each sample numerically.

The requirements of the test are:

- At least two independent samples measured on a continuous scale.

- Arranging the dataset
- Using the test
- Customising the frequency histogram
- Examining the observations with a dot plot
- Customising the box and percentile plots
- Customising the mean and SD plots
- Assessing normality
- References to further reading

## Arranging the dataset

Data in existing Excel worksheets can be used and should be arranged in a List dataset layout or Table dataset layout. The dataset must contain a continuous scale variable and a nominal/ordinal scale variable containing two or more independent groups.

When entering new data we recommend using New Dataset to create a new 2 variables (1 categorical) dataset ready for data entry.

## Using the test

To start the summary statistics test:

- Select any cell in the range containing the dataset to analyse, then:

  Excel 2007: click **Compare Groups** on the **Analyse-it** tab, then click **Summary**.

  Excel 97, 2000, 2002 & 2003: click **Analyse** on the **Analyse-it** toolbar, click **Compare Groups**, then click **Summary**.
- Tick **Variables** to compare.
- Tick **Parametric - Mean, SD, SE** to show parametric statistics.
- Tick **Non-parametric - Median, Percentiles** to show non-parametric statistics.
- Click **OK** to run the test.

The report shows box and mean plots and summary statistics for the sample. Summary statistics include the number of observations analysed, the mean, median, standard deviation, standard error, min, max and interquartile range (IQR).
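As a rough illustration of the statistics listed above, the following Python sketch computes them for one sample using only the standard library. The function name `summarise` is hypothetical, and the quartile method (`method="inclusive"`) is an assumption: Analyse-it may calculate quartiles differently, so the IQR can differ slightly from the report.

```python
import math
import statistics

def summarise(sample):
    """Return the summary statistics listed in the report for one sample."""
    n = len(sample)
    sd = statistics.stdev(sample)  # sample SD (n - 1 denominator)
    # Assumption: linear-interpolation ("inclusive") quartiles.
    q1, _, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    return {
        "n": n,
        "mean": statistics.mean(sample),
        "median": statistics.median(sample),
        "sd": sd,
        "se": sd / math.sqrt(n),  # standard error of the mean
        "min": min(sample),
        "max": max(sample),
        "iqr": q3 - q1,
    }

# Two illustrative groups (made-up values), summarised side by side.
groups = {
    "A": [4.1, 5.0, 5.2, 5.9, 6.3, 7.0],
    "B": [3.2, 3.8, 4.4, 4.9, 5.1, 9.7],
}
for name, values in groups.items():
    print(name, summarise(values))
```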

Confidence intervals are calculated for the mean and median. For the chosen confidence level, each interval gives the range within which the true population mean or median is expected to lie.

To change the confidence interval calculated:

- If the Summary statistics dialog box is not visible click **Edit** on the **Analyse-it** tab/toolbar.
- Enter **Confidence interval** to calculate around the mean & median. The level should be entered as a percentage, between 50 and 100, without the % sign.
- Click **OK**.
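To make the effect of the confidence level concrete, here is a minimal Python sketch of two textbook interval calculations. It does not claim to reproduce Analyse-it's own method. All function names are hypothetical. The mean interval uses a large-sample normal approximation (a t-based interval is more usual for small samples), the median interval uses binomial order statistics, and a level of exactly 100 is rejected here because it would give an unbounded interval.

```python
import math
from statistics import NormalDist, mean, stdev

def level_from_percent(percent):
    """Convert a percentage such as 95 into a proportion such as 0.95."""
    # Assumption: 100 is excluded because the interval would be unbounded.
    if not 50 <= percent < 100:
        raise ValueError("confidence level must be at least 50 and below 100")
    return percent / 100

def mean_ci(sample, percent=95):
    """Large-sample (normal approximation) confidence interval for the mean."""
    level = level_from_percent(percent)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    m = mean(sample)
    half_width = z * stdev(sample) / math.sqrt(len(sample))
    return m - half_width, m + half_width

def median_ci(sample, percent=95):
    """Distribution-free confidence interval for the median (order statistics)."""
    level = level_from_percent(percent)
    xs = sorted(sample)
    n = len(xs)
    alpha = 1 - level
    tail = 0.0  # P(B <= k - 1) for B ~ Binomial(n, 0.5)
    k = 0
    # Move inwards while the two tails together stay within alpha.
    while k < n // 2:
        next_tail = tail + math.comb(n, k) / 2 ** n  # P(B <= k)
        if 2 * next_tail > alpha:
            break
        tail = next_tail
        k += 1
    if k == 0:
        raise ValueError("sample too small for this confidence level")
    # Interval from the k-th smallest to the k-th largest observation.
    return xs[k - 1], xs[n - k]

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(mean_ci(data, 95))
print(median_ci(data, 95))
```

Raising the level widens both intervals; with very small samples the order-statistic interval may not be attainable at all, which the sketch reports as an error.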

## Examining the observations with a dot plot

Dot plots show individual observations to allow visual assessment of the distribution and clustering of observations, and to spot possible outliers or data entry errors. Observations are plotted on the X axis against a random value on the Y axis to minimise overlapping points.
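The jittering idea can be sketched in a few lines of Python. The function name `jitter_points`, the `spread` parameter and the uniform distribution are all assumptions made for illustration; the source only states that the Y value is random.

```python
import random

def jitter_points(observations, spread=0.5, seed=None):
    """Pair each observation (X) with a random Y offset in [-spread, spread]."""
    rng = random.Random(seed)  # a fixed seed makes the layout reproducible
    return [(x, rng.uniform(-spread, spread)) for x in observations]

# Three identical observations would overlap without the random Y offset.
points = jitter_points([5.1, 5.1, 5.1, 6.0, 9.7], seed=1)
for x, y in points:
    print(f"x={x:.1f}  y={y:+.3f}")
```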

To show dot plots on the univariate plot:

- If the Summary statistics dialog box is not visible click **Edit** on the **Analyse-it** tab/toolbar.
- Tick **Dot plots**.
- Click **OK**.

## Customising the box and percentile plots

**Box** and percentile plots show the non-parametric central tendency, dispersion and distribution shape of the sample. Box plot styles vary between publications, with the most common styles mainly differing in how the whiskers are drawn. The box plot style determines how the whiskers are shown:

- **Outlier** box plots show whiskers extending to the furthest observations within 1.5 IQRs (interquartile ranges) of the 1st or 3rd quartile. Observations outside 1.5 IQRs are marked as near outliers, and those outside 3.0 IQRs are marked as far outliers (see below).
- **Skeletal** box plots show whiskers extending to the minimum and maximum observations (see below).
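The two whisker rules can be sketched in Python as follows. The function names are hypothetical, the quartile method (`method="inclusive"`) is an assumption, and so is the treatment of observations lying exactly on a fence; implementations differ on these details, as the Frigge, Hoaglin and Iglewicz reference discusses.

```python
import statistics

def outlier_box_plot(sample):
    """Whisker ends and outliers for an outlier-style box plot."""
    xs = sorted(sample)
    # Assumption: linear-interpolation ("inclusive") quartiles.
    q1, _, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1
    near_lo, near_hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr  # inner fences
    far_lo, far_hi = q1 - 3.0 * iqr, q3 + 3.0 * iqr    # outer fences
    inside = [x for x in xs if near_lo <= x <= near_hi]
    return {
        # Whiskers reach the furthest observations inside the inner fences.
        "whiskers": (inside[0], inside[-1]),
        "near_outliers": [x for x in xs
                          if far_lo <= x < near_lo or near_hi < x <= far_hi],
        "far_outliers": [x for x in xs if x < far_lo or x > far_hi],
    }

def skeletal_whiskers(sample):
    """Whisker ends for a skeletal box plot: the minimum and maximum."""
    return min(sample), max(sample)

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 40]
print(outlier_box_plot(data))
print(skeletal_whiskers(data))
```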

The box-plot can be shown as a rectangular box, or notched to show the confidence interval of the median.

- **Basic** box plots show a simple rectangular box-plot, from the first to the third quartile, with the median marked in the centre (see below).
- **Notched** box plots show a basic box plot as above, with the addition of a notched (pinched or indented) section for the confidence interval around the median (see below).

To change the box plot:

- If the Summary statistics dialog box is not visible click **Edit** on the **Analyse-it** tab/toolbar.
- Click **Box plot** then select **Skeletal** or **Outlier**.
- Click **Style** then select **Basic**, **Notched**, or **Notched / Basic**. Notched / Basic shows a notched box plot when the confidence interval of the median lies within the quartiles, otherwise it reverts to a basic box plot to avoid an ugly plot with the median notch extending beyond the quartiles.
- Click **OK**.
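The Notched / Basic fallback amounts to a simple containment check, sketched below in Python. The function name `choose_box_style` and its parameters are hypothetical; the quartiles and the confidence interval of the median are assumed to have been computed already.

```python
def choose_box_style(q1, q3, ci_low, ci_high):
    """Return 'notched' if the median's CI fits inside the box, else 'basic'."""
    # The notch spans the confidence interval of the median, so it only
    # draws cleanly when both ends lie between the quartiles.
    if q1 <= ci_low and ci_high <= q3:
        return "notched"
    return "basic"

print(choose_box_style(q1=2.0, q3=8.0, ci_low=3.0, ci_high=7.0))
print(choose_box_style(q1=2.0, q3=8.0, ci_low=1.0, ci_high=7.0))
```

The second call falls back to a basic box plot because the lower end of the interval lies below the first quartile.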

To hide box plots:

- If the Summary statistics dialog box is not visible click **Edit** on the **Analyse-it** tab/toolbar.
- Click **Box plot** then select **None**.
- Click **OK**.

The range between two percentiles can be shown on the box or dot plots (the percentile values are also shown in the percentile table). The range can show where 80%, 90%, 95% or 99% of the observations of the sample lie.
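Each of these ranges is bounded by a symmetric pair of percentiles; for example, the central 90% of observations lies between the 5th and 95th percentiles. A minimal Python sketch follows, with hypothetical function names and an assumed linear-interpolation percentile definition that may differ from the one Analyse-it uses.

```python
def percentile(sorted_xs, p):
    """Linear-interpolation percentile (p from 0 to 100) of pre-sorted data."""
    pos = (len(sorted_xs) - 1) * p / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    return sorted_xs[lo] + (sorted_xs[hi] - sorted_xs[lo]) * (pos - lo)

def central_range(sample, coverage):
    """Percentile pair bounding the central `coverage` % of the sample."""
    if coverage not in (80, 90, 95, 99):
        raise ValueError("coverage must be 80, 90, 95 or 99")
    tail = (100 - coverage) / 2  # e.g. 90% coverage leaves 5% in each tail
    xs = sorted(sample)
    return percentile(xs, tail), percentile(xs, 100 - tail)

data = list(range(101))  # 0, 1, ..., 100, so each percentile equals its rank
for coverage in (80, 90, 95, 99):
    print(coverage, central_range(data, coverage))
```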

To change the percentiles shown:

- If the Summary statistics dialog box is not visible click **Edit** on the **Analyse-it** tab/toolbar.
- Click **Percentile plot** then select **None**, **80% of distribution**, **90% of distribution**, **95% of distribution** or **99% of distribution**.
- Click **OK**.

## Customising the mean and SD plots

**Mean** and SD plots show the parametric central tendency, dispersion and distribution shape.

The mean plot shows the mean as a vertical line, and optionally, the confidence interval for the mean as a diamond shape.

To change the mean plot:

- If the Summary statistics dialog box is not visible click **Edit** on the **Analyse-it** tab/toolbar.
- Click **Mean plot** then select **Mean line** or **Mean + CI diamond**.
- Click **OK**.

SD plots are similar to non-parametric percentile plots, but show the parametric dispersion of the sample. They are useful for assessing the symmetry and skew of the distribution, and can show either the mean ± 1, 2 or 3 standard deviations or the range expected to contain 80%, 90%, 95% or 99% of a normal distribution.
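Both kinds of SD range can be sketched in Python with the standard library. The function names are hypothetical. The percentage ranges assume a normal distribution with the sample's mean and SD, so a 95% range is approximately the mean ± 1.96 SD.

```python
from statistics import NormalDist, mean, stdev

def sd_range(sample, k):
    """Range of mean ± k sample standard deviations."""
    m, s = mean(sample), stdev(sample)
    return m - k * s, m + k * s

def normal_range(sample, coverage):
    """Range holding `coverage` % of a normal distribution fitted to the sample."""
    # Two-sided normal quantile, e.g. about 1.96 for 95% coverage.
    z = NormalDist().inv_cdf(0.5 + coverage / 200)
    return sd_range(sample, z)

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(sd_range(data, 2))        # mean ± 2 SD
print(normal_range(data, 95))   # central 95% under normality
```

Comparing these ranges with the percentile ranges of the same sample gives a quick impression of skew: in a symmetric, roughly normal sample the two should be similar.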

To change the SD plot:

- If the Summary statistics dialog box is not visible click **Edit** on the **Analyse-it** tab/toolbar.
- Click **Std Deviation plot** then select **±1 SD**, **±2 SD**, **±3 SD** or **80%**, **90%**, **95%** or **99% of distribution**.
- Click **OK**.

To hide the mean and/or SD plot:

- If the Summary statistics dialog box is not visible click **Edit** on the **Analyse-it** tab/toolbar.
- Click **Mean plot** then select **None**.
- Click **Std Deviation plot** then select **None**.
- Click **OK**.

## References to further reading

- Handbook of Parametric and Nonparametric Statistical Procedures (3rd edition)

  David J. Sheskin, ISBN 1-58488-440-1 2003.
- Some Implementations of the Boxplot

  Michael Frigge, David C. Hoaglin, Boris Iglewicz, The American Statistician Vol 43 No. 1 1989; 50-55.