Univariate descriptive statistics

Descriptive statistics provide information about the central location (central tendency), dispersion (variability or spread), and shape of the distribution.

There are many measures of central location, dispersion, and shape:

Statistic	Purpose
N	The number of non-missing values in a set of data.
Sum	The sum of the values in a set of data.
Mean	Measure the central tendency using the arithmetic mean.
Harmonic mean	Measure the central tendency using the reciprocal of the arithmetic mean of the reciprocals. Useful when the values are rates and ratios.
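Geometric mean	Measure the central tendency using the nth root of the product of the n values. Useful when the values are percentages or other quantities that combine multiplicatively, such as growth rates.
Variance	Measure the amount of variation using the average of the squared deviations of the values from their mean.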
Standard deviation	Measure the amount of variation using the square root of the variance.
CV% / RSD	Measure the spread relative to the mean (standard deviation divided by the mean, usually expressed as a percentage). Also known as the coefficient of variation or relative standard deviation.
Skewness	Measure the "lopsidedness" or asymmetry of the distribution. Skewness can be positive or negative. Negative skew indicates that the tail on the left side of the distribution is longer or fatter than the tail on the right side. Positive skew indicates the converse: the tail on the right side is longer or fatter than the tail on the left side. A value of zero indicates the tails on both sides balance; this is the case for symmetric distributions, but also for asymmetric distributions where a short fat tail balances out a long thin tail. Uses the adjusted Fisher-Pearson standardized moment coefficient G₁ definition of sample skewness.
Kurtosis	Measure the "tailedness" of the distribution, that is, whether the tails are heavy or light. Excess kurtosis is the kurtosis minus 3 and provides a comparison to the normal distribution. Positive excess kurtosis (called leptokurtic) indicates the distribution has fatter tails than a normal distribution. Negative excess kurtosis (called platykurtic) indicates the distribution has thinner tails than a normal distribution. Uses the G₂ definition of the sample excess kurtosis.
Median	Measure the central tendency using the middle value in a set of data.
Minimum	Smallest value in a set of data.
Maximum	Largest value in a set of data.
Range	The difference between the maximum and minimum.
1st quartile	The middle value between the smallest value and median in a set of data.
3rd quartile	The middle value between the median and largest value in a set of data.
Interquartile range	Measure the spread between the 1st and 3rd quartile.
Mode	The value that appears the most in a set of data.
Quantiles	A set of values that divide the range of the distribution into contiguous intervals defined by probabilities.
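
To make the definitions in the table concrete, the following is a minimal Python sketch that computes most of these statistics using only the standard library. It is an illustration, not Analyse-it's implementation: the function name describe and the dictionary keys are hypothetical, the quartiles use Python's "inclusive" method (other quartile definitions give slightly different values), and the G₁ and G₂ formulas are the usual textbook forms of the adjusted sample skewness and excess kurtosis.

```python
import math
import statistics as st


def describe(values):
    """Summarise one numeric variable; missing values are passed as None."""
    x = sorted(float(v) for v in values if v is not None)
    n = len(x)                      # N counts non-missing values only
    mean = st.fmean(x)
    variance = st.variance(x)       # sample variance (n - 1 denominator)
    sd = math.sqrt(variance)

    # Central moments with an n denominator feed the skewness and
    # kurtosis formulas. They fail if every value is identical (m2 == 0).
    m2, m3, m4 = (sum((v - mean) ** k for v in x) / n for k in (2, 3, 4))
    g1 = m3 / m2 ** 1.5
    g2 = m4 / m2 ** 2 - 3
    skewness = g1 * math.sqrt(n * (n - 1)) / (n - 2)                # G1, needs n >= 3
    kurtosis = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)   # G2, needs n >= 4

    # Assumption: "inclusive" quartiles; other definitions exist.
    q1, median, q3 = st.quantiles(x, n=4, method="inclusive")

    return {
        "n": n,
        "sum": math.fsum(x),
        "mean": mean,
        "harmonic_mean": st.harmonic_mean(x),    # expects positive values
        "geometric_mean": st.geometric_mean(x),  # expects positive values
        "variance": variance,
        "sd": sd,
        "cv_percent": 100 * sd / mean,
        "skewness": skewness,
        "excess_kurtosis": kurtosis,
        "median": median,
        "min": x[0],
        "max": x[-1],
        "range": x[-1] - x[0],
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
        "modes": st.multimode(x),                # every value tied for most frequent
    }


# Made-up example data; None marks a missing value.
stats = describe([2, 4, 4, 5, None, 7, 9, 11])
for name, value in stats.items():
    print(f"{name}: {value}")
```

The sketch returns all modes rather than a single one, because a set of data can have several values tied for the highest frequency.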

For normally distributed data, the mean and standard deviation provide the best measures of central location and dispersion.

For data with a non-normal or highly skewed distribution, or data with extreme values, the median and the first and third quartiles provide better measures of central location and dispersion, because they are less affected by extreme values than the mean and standard deviation. When the distribution of the data is symmetric, the interquartile range (IQR) is a useful measure of dispersion. Quantiles further describe the distribution of the data, either by providing an interval containing a specified proportion (for example, 95%) of the data, or by breaking the data into intervals each containing a proportion of the data (for example, deciles each containing 10% of the data).
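
The effect of an extreme value on these measures, and the two uses of quantiles, can be sketched in Python as follows. The data are made up for illustration, the helper name location_and_spread is hypothetical, and the quantiles again assume Python's "inclusive" method rather than any particular definition used by Analyse-it.

```python
import statistics as st


def location_and_spread(values):
    """Return mean/SD and median/IQR side by side for comparison."""
    q1, median, q3 = st.quantiles(values, n=4, method="inclusive")
    return {
        "mean": st.fmean(values),
        "sd": st.stdev(values),
        "median": median,
        "iqr": q3 - q1,
    }


# Made-up values: the second list replaces the largest value with an extreme one.
typical = [10, 11, 12, 13, 14, 15, 16]
extreme = [10, 11, 12, 13, 14, 15, 160]

print(location_and_spread(typical))
print(location_and_spread(extreme))  # mean and SD shift; median and IQR do not

# Quantiles as cut points: the 9 deciles split the data into 10 intervals,
# and the 2.5% and 97.5% quantiles bound a central interval of about 95%.
data = list(range(1, 101))
deciles = st.quantiles(data, n=10, method="inclusive")
fortieths = st.quantiles(data, n=40, method="inclusive")  # steps of 2.5%
central_95 = (fortieths[0], fortieths[-1])

print(deciles)
print(central_95)
```

With only one value changed, the mean and standard deviation move substantially while the median and interquartile range stay the same, which is why the latter pair is preferred when extreme values are present.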

Calculating univariate descriptive statistics, by group

Available in Analyse-it Editions
Standard edition
Method Validation edition
Quality Control & Improvement edition
Ultimate edition