# Residual plot

A residual plot shows the residuals: the differences between the observed response values and the fitted response values.

The ideal residual plot, called the null residual plot, shows a random scatter of points forming an approximately constant width band around the horizontal line at zero.
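As a minimal sketch of what is being plotted, the following Python computes residuals for a simple straight-line fit. It is not how Analyse-it computes them internally; the helper names `fit_line` and `residuals` and the data values are made up for illustration, and placing the fitted value on the horizontal axis is assumed here as a common convention.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Residual = observed response - fitted response."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Made-up illustrative data, not taken from any real study.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

intercept, slope = fit_line(x, y)
fitted = [intercept + slope * xi for xi in x]
resid = residuals(x, y, intercept, slope)

# Points for the plot: fitted value (horizontal) against residual (vertical).
for f, r in zip(fitted, resid):
    print(f"{f:8.3f} {r:8.3f}")
```

For a least-squares fit that includes an intercept, the residuals sum to zero, which is why the scatter is judged against the zero line.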

It is important to check the fit of the model and its assumptions (constant variance, normality, and independence of the errors) using the residual plot, along with the normal, sequence, and lag plots.

Assumption | How to check |
---|---|
Model function is linear | The points form a pattern when the model function is incorrect. You might be able to transform variables or add polynomial and interaction terms to remove the pattern. |
Constant variance | If the points tend to form an increasing, decreasing or non-constant width band, then the variance is not constant. You should consider transforming the response variable or incorporating weights into the model. When variance increases as a percentage of the response, you can use a log transform, although you should ensure it does not produce a poorly fitting model. Even with non-constant variance, the parameter estimates remain unbiased if somewhat inefficient. However, the hypothesis tests and confidence intervals are inaccurate. |
Normality | Examine the normal plot of the residuals to identify non-normality. Violation of the normality assumption only becomes an issue with small sample sizes. For large sample sizes, the assumption is less important due to the central limit theorem, and the fact that the F- and t-tests used for hypothesis tests and forming confidence intervals are quite robust to modest departures from normality. |
Independence | When the order of the cases in the dataset is the order in which they occurred: Examine a sequence plot of the residuals against the order to identify any dependency between the residual and time. Examine a lag-1 plot of each residual against the previous residual to identify a serial correlation, where observations are not independent, and there is a correlation between an observation and the previous observation. Time-series analysis may be more suitable to model data where serial correlation is present. |
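The checks in the table are primarily visual, but the quantities behind the plots can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the data are made up, the `(i + 0.5) / n` plotting position is one common choice among several, and `spread_ratio` is a crude heuristic assumed here for illustration, not a formal test and not something the product is known to report.

```python
from statistics import NormalDist, mean


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    a_bar, b_bar = mean(a), mean(b)
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    return sab / (saa * sbb) ** 0.5


def lag1_pairs(resid):
    """(previous residual, residual) pairs for a lag-1 plot.

    Assumes the residuals are listed in the order the cases occurred.
    """
    return list(zip(resid[:-1], resid[1:]))


def normal_plot_points(resid):
    """(theoretical normal quantile, ordered residual) pairs for a normal plot."""
    n = len(resid)
    std_normal = NormalDist()
    quantiles = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(quantiles, sorted(resid)))


def spread_ratio(fitted, resid):
    """Mean |residual| in the upper half of fitted values over the lower half.

    A value far from 1 hints at a band whose width changes with the fit.
    """
    ordered = [abs(r) for _, r in sorted(zip(fitted, resid))]
    half = len(ordered) // 2
    return mean(ordered[-half:]) / mean(ordered[:half])


# Made-up residuals that fan out as the fitted value increases.
example_fitted = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
example_resid = [0.1, -0.2, 0.3, -0.4, 0.8, -0.9, 1.2, -1.5]

previous, current = zip(*lag1_pairs(example_resid))
print("lag-1 correlation:", round(pearson(previous, current), 3))
print("spread ratio:", round(spread_ratio(example_fitted, example_resid), 3))
for q, r in normal_plot_points(example_resid):
    print(f"{q:7.3f} {r:6.2f}")
```

A lag-1 correlation well away from zero suggests serial correlation, and normal plot points that stray far from a straight line suggest non-normality. With only a handful of cases these numbers are very noisy, so they complement the plots rather than replace them.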

For a model with many terms, it can be difficult to identify specific problems using the residual plot. A non-null residual plot indicates that there are problems with the model, but not necessarily what these are.

**Available in Analyse-it Editions**

Standard edition

Method Validation edition

Quality Control & Improvement edition

Ultimate edition

Version 5.40

Published 29-Jul-2019