# Summary of fit

R² and similar statistics measure how much variability is explained by the model.

R² is the proportion of variability in the response explained by the model. A value of 1 indicates the model fits the data perfectly, although R² can attain this value only when no two observations share the same set of predictor values. A value of 0 indicates the model fits no better than the mean of the response. Do not use R² when the model does not include a constant term, because its interpretation is then undefined.
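As a minimal sketch of the definition above, R² can be computed as one minus the ratio of the residual sum of squares to the total sum of squares about the mean. The function name `r_squared` and the sample data are illustrative assumptions, not part of Analyse-it.

```python
def r_squared(y, y_hat):
    """Proportion of variability in y explained by the fitted values y_hat."""
    mean_y = sum(y) / len(y)
    # Total variability of the response about its mean.
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    # Variability left unexplained by the model.
    ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return 1.0 - ss_resid / ss_total


y = [1.0, 2.0, 3.0, 4.0]  # made-up responses

perfect = r_squared(y, y)             # fitted values equal the observations
mean_only = r_squared(y, [2.5] * 4)   # fitted values equal the mean of y

print(perfect)    # 1.0
print(mean_only)  # 0.0
```

The two calls reproduce the end points described above: a perfect fit gives 1, and a model that predicts only the mean of the response gives 0.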

For models with more than a single term, R² can be deceptive because it never decreases as you add parameters to the model, and it eventually saturates at 1 when the number of parameters equals the number of observations. Adjusted R² is a modification of R² that adjusts for the number of parameters in the model. It increases only when the terms added to the model improve the fit more than would be expected by chance. It is preferred when building and comparing models with different numbers of parameters.
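The page does not state a formula, but the commonly used definition of adjusted R² may help make the adjustment concrete. Here n (the number of observations) and p (the number of parameters, including the constant term) are notation assumed for this sketch:

$$
R^2_{\text{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p}
$$

Because the factor (n − 1)/(n − p) grows as p increases, an added term raises adjusted R² only if it increases R² by enough to offset that penalty.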

For example, if you fit a straight-line model and then add a quadratic term, the value of R² increases. If you continue to add polynomial terms until there are as many parameters as observations, the R² value reaches 1. The adjusted R² statistic takes the number of parameters in the model into account, so it increases only when a new term genuinely improves the fit, rather than simply because the number of parameters is approaching saturation.
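The polynomial example can be sketched in plain Python. This is an illustration under stated assumptions, not Analyse-it's implementation: the data are made up, the helper names `fit_polynomial` and `r2_and_adjusted` are hypothetical, the fit solves the normal equations with exact `Fraction` arithmetic to avoid rounding error, and adjusted R² uses the common definition 1 − (1 − R²)(n − 1)/(n − p), with n observations and p parameters.

```python
from fractions import Fraction


def fit_polynomial(x, y, degree):
    """Least-squares polynomial coefficients (lowest power first), exact arithmetic."""
    p = degree + 1
    # Normal equations (X'X) b = X'y for the polynomial design matrix X.
    a = [[sum(Fraction(xi) ** (i + j) for xi in x) for j in range(p)]
         for i in range(p)]
    b = [sum(Fraction(xi) ** i * Fraction(yi) for xi, yi in zip(x, y))
         for i in range(p)]
    # Gauss-Jordan elimination; pivots are nonzero when x has at least p distinct values.
    for col in range(p):
        pivot = next(r for r in range(col, p) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(p):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [arc - f * acc for arc, acc in zip(a[r], a[col])]
                b[r] -= f * b[col]
    return [b[i] / a[i][i] for i in range(p)]


def r2_and_adjusted(x, y, degree):
    """Return (R², adjusted R²); adjusted R² is None when n equals p."""
    coef = fit_polynomial(x, y, degree)
    n, p = len(y), degree + 1
    fitted = [sum(c * Fraction(xi) ** k for k, c in enumerate(coef)) for xi in x]
    mean_y = sum(Fraction(yi) for yi in y) / n
    ss_total = sum((Fraction(yi) - mean_y) ** 2 for yi in y)
    ss_resid = sum((Fraction(yi) - fi) ** 2 for yi, fi in zip(y, fitted))
    r2 = 1 - ss_resid / ss_total
    # No residual degrees of freedom remain at saturation, so leave it undefined.
    adjusted = 1 - (1 - r2) * (n - 1) / (n - p) if n > p else None
    return r2, adjusted


x = [0, 1, 2, 3, 4, 5]   # made-up predictor values
y = [1, 3, 2, 5, 4, 6]   # made-up responses

results = [r2_and_adjusted(x, y, d) for d in range(1, 6)]
for degree, (r2, adjusted) in enumerate(results, start=1):
    print(degree, float(r2), None if adjusted is None else float(adjusted))
```

With six observations, the degree-5 polynomial has six parameters, so R² is exactly 1 at that point, while R² for the lower degrees never decreases as terms are added. Adjusted R² need not follow the same pattern.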

When some observations share the same set of predictor values, it may be impossible for the R² statistic to reach 1. A statistic called the maximum attainable R² indicates the largest value that R² could achieve even if the model fitted perfectly. It is related to the pure error discussed in the lack of fit test.
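One common way to compute the maximum attainable R² is as one minus the ratio of the pure-error sum of squares to the total sum of squares, where pure error is the variation of replicate responses about their own group mean. The sketch below assumes that definition; the function name and data are illustrative only.

```python
from collections import defaultdict


def max_attainable_r_squared(x, y):
    """1 - SS(pure error) / SS(total), grouping observations by predictor values."""
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)  # xi may be a tuple when there are several predictors

    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)

    # Pure error: spread of replicates about their own group mean,
    # which no model based on these predictors can explain.
    ss_pure_error = 0.0
    for responses in groups.values():
        group_mean = sum(responses) / len(responses)
        ss_pure_error += sum((yi - group_mean) ** 2 for yi in responses)

    return 1.0 - ss_pure_error / ss_total


x = [1, 1, 2, 2, 3, 3]               # each predictor value is replicated twice
y = [1.0, 2.0, 2.0, 3.0, 3.0, 4.0]   # made-up responses

max_r2 = max_attainable_r_squared(x, y)
print(round(max_r2, 4))  # 0.7273
```

When every set of predictor values is unique, each group holds a single observation, the pure error is zero, and the function returns 1.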

The root mean square error (RMSE) of the fit is an estimate of the standard deviation of the true, unknown random error. It is the square root of the residual mean square. If the fitted model is not the correct model, the estimate is larger than the true random error, because it includes the error due to lack of fit as well as the random error.
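As a small illustration, the RMSE is the square root of the residual sum of squares divided by the residual degrees of freedom (observations minus parameters). The function name and data here are assumptions for the sketch; the fitted values come from the least-squares line y = 1.3 + 0.8x for the made-up data shown.

```python
import math


def rmse(y, y_hat, n_params):
    """Square root of the residual mean square."""
    ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    df_resid = len(y) - n_params  # residual degrees of freedom
    return math.sqrt(ss_resid / df_resid)


x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 2.0, 4.0]                # made-up responses
y_hat = [1.3 + 0.8 * xi for xi in x]    # least-squares straight line for this data

fit_rmse = rmse(y, y_hat, n_params=2)   # intercept and slope
print(round(fit_rmse, 4))  # 0.9487
```

Dividing by the residual degrees of freedom rather than the number of observations is what makes this the square root of the residual mean square described above.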

**Available in Analyse-it Editions**

Standard edition

Method Validation edition

Quality Control & Improvement edition

Ultimate edition

Version 6.15

Published 18-Apr-2023