Parameter estimates (also called coefficients) are the change in the response associated with a one-unit change of the predictor, all other predictors being held constant.
The unknown model parameters are estimated using least-squares estimation.
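As a rough illustration of least-squares estimation, the sketch below fits a model with a single predictor using the closed-form solution. The function name `fit_simple_ols` and the height and weight values are made up for this example; they are not part of Analyse-it.

```python
def fit_simple_ols(x, y):
    """Return (intercept, slope) that minimise the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squares of x, and sum of cross-products of x and y, about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up illustrative data: height in cm, weight in kg.
heights = [150, 160, 170, 180, 190]
weights = [53.5, 58.0, 69.5, 73.0, 83.5]

intercept, slope = fit_simple_ols(heights, weights)
print(f"Weight = {intercept:.2f} + {slope:.2f} * Height")
```

For this data the estimated slope is 0.75 kg per cm. With more than one predictor the same principle applies, but the estimates are found by solving a system of equations rather than by this simple formula.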
A coefficient describes the size of the contribution of that predictor; a near-zero coefficient indicates that the variable has little influence on the response. The sign of the coefficient indicates the direction of the relationship, although the sign can change if more terms are added to the model, so the interpretation is not particularly useful.
A confidence interval expresses the uncertainty in the estimate, under the assumption of normally distributed errors. Due to the central limit theorem, violation of the normality assumption is not a problem if the sample size is moderate or large.
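A minimal sketch of how such a confidence interval can be computed for the slope of a single-predictor model follows. The data are made up, the function name `slope_confidence_interval` is hypothetical, and the critical value 3.182 is the two-sided 95% point of the t distribution with 3 degrees of freedom, taken from standard tables rather than computed.

```python
import math


def slope_confidence_interval(x, y, t_crit):
    """Return (lower, upper) confidence limits for the least-squares slope."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance uses n - 2 degrees of freedom (two estimated parameters).
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    residual_variance = sse / (n - 2)
    std_error = math.sqrt(residual_variance / sxx)
    margin = t_crit * std_error
    return slope - margin, slope + margin


# Made-up illustrative data: height in cm, weight in kg.
heights = [150, 160, 170, 180, 190]
weights = [53.5, 58.0, 69.5, 73.0, 83.5]

T_CRIT_95_DF3 = 3.182  # assumption: tabulated value for 95% confidence, 3 df
lower, upper = slope_confidence_interval(heights, weights, T_CRIT_95_DF3)
print(f"95% CI for slope: {lower:.3f} to {upper:.3f}")
```

A wide interval relative to the size of the estimate signals that the data do not pin the coefficient down precisely.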
For example, a coefficient for Height of 0.75, in a simple model for the response Weight (kg) with predictor Height (cm), could be expressed as 0.75 kg per cm, which indicates a 0.75 kg increase in weight per 1 cm increase in height.
When a predictor is a logarithm transformation of the original variable, the coefficient is the rate of change in the response per 1 unit change in the log of the predictor. Base 2 and base 10 logs are commonly used as transforms. For base 2 log, the coefficient can be interpreted as the change in the response for a doubling of the predictor value. For base 10 log, the coefficient can be interpreted as the change in the response when the predictor is multiplied by 10. When the response is also log-transformed, the coefficient can be interpreted as the approximate % change in the response per 1% change in the predictor.
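The arithmetic behind these interpretations can be sketched as below, assuming a model term of the form coefficient × log(predictor) with an untransformed response. The helper `response_change` and the coefficient values are hypothetical, chosen only for illustration.

```python
import math


def response_change(coef, base, factor):
    """Change in the response when the predictor is multiplied by `factor`,
    for a model term coef * log_base(predictor)."""
    return coef * math.log(factor, base)


# Hypothetical coefficient of 2.0 on a base 2 log predictor:
# doubling the predictor changes the response by the coefficient itself.
change_on_doubling = response_change(2.0, base=2, factor=2)

# Hypothetical coefficient of 5.0 on a base 10 log predictor:
# a tenfold increase changes the response by the coefficient itself.
change_on_tenfold = response_change(5.0, base=10, factor=10)

# A 1% increase in the predictor changes the response by a much smaller amount.
change_on_one_percent = response_change(5.0, base=10, factor=1.01)

print(change_on_doubling, change_on_tenfold, round(change_on_one_percent, 4))
```

The change depends only on the multiplicative factor applied to the predictor, not on its starting value, which is what makes log-transformed predictors convenient to interpret.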
Analyse-it uses effect coding for nominal terms (also known as mean deviation coding). The sum of the parameter estimates for a categorical term using effect coding is equal to 0.
Analyse-it uses reference coding for ordinal terms. The first level is used as the baseline or reference level.
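To make the difference between the two coding schemes concrete, the sketch below computes the estimates each would give for a model containing a single categorical term, where the fitted values are simply the level means. The function names and data are made up, and the same data are used for both schemes purely for comparison.

```python
def level_means(groups):
    """Mean of the response within each level, in the order the levels are given."""
    return {level: sum(values) / len(values) for level, values in groups.items()}


def effect_coding_estimates(groups):
    """Intercept is the mean of the level means; each estimate is a deviation from it."""
    means = level_means(groups)
    intercept = sum(means.values()) / len(means)
    effects = {level: mean - intercept for level, mean in means.items()}
    return intercept, effects


def reference_coding_estimates(groups):
    """Intercept is the first level's mean; each estimate is a difference from it."""
    means = level_means(groups)
    levels = list(means)
    baseline = means[levels[0]]
    differences = {level: means[level] - baseline for level in levels[1:]}
    return baseline, differences


# Made-up illustrative data: response values observed at three levels.
groups = {"Low": [10, 12], "Medium": [14, 16], "High": [19, 21]}

effect_intercept, effects = effect_coding_estimates(groups)
ref_intercept, differences = reference_coding_estimates(groups)

print(effect_intercept, effects)   # estimates sum to zero across the three levels
print(ref_intercept, differences)  # estimates are relative to "Low"
```

Both schemes describe the same fitted model; only the meaning of the intercept and of each estimate changes.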
A standardized parameter estimate (commonly known as a standardized beta coefficient) removes the units of measurement of the predictor and response variables. It represents the change in the response, in standard deviations, for a 1 standard deviation change in the predictor. You can use standardized estimates to compare the relative effects of predictors measured on different scales.
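One common way to obtain a standardized estimate is to rescale the raw coefficient by the ratio of the standard deviations of the predictor and the response. The sketch below does this for a single-predictor model with made-up data; the function name `standardized_slope` is hypothetical.

```python
import statistics


def standardized_slope(x, y):
    """Least-squares slope rescaled to standard-deviation units of x and y."""
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    raw_slope = sxy / sxx
    # Multiply by sd(x) / sd(y) to remove the units of measurement.
    return raw_slope * statistics.stdev(x) / statistics.stdev(y)


# Made-up illustrative data: height in cm, weight in kg.
heights = [150, 160, 170, 180, 190]
weights = [53.5, 58.0, 69.5, 73.0, 83.5]

beta_std = standardized_slope(heights, weights)
print(f"Standardized estimate: {beta_std:.3f}")
```

With a single predictor the standardized estimate equals the correlation between predictor and response; with several predictors it does not, but the rescaling is the same.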
VIF, the variance inflation factor, represents the increase in the variance of the parameter estimate due to correlation (collinearity) between predictors. Collinearity between the predictors can lead to unstable parameter estimates. As a rule of thumb, VIF should be close to the minimum value of 1, indicating no collinearity. When VIF is greater than 5, there is high collinearity between predictors.
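The VIF for a predictor is 1 / (1 − R²), where R² comes from regressing that predictor on all the other predictors. The sketch below covers only the special case of two predictors, where R² is the squared correlation between them. The function names and data are made up for illustration.

```python
import math


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    return sab / math.sqrt(saa * sbb)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Made-up illustrative predictors with a correlation of 0.8.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]

vif = vif_two_predictors(x1, x2)
print(f"VIF: {vif:.2f}")
```

Here the VIF is about 2.78: the variance of each estimate is roughly 2.8 times what it would be if the two predictors were uncorrelated.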
A t-test formally tests the null hypothesis that the parameter is equal to 0, against the alternative hypothesis that it is not equal to 0. When the p-value is small, you can reject the null hypothesis and conclude that the parameter is not equal to 0 and that the term contributes to the model.
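The test statistic is the estimate divided by its standard error. As a minimal sketch, the example below compares that statistic with a tabulated critical value, which is equivalent to checking whether the p-value falls below 0.05. The estimate, standard error, and degrees of freedom are hypothetical values chosen for illustration.

```python
def t_statistic(estimate, std_error):
    """t statistic for the null hypothesis that the parameter equals 0."""
    return estimate / std_error


# Hypothetical values: a slope of 0.75 with standard error 0.0683,
# from a model with 3 residual degrees of freedom.
estimate = 0.75
std_error = 0.0683
t_crit = 3.182  # assumption: tabulated two-sided 5% critical value, 3 df

t_value = t_statistic(estimate, std_error)
reject_null = abs(t_value) > t_crit

print(f"t = {t_value:.2f}, reject null hypothesis: {reject_null}")
```

Statistical software reports the exact p-value from the t distribution rather than a comparison with a single critical value, but the decision at a given significance level is the same.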
When a parameter is not deemed to contribute statistically to the model, you can consider removing it. However, you should be cautious of removing terms that are known to contribute by some underlying mechanism, regardless of the statistical significance of a hypothesis test, and recognize that removing a term can alter the effect of other terms.