Section 27 MLR: Model Diagnostics
Multiple Linear Regression: Model Diagnostics
Estimates of Residuals
\[ \large \hat\epsilon_{i} = y_i - \hat y_{i} \]
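As a minimal sketch of this definition, the residuals can be computed by hand and compared with what R returns. The BP data and the model fm are not reproduced here, so this example assumes an illustrative model, fm_demo, fitted to the built-in mtcars data; the choice of variables is arbitrary.

```r
# Illustrative model on the built-in mtcars data (an assumption, not the BP model)
fm_demo <- lm(mpg ~ wt + hp, data = mtcars)

# Residuals from the definition: observed response minus fitted value
res_manual <- mtcars$mpg - fitted(fm_demo)

# Residuals as returned by the extractor function
res_builtin <- residuals(fm_demo)

# Both approaches should agree up to numerical tolerance
same_residuals <- all.equal(unname(res_manual), unname(res_builtin))
print(same_residuals)
```

Using residuals() and fitted() rather than reaching into the model object with $ keeps the code working across model classes that provide these extractors.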
27.1 Plots
The errors are assumed to be normally and independently distributed with mean zero and constant variance: \(\large \epsilon_{i} \sim NID(0, \sigma^2)\)
Several diagnostic plots are available to check these assumptions:
- Histogram of residuals: should look approximately Normal and be centred at zero
- QQ plot of residuals: points should lie close to the reference line if the residuals are Normal
- Residuals versus fitted values
- Residuals versus explanatory variables
- Residuals versus any other variables not included in the model
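The last item in the list is not covered by the customised plots in this section, so here is a hedged sketch of it. It assumes an illustrative model, fm_demo, on the built-in mtcars data, with qsec standing in for a variable that was left out of the model; none of these names come from the BP example.

```r
# Illustrative model (an assumption): qsec is deliberately left out
fm_demo <- lm(mpg ~ wt + hp, data = mtcars)
res_demo <- residuals(fm_demo)

# Residuals against a variable not included in the model
plot(x = mtcars$qsec, y = res_demo, pch = 20, col = 'purple',
     xlab = 'qsec (not in the model)', ylab = 'Residuals')
abline(h = 0, lty = 2, col = 'red')

# Least squares residuals are uncorrelated with the included predictors
# by construction, so this correlation is zero up to rounding
r_included <- cor(mtcars$wt, res_demo)

# For an omitted variable the correlation is not forced to be zero
r_omitted <- cor(mtcars$qsec, res_demo)
```

A visible trend in such a plot would suggest that the omitted variable carries information about the response that the current model does not capture.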
27.3 Customised residual plots
27.3.2 Histogram of residuals
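# Fitted values for the data used to fit fm (predict() takes newdata, not data)
pred <- fitted(fm)
# Residuals: observed minus fitted values
res <- residuals(fm)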
hist(x=res, freq = TRUE,
main = '',
xlab = 'Residuals',
ylab = 'Frequency',
axes = TRUE,
col = 'lightblue',
lty = 1, border = 'purple')
hist(x=res, breaks=10, freq = FALSE,
xlab = 'Residuals',
ylab = 'Density',
axes = TRUE,
col = 'lightblue',
lty = 1, border = 'purple')
lines(density(res), col = 'red', lwd = 2, lty = 1)
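To make the comparison with the assumed Normal distribution explicit, the theoretical density can be drawn on the same axes. This is a sketch only: it assumes an illustrative model, fm_demo, on the built-in mtcars data, and uses the residual standard error from sigma() as the estimate of \(\sigma\).

```r
# Illustrative model (an assumption, not the BP model)
fm_demo <- lm(mpg ~ wt + hp, data = mtcars)
res_demo <- residuals(fm_demo)

# Residual standard error, used here as the estimate of sigma
sigma_hat <- sigma(fm_demo)

# Density-scaled histogram of the residuals
hist(x = res_demo, breaks = 10, freq = FALSE, main = '',
     xlab = 'Residuals', ylab = 'Density',
     col = 'lightblue', border = 'purple')

# Normal density with mean 0 and standard deviation sigma_hat
curve(dnorm(x, mean = 0, sd = sigma_hat), add = TRUE,
      col = 'red', lwd = 2)
```

With small samples the histogram shape depends heavily on the number of breaks, so this overlay is best read together with the QQ plot.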
27.3.3 QQ plot of residuals
qqnorm(y=res, main='Normal QQ of residuals',
xlab='Theoretical Quantiles',
ylab='Residuals',
col='blue', pch=20)
qqline(y=res, lty=1, lwd=2, col='red')
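27.3.4 Scatter plot of residuals against fitted values
plot(x=pred, y=res, pch=20, col='purple',
xlab='Fitted values', ylab='Residuals')
# Optional loess smoother, drawn in order of the fitted values
# lo <- loess(res ~ pred)
# ord <- order(pred)
# lines(x=pred[ord], y=predict(lo)[ord], col='red', lty=1, lwd=2)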
abline(a = 0, b = 0, lty=2, col='red')
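A smoother can make any trend in this plot easier to see. The sketch below assumes an illustrative model, fm_demo, on the built-in mtcars data. The smoothed values are sorted by fitted value before drawing, because lines() joins points in the order they are given.

```r
# Illustrative model (an assumption, not the BP model)
fm_demo <- lm(mpg ~ wt + hp, data = mtcars)
pred_demo <- fitted(fm_demo)
res_demo <- residuals(fm_demo)

# Residuals against fitted values with a zero reference line
plot(x = pred_demo, y = res_demo, pch = 20, col = 'purple',
     xlab = 'Fitted values', ylab = 'Residuals')
abline(h = 0, lty = 2, col = 'red')

# Local regression of the residuals on the fitted values
lo <- loess(res_demo ~ pred_demo)

# Draw the smooth curve from left to right
ord <- order(pred_demo)
lines(x = pred_demo[ord], y = predict(lo)[ord],
      col = 'red', lty = 1, lwd = 2)
```

If the assumptions hold, the smooth curve should stay close to the horizontal zero line with no systematic bend.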
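27.3.5 Scatter plot of residuals against BMI
plot(x=BP$BMI, y=res, pch=20, col='purple',
xlab='BMI', ylab='Residuals')
# Optional loess smoother, drawn in order of BMI
# lo <- loess(res ~ BP$BMI)
# ord <- order(BP$BMI)
# lines(x=BP$BMI[ord], y=predict(lo)[ord], col='red', lty=1, lwd=2)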
abline(a = 0, b = 0, lty=2, col='red')