Section 27 MLR: Model Diagnostics

Multiple Linear Regression: Model Diagnostics


Estimates of Residuals

\[ \large \hat\epsilon_{i} = y_i - \hat y_{i} \]
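As a minimal sketch of the definition above, the residual estimates can be computed by hand and compared with what R stores in the fitted model. The built-in `mtcars` data and the name `fm_demo` are stand-ins assumed here for illustration only; they are not the `BP` data or the `fm` model used later in this section.

```r
# Stand-in model (assumption): mtcars ships with R and replaces the BP data here.
fm_demo <- lm(mpg ~ wt + hp, data = mtcars)

y     <- mtcars$mpg        # observed responses y_i
y_hat <- fitted(fm_demo)   # fitted values \hat y_i
e_hat <- y - y_hat         # residual estimates, computed by hand

# The hand-computed residuals agree with those stored in the lm object.
all.equal(unname(e_hat), unname(residuals(fm_demo)))

# With an intercept in the model, least-squares residuals sum to (numerically) zero.
sum(e_hat)
```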


27.1 Plots

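  • The model assumes that the errors are normally and independently distributed with mean zero and constant variance: \(\large \epsilon_{i} \sim NID(0, \sigma^2)\)

  • Several diagnostic plots of the residuals are available to check these assumptions:

    - Histogram of residuals: should look approximately Normal and be centred at zero
    
    - QQ plot of residuals: points should lie close to a straight line if the residuals are Normal
    
    - Residuals versus fitted values: should scatter randomly around zero with roughly constant spread
    
    - Residuals versus explanatory variables: should show no systematic pattern
    
    - Residuals versus any other variables not included in the model: a clear pattern suggests the variable may belong in the model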
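The last item in the list has no example later in this section, so here is a rough sketch of it. It assumes the built-in `mtcars` data as a stand-in and deliberately leaves `hp` out of a hypothetical model `fm_demo`, then plots the residuals against the omitted variable.

```r
# Stand-in model (assumption): hp is deliberately left out of the model.
fm_demo  <- lm(mpg ~ wt, data = mtcars)
res_demo <- residuals(fm_demo)

# Residuals against a variable that is NOT in the model.
plot(x = mtcars$hp, y = res_demo, pch = 20, col = 'purple',
     xlab = 'hp (not in the model)', ylab = 'Residuals')
abline(h = 0, lty = 2, col = 'red')

# A rough numeric companion to the plot: correlation between the
# omitted variable and the residuals.
r_omitted <- cor(mtcars$hp, res_demo)
r_omitted
```

A visible trend in such a plot is a hint, not proof, that the omitted variable explains part of the remaining variation.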

27.2 Standard residual plots using lm object

plot(fm)
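Calling `plot()` on an `lm` object draws several diagnostic panels one after another. The sketch below, which assumes a stand-in model `fm_demo` fitted to the built-in `mtcars` data, shows one way to arrange them on a single page and how to request an individual panel through the `which` argument.

```r
# Stand-in model (assumption): not the fm object used in this section.
fm_demo <- lm(mpg ~ wt + hp, data = mtcars)

# Show the default panels together in a 2 x 2 grid.
op <- par(mfrow = c(2, 2))
plot(fm_demo)
par(op)   # restore the previous graphics settings

# Draw single panels instead: 1 = residuals vs fitted, 2 = Normal QQ.
plot(fm_demo, which = 1)
plot(fm_demo, which = 2)
```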

27.3 Customised residual plots

27.3.1 Calculate predicted values and residuals

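# predict() takes `newdata`, not `data`; with no new data it returns the fitted values
pred <- predict(object = fm)

# Residuals: observed minus fitted values
res <- residuals(fm)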


27.3.2 Histogram of residuals


hist(x=res, freq = TRUE, 
     main = '', 
     xlab = 'Residuals', 
     ylab = 'Frequency', 
     axes = TRUE, 
     col = 'lightblue', 
     lty = 1, border = 'purple')

hist(x=res, breaks=10, freq = FALSE, 
     xlab = 'Residuals', 
     ylab = 'Density', 
     axes = TRUE, 
     col = 'lightblue', 
     lty = 1, border = 'purple')
lines(density(res), col = 'red', lwd = 2, lty = 1)
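To judge Normality from the density histogram, it can help to overlay the Normal curve implied by the model next to the kernel density estimate. The following is a sketch under assumptions: `fm_demo` and `mtcars` stand in for the section's `fm` and `BP`, and the reference curve uses mean zero and the residual standard error returned by `sigma()`.

```r
# Stand-in model (assumption): not the fm object used in this section.
fm_demo  <- lm(mpg ~ wt + hp, data = mtcars)
res_demo <- residuals(fm_demo)
s_hat    <- sigma(fm_demo)   # residual standard error, the estimate of sigma

hist(x = res_demo, freq = FALSE, main = '',
     xlab = 'Residuals', ylab = 'Density',
     col = 'lightblue', border = 'purple')

# Kernel density estimate of the residuals (solid red).
lines(density(res_demo), col = 'red', lwd = 2)

# Reference N(0, s_hat^2) density (dashed green) for comparison.
curve(dnorm(x, mean = 0, sd = s_hat), add = TRUE,
      col = 'darkgreen', lwd = 2, lty = 2)
```

If the two curves differ markedly, for example in skewness or tail weight, the Normality assumption deserves a closer look.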

27.3.3 QQ plot of residuals

qqnorm(y=res, main='Normal QQ of residuals',
     xlab='Theoretical Quantiles',
     ylab='Residuals',
     col='blue', pch=20)
qqline(y=res, lty=1, lwd=2, col='red')
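For readers who want to see what the QQ plot actually draws, the sketch below rebuilds its coordinates by hand: the sorted residuals are paired with Normal quantiles at the probability points given by `ppoints()`. As before, `fm_demo` and `mtcars` are stand-ins assumed for illustration.

```r
# Stand-in model (assumption): not the fm object used in this section.
fm_demo  <- lm(mpg ~ wt + hp, data = mtcars)
res_demo <- residuals(fm_demo)
n        <- length(res_demo)

# Coordinates computed by qqnorm(), without drawing anything.
qq <- qqnorm(res_demo, plot.it = FALSE)

# The same theoretical quantiles, built by hand.
theo <- qnorm(ppoints(n))
all.equal(sort(unname(qq$x)), theo)

# Manual QQ plot: sorted residuals against theoretical quantiles.
plot(x = theo, y = sort(res_demo), pch = 20, col = 'blue',
     xlab = 'Theoretical Quantiles', ylab = 'Residuals')
qqline(y = res_demo, lty = 1, lwd = 2, col = 'red')
```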


27.3.4 Scatter plot of residual against fitted value

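plot(x=pred, y=res, pch=20, col='purple',
     xlab='Fitted values', ylab='Residuals')
# Optional loess smoother; sort by pred so the line is drawn left to right
# lo <- loess(res ~ pred); ord <- order(pred)
# lines(pred[ord], predict(lo)[ord], col='red', lty=1, lwd=2)
abline(a = 0, b = 0, lty=2, col='red')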
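The commented-out smoother lines can be turned into a working example along the following lines. This is a sketch that assumes a stand-in model on `mtcars`; the key detail is that the smoothed values are plotted against the fitted values in sorted order, otherwise `lines()` would draw them against the observation index or zig-zag across the plot.

```r
# Stand-in model (assumption): not the fm object used in this section.
fm_demo   <- lm(mpg ~ wt + hp, data = mtcars)
pred_demo <- fitted(fm_demo)
res_demo  <- residuals(fm_demo)

# Local regression of the residuals on the fitted values.
lo  <- loess(res_demo ~ pred_demo)
ord <- order(pred_demo)   # left-to-right drawing order

plot(x = pred_demo, y = res_demo, pch = 20, col = 'purple',
     xlab = 'Fitted values', ylab = 'Residuals')
lines(x = pred_demo[ord], y = predict(lo)[ord], col = 'red', lwd = 2)
abline(h = 0, lty = 2, col = 'red')
```

A smoother that stays close to the dashed zero line is consistent with a correctly specified mean; a pronounced curve suggests a missing non-linear term.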


27.3.5 Scatter plot of residual against BMI

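plot(x=BP$BMI, y=res, pch=20, col='purple',
     xlab='BMI', ylab='Residuals')
# Optional loess smoother of the residuals on BMI (not on pred)
# lo <- loess(res ~ BP$BMI); ord <- order(BP$BMI)
# lines(BP$BMI[ord], predict(lo)[ord], col='red', lty=1, lwd=2)
abline(a = 0, b = 0, lty=2, col='red')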


27.3.6 Scatter plot of residual against Age

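plot(x=BP$Age, y=res, pch=20, col='purple',
     xlab='Age', ylab='Residuals')
# Optional loess smoother of the residuals on Age (not on pred)
# lo <- loess(res ~ BP$Age); ord <- order(BP$Age)
# lines(BP$Age[ord], predict(lo)[ord], col='red', lty=1, lwd=2)
abline(a = 0, b = 0, lty=2, col='red')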


27.3.7 Scatter plot of residual against Income

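plot(x=BP$Income, y=res, pch=20, col='purple',
     xlab='Income', ylab='Residuals')
# Optional loess smoother of the residuals on Income (not on pred)
# lo <- loess(res ~ BP$Income); ord <- order(BP$Income)
# lines(BP$Income[ord], predict(lo)[ord], col='red', lty=1, lwd=2)
abline(a = 0, b = 0, lty=2, col='red')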
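The three residual-versus-variable plots differ only in the variable on the x-axis, so they could be produced by one small helper. The function `plot_res_vs()` below is hypothetical (it is not part of R or of this section's code), and the example uses `mtcars` as a stand-in for `BP`. It assumes no rows were dropped when fitting, so the residuals line up with the rows of the data.

```r
# Hypothetical helper: one residual plot per variable named in `vars`.
plot_res_vs <- function(model, data, vars) {
  res <- residuals(model)
  for (v in vars) {
    plot(x = data[[v]], y = res, pch = 20, col = 'purple',
         xlab = v, ylab = 'Residuals')
    abline(h = 0, lty = 2, col = 'red')
  }
  invisible(vars)   # return the variable names without printing them
}

# Stand-in model (assumption): not the fm object used in this section.
fm_demo <- lm(mpg ~ wt + hp + qsec, data = mtcars)
plotted <- plot_res_vs(fm_demo, mtcars, c('wt', 'hp', 'qsec'))
```

With the section's own objects, the equivalent call would presumably be `plot_res_vs(fm, BP, c('BMI', 'Age', 'Income'))`.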