Section 30 Multiple Linear Regression: Pairs plot


30.1 Statistical Model

\[ \large y_{i} = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_p x_{pi} + \epsilon_{i} \]
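
As a minimal sketch of how a model of this form can be fitted in R, the following simulates data with p = 2 predictors and estimates the coefficients with lm(). The variable names, sample size, and coefficient values are assumptions chosen for illustration; they are not taken from the blood pressure data.

```r
# Simulated illustration only: names and values are assumptions, not the BP data
set.seed(1)
n   <- 200
x1  <- rnorm(n)
x2  <- rnorm(n)
eps <- rnorm(n, sd = 0.5)             # error term epsilon_i
y   <- 1 + 2 * x1 - 3 * x2 + eps      # beta_0 = 1, beta_1 = 2, beta_2 = -3

fit <- lm(y ~ x1 + x2)                # least-squares estimates of the betas
coef(fit)
```

The estimates returned by coef(fit) should be close to the values used in the simulation, with the difference shrinking as n grows.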


30.2 Blood pressure data: Correlation between variables

          SBP    Age Income    BMI
SBP    1.0000 0.6500 0.0252 0.8943
Age    0.6500 1.0000 0.0395 0.6085
Income 0.0252 0.0395 1.0000 0.0244
BMI    0.8943 0.6085 0.0244 1.0000
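
A correlation matrix of this kind can be computed with cor(). The sketch below assumes a data frame whose columns are named as in the table; BP_sim is a hypothetical, simulated stand-in for the blood pressure data, so its values will not match the numbers above.

```r
# BP_sim is a hypothetical stand-in; the real BP data are not reproduced here
set.seed(2)
n      <- 100
Age    <- round(runif(n, 20, 70))
BMI    <- 18 + 0.1 * Age + rnorm(n, sd = 2)
Income <- rnorm(n, mean = 50, sd = 10)          # generated independently of the rest
SBP    <- 60 + 0.3 * Age + 2.5 * BMI + rnorm(n, sd = 5)
BP_sim <- data.frame(SBP, Age, Income, BMI)

# Pairwise Pearson correlations, rounded to four decimals for display
R <- cor(BP_sim[, c("SBP", "Age", "Income", "BMI")])
round(R, 4)
```

The matrix is symmetric with ones on the diagonal, so only the upper or lower triangle needs to be read.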

30.3 Blood pressure data: Pairs plot

# Histograms of each variable are on the diagonal.
# Scatterplots with a smoothed (lowess) line are on the lower panels.
# Correlation estimates are on the upper panels,
# with text size proportional to the magnitude of the correlation.

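panel.hist <- function(x, ...)
{
    # Save the panel's user coordinates and restore them on exit
    usr <- par("usr"); on.exit(par(usr = usr))
    # Keep the x range; rescale y so bars of height <= 1 leave headroom
    par(usr = c(usr[1:2], 0, 1.5))
    h <- hist(x, plot = FALSE)
    breaks <- h$breaks; nB <- length(breaks)
    y <- h$counts; y <- y/max(y)    # scale counts so the tallest bar is 1
    rect(breaks[-nB], 0, breaks[-1], y, col = "cyan", ...)
}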

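panel.cor <- function(x, y, digits = 2, prefix = "", cex.cor, ...)
{
    # Save the panel's user coordinates and restore them on exit
    usr <- par("usr"); on.exit(par(usr = usr))
    par(usr = c(0, 1, 0, 1))        # unit square, so (0.5, 0.5) is the center
    r <- cor(x, y)
    txt <- format(c(r, 0.123456789), digits = digits)[1]
    txt <- paste0(prefix, txt)
    # Default size: make the label span about 80% of the panel width
    if(missing(cex.cor)) cex.cor <- 0.8/strwidth(txt)
    # Shrink the label in proportion to |r|
    text(0.5, 0.5, txt, cex = cex.cor * abs(r), col = "blue")
}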

pairs(BP[,c('SBP','Age','Income','BMI')],
      lower.panel = panel.smooth, 
      diag.panel = panel.hist,
      upper.panel = panel.cor)