Section 30 Multiple Linear Regression: Pairs plot
30.1 Statistical Model
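\[ \large y_{i} = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_p x_{pi} + \epsilon_{i} \]
Here \(y_i\) is the response for observation \(i\), \(x_{1i}, \ldots, x_{pi}\) are its \(p\) predictor values, \(\beta_0, \ldots, \beta_p\) are the regression coefficients, and \(\epsilon_i\) is the random error.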
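As a rough illustration of how a model of this form can be fitted in R, the sketch below simulates data with two predictors and estimates the coefficients with `lm()`. The data frame `sim`, the variables `x1` and `x2`, and the chosen coefficient values are assumptions made up for this example; they are not the blood pressure data used in this section.

```r
# Simulated stand-in data; NOT the BP data set used in this section.
set.seed(1)
n   <- 200
x1  <- rnorm(n)
x2  <- rnorm(n)
eps <- rnorm(n, sd = 0.5)            # random error term

# True coefficients (hypothetical): beta0 = 1, beta1 = 2, beta2 = -3
y   <- 1 + 2 * x1 - 3 * x2 + eps
sim <- data.frame(y, x1, x2)

# Least-squares fit of y on both predictors
fit <- lm(y ~ x1 + x2, data = sim)
coef(fit)                            # estimates of beta0, beta1, beta2
```

The estimates should land close to the values used in the simulation, which is one simple way to check that the formula passed to `lm()` matches the intended model.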
30.2 Blood pressure data: Correlation between variables
Variable | SBP | Age | Income | BMI |
---|---|---|---|---|
SBP | 1.0000 | 0.6500 | 0.0252 | 0.8943 |
Age | 0.6500 | 1.0000 | 0.0395 | 0.6085 |
Income | 0.0252 | 0.0395 | 1.0000 | 0.0244 |
BMI | 0.8943 | 0.6085 | 0.0244 | 1.0000 |
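A correlation matrix like the one above can be produced with `cor()`. The sketch below uses the built-in `mtcars` data as a stand-in, since the `BP` data frame is not included here; the column choice `vars` is an assumption for illustration only.

```r
# Stand-in data: mtcars ships with R. It is NOT the BP data used here.
vars <- c("mpg", "wt", "hp", "disp")

# Pairwise Pearson correlations, rounded to 4 decimals as in the table
cm <- round(cor(mtcars[, vars]), 4)
cm

# With the section's data frame the call would presumably be:
# round(cor(BP[, c("SBP", "Age", "Income", "BMI")]), 4)
```

The result is symmetric with ones on the diagonal, so only the entries above (or below) the diagonal need to be read.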
30.3 Blood pressure data: Pairs plot
# Histograms are on the diagonal.
# Scatterplots with a smoothed trend line are on the lower panels.
# Correlation estimates are on the upper panels,
# with text size proportional to the magnitude of the correlation.
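panel.hist <- function(x, ...)
{
    # Save the user coordinates and restore them when the panel is done
    usr <- par("usr"); on.exit(par(usr = usr))
    # Keep the x-range; rescale y so bars of height <= 1 leave headroom
    par(usr = c(usr[1:2], 0, 1.5))
    h <- hist(x, plot = FALSE)
    breaks <- h$breaks; nB <- length(breaks)
    y <- h$counts; y <- y / max(y)   # scale bar heights to [0, 1]
    rect(breaks[-nB], 0, breaks[-1], y, col = "cyan", ...)
}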
panel.cor <- function(x, y, digits = 2, prefix = "", cex.cor, ...)
{
    usr <- par("usr"); on.exit(par(usr = usr))
    par(usr = c(0, 1, 0, 1))   # unit square, so (0.5, 0.5) is the center
    r <- cor(x, y)
    txt <- format(c(r, 0.123456789), digits = digits)[1]
    txt <- paste0(prefix, txt)
    # Default size: fit the label to about 80% of the panel width
    if (missing(cex.cor)) cex.cor <- 0.8 / strwidth(txt)
    # Shrink the label in proportion to |r|
    text(0.5, 0.5, txt, cex = cex.cor * abs(r), col = "blue")
}
pairs(BP[, c("SBP", "Age", "Income", "BMI")],
      lower.panel = panel.smooth,
      diag.panel  = panel.hist,
      upper.panel = panel.cor)