Section 38 MLR - Interaction of Continuous & Categorical variables: Model


38.1 Statistical Model

\[ \large y_{i} = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{1i}x_{2i} + \epsilon_{i} \]
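To see what the interaction term does, it may help to write the model out by group. The sketch below assumes that \(x_{1i}\) is BMI and that \(x_{2i}\) is a 0/1 indicator for the second DM level; that coding is an assumption, chosen to agree with the DM2 coefficient reported in 38.2.

\[ \large x_{2i} = 0: \quad y_{i} = \beta_0 + \beta_1 x_{1i} + \epsilon_{i} \]

\[ \large x_{2i} = 1: \quad y_{i} = (\beta_0 + \beta_2) + (\beta_1 + \beta_3) x_{1i} + \epsilon_{i} \]

Under this coding, \(\beta_2\) is the difference in intercepts (the group difference at \(x_{1i} = 0\)) and \(\beta_3\) is the difference in slopes between the two groups.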

\[ \large fm \leftarrow lm(SBP \sim BMI + DM + BMI:DM, \space data=BP) \]

\[ \large anova(fm) \]

Term         Df      Sum Sq     Mean Sq   F value    Pr(>F)
BMI           1  15103.0386  15103.0386  5883.969   <0.0001
DM            1   2168.0726   2168.0726   844.656   <0.0001
BMI:DM        1    340.9495    340.9495   132.830   <0.0001
Residuals   496   1273.1384      2.5668        NA        NA
Total       499  18885.1991          NA        NA        NA
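For an lm fit, anova() reports sequential sums of squares: each row is tested after the terms listed above it, so the BMI:DM row tests the interaction given both main effects. The sketch below illustrates this on simulated data (the data frame, seed and coefficients are hypothetical stand-ins, not the course data set) by comparing that row with an explicit nested-model F test. It also rebuilds the Total row, which anova() does not print, as the sum of the Sum Sq column.

```r
# Simulated stand-in for the BP data (hypothetical values, not data/BP.csv)
set.seed(38)
n   <- 500
BMI <- rnorm(n, mean = 25, sd = 3)
DM  <- factor(sample(1:2, n, replace = TRUE))
SBP <- 45 + 2.2 * BMI + ifelse(DM == "2", -15 + 0.8 * BMI, 0) + rnorm(n, sd = 1.6)
BP  <- data.frame(SBP, BMI, DM)

fm_add <- lm(SBP ~ BMI + DM, data = BP)            # parallel lines
fm_int <- lm(SBP ~ BMI + DM + BMI:DM, data = BP)   # separate slopes

tab_seq    <- anova(fm_int)           # sequential table, terms added in order
tab_nested <- anova(fm_add, fm_int)   # F test for the interaction alone

F_seq    <- tab_seq["BMI:DM", "F value"]
F_nested <- tab_nested[2, "F"]
print(c(sequential = F_seq, nested = F_nested))

# Total sum of squares: the column sum of Sum Sq (model terms plus residuals)
total_ss <- sum(tab_seq[["Sum Sq"]])
print(total_ss)
```

The two F statistics agree because the interaction is the last term entered. The rows for BMI and DM, by contrast, depend on the order in which the terms appear in the formula.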


38.2 Estimates: Effects

\[ \large summary(fm) \]

Term         Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)   45.0607      1.2267  36.7343   <0.0001
BMI            2.1929      0.0491  44.6911   <0.0001
DM2          -15.6132      1.7223  -9.0655   <0.0001
BMI:DM2        0.7917      0.0687  11.5252   <0.0001
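As a worked reading of these estimates, the fitted line for each DM level can be written out directly. This assumes R's default treatment coding, with DM = 1 as the reference level and DM2 as the indicator for the second level.

\[ \large DM = 1: \quad \widehat{SBP} = 45.0607 + 2.1929 \, BMI \]

\[ \large DM = 2: \quad \widehat{SBP} = (45.0607 - 15.6132) + (2.1929 + 0.7917) \, BMI = 29.4475 + 2.9846 \, BMI \]

The DM2 estimate of -15.6132 is the difference between the two lines at BMI = 0, a value far outside any realistic BMI range. The two lines cross near BMI = 15.6132 / 0.7917 ≈ 19.7, and above that value the DM = 2 line is the higher one.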


38.3 Estimates: Effects with centered variables

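BP <- read.csv('data/BP.csv')

# Center BMI at its sample mean; plain subtraction returns a numeric vector,
# whereas scale() would return a one-column matrix
BP$cBMI <- BP$BMI - mean(BP$BMI)

fm <- lm(SBP ~ cBMI + DM + cBMI:DM, data=BP)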

\[ \large summary(fm) \]

Term         Estimate  Std. Error   t value  Pr(>|t|)
(Intercept)   95.6879      0.2270  421.5444   <0.0001
cBMI           1.4012      0.1093   12.8229   <0.0001
DM             4.1688      0.1434   29.0730   <0.0001
cBMI:DM        0.7917      0.0687   11.5252   <0.0001
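Centering changes what the lower-order coefficients mean, not the fitted lines. The sketch below uses simulated data (hypothetical values, with DM stored as a two-level factor, which is an assumption) to illustrate two points: the slope and interaction estimates are unchanged by centering, and the DM coefficient becomes the group difference at mean BMI rather than at BMI = 0.

```r
# Simulated stand-in for the BP data (hypothetical values, not data/BP.csv)
set.seed(38)
n   <- 500
BMI <- rnorm(n, mean = 25, sd = 3)
DM  <- factor(sample(1:2, n, replace = TRUE))
SBP <- 45 + 2.2 * BMI + ifelse(DM == "2", -15 + 0.8 * BMI, 0) + rnorm(n, sd = 1.6)
BP  <- data.frame(SBP, BMI, DM)
BP$cBMI <- BP$BMI - mean(BP$BMI)

b_raw <- coef(lm(SBP ~ BMI  + DM + BMI:DM,  data = BP))
b_ctr <- coef(lm(SBP ~ cBMI + DM + cBMI:DM, data = BP))
print(round(b_raw, 4))
print(round(b_ctr, 4))

# Differences in the slope and interaction estimates (numerically zero)
slope_diff <- c(b_raw["BMI"] - b_ctr["cBMI"],
                b_raw["BMI:DM2"] - b_ctr["cBMI:DM2"])

# Group difference at mean BMI, rebuilt from the uncentered fit
dm_at_mean <- b_raw["DM2"] + b_raw["BMI:DM2"] * mean(BP$BMI)
print(c(rebuilt = unname(dm_at_mean), centered_fit = unname(b_ctr["DM2"])))
```

In the table above the rows are labelled DM and cBMI:DM rather than DM2 and cBMI:DM2, which suggests that DM entered this fit as a numeric 1/2 code after the data were re-read. Under that reading the cBMI row is the slope at DM = 0: the slope at DM = 1 is 1.4012 + 0.7917 = 2.1929 and at DM = 2 it is 1.4012 + 2(0.7917) = 2.9846, in agreement with 38.2. The DM estimate of 4.1688 is then the difference between the two groups at mean BMI.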


38.4 Plot

Note that the slope of SBP on BMI differs between the Non-Diabetic and Diabetic groups. The model output also suggests that this difference in slopes between the two DM levels is statistically significant (BMI:DM interaction, t = 11.53). At this stage, however, the model needs further investigation alongside other predictors. The green dashed vertical line shows the mean BMI in the data.
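A plot of this kind could be drawn with base R graphics roughly as follows. This is a minimal sketch on simulated data, not the code behind the original figure: the data values, the colours and the legend labels "DM = 1" and "DM = 2" are all assumptions.

```r
# Simulated stand-in for the BP data (hypothetical values, not data/BP.csv)
set.seed(38)
n   <- 500
BMI <- rnorm(n, mean = 25, sd = 3)
DM  <- factor(sample(1:2, n, replace = TRUE))
SBP <- 45 + 2.2 * BMI + ifelse(DM == "2", -15 + 0.8 * BMI, 0) + rnorm(n, sd = 1.6)
BP  <- data.frame(SBP, BMI, DM)

fm <- lm(SBP ~ BMI + DM + BMI:DM, data = BP)
b  <- coef(fm)

# Scatter plot coloured by DM level
cols <- c("steelblue", "firebrick")
plot(SBP ~ BMI, data = BP, col = cols[as.integer(BP$DM)], pch = 16, cex = 0.6,
     xlab = "BMI", ylab = "SBP")

# Fitted line for each DM level, built from the interaction model coefficients
abline(a = b["(Intercept)"], b = b["BMI"], col = cols[1], lwd = 2)
abline(a = b["(Intercept)"] + b["DM2"], b = b["BMI"] + b["BMI:DM2"],
       col = cols[2], lwd = 2)

# Green dashed vertical line at the mean BMI
abline(v = mean(BP$BMI), col = "green", lty = 2)
legend("topleft", legend = c("DM = 1", "DM = 2"), col = cols, lwd = 2, bty = "n")
```

The vertical distance between the two fitted lines where they meet the green line is the DM effect at mean BMI, which is the quantity the centered model in 38.3 reports directly.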