Section 38 MLR - Interaction of Continuous & Categorical variables: Model
38.1 Statistical Model
\[ \large y_{i} = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{1i}x_{2i} + \epsilon_{i} \]
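To make the role of each coefficient concrete, the model can be written separately for each group. This is a sketch under an assumption the text does not state: \(x_{1i}\) is the continuous predictor (BMI in this section) and \(x_{2i}\) is a 0/1 dummy variable marking the second level of the categorical predictor.

\[ \large x_{2i} = 0: \quad y_{i} = \beta_0 + \beta_1 x_{1i} + \epsilon_{i} \]
\[ \large x_{2i} = 1: \quad y_{i} = (\beta_0 + \beta_2) + (\beta_1 + \beta_3) x_{1i} + \epsilon_{i} \]

Under this coding, \(\beta_2\) is the difference in intercepts between the two groups and \(\beta_3\) is the difference in slopes, so a test of \(\beta_3 = 0\) is a test of parallel lines.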
\[ \large fm \leftarrow lm(SBP \sim BMI + DM + BMI:DM, \space data=BP) \]
\[ \large anova(fm) \]
Term | Df | Sum Sq | Mean Sq | F value | Pr(>F) |
---|---|---|---|---|---|
BMI | 1 | 15103.0386 | 15103.0386 | 5883.969 | < 0.0001 |
DM | 1 | 2168.0726 | 2168.0726 | 844.656 | < 0.0001 |
BMI:DM | 1 | 340.9495 | 340.9495 | 132.830 | < 0.0001 |
Residuals | 496 | 1273.1384 | 2.5668 | NA | NA |
Total | 499 | 18885.1991 | NA | NA | NA |
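As a check on how the table fits together, the F value for the interaction row and the overall proportion of variance explained can be recomputed from the printed sums of squares. The arithmetic below uses only the numbers in the table; note that anova() on a single lm fit reports sequential (Type I) sums of squares, so each row is adjusted only for the terms listed above it.

\[ \large F_{BMI:DM} = \frac{340.9495 / 1}{1273.1384 / 496} = \frac{340.9495}{2.5668} \approx 132.83 \]
\[ \large R^2 = 1 - \frac{SS_{Residuals}}{SS_{Total}} = 1 - \frac{1273.1384}{18885.1991} \approx 0.933 \]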
38.2 Estimates: Effects
\[ \large summary(fm) \]
Term | Estimate | Std. Error | t value | Pr(>\|t\|) |
---|---|---|---|---|
(Intercept) | 45.0607 | 1.2267 | 36.7343 | < 0.0001 |
BMI | 2.1929 | 0.0491 | 44.6911 | < 0.0001 |
DM2 | -15.6132 | 1.7223 | -9.0655 | < 0.0001 |
BMI:DM2 | 0.7917 | 0.0687 | 11.5252 | < 0.0001 |
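The estimates can be turned into one fitted line per group. This is a sketch that assumes R's default treatment coding, so that DM2 is a 0/1 dummy for the second level of DM and the first level is the reference; which level corresponds to the Diabetic condition is not stated in the output and is left open here.

\[ \large DM = 1: \quad \widehat{SBP} = 45.0607 + 2.1929 \, BMI \]
\[ \large DM = 2: \quad \widehat{SBP} = (45.0607 - 15.6132) + (2.1929 + 0.7917) \, BMI = 29.4475 + 2.9846 \, BMI \]

Both intercepts, and therefore the DM2 estimate of -15.6132, describe the groups at BMI = 0, which lies far outside any realistic range. Setting the two lines equal gives a crossing point at BMI = 15.6132 / 0.7917, roughly 19.7, so the sign of the group difference depends on where along the BMI axis it is evaluated.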
38.3 Estimates: Effects with centered variables
BP <- read.csv('data/BP.csv')
# Center BMI at its sample mean (scale = FALSE: subtract the mean, do not rescale)
BP$cBMI <- as.numeric(scale(BP$BMI, center = TRUE, scale = FALSE))
fm <- lm(SBP ~ cBMI + DM + cBMI:DM, data = BP)
\[ \large summary(fm) \]
Term | Estimate | Std. Error | t value | Pr(>\|t\|) |
---|---|---|---|---|
(Intercept) | 95.6879 | 0.2270 | 421.5444 | < 0.0001 |
cBMI | 1.4012 | 0.1093 | 12.8229 | < 0.0001 |
DM | 4.1688 | 0.1434 | 29.0730 | < 0.0001 |
cBMI:DM | 0.7917 | 0.0687 | 11.5252 | < 0.0001 |
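A note on reading this table: the rows are labelled DM and cBMI:DM rather than DM2 and cBMI:DM2, which suggests (this is an inference from the labels, not something the text states) that DM entered this fit as a numeric 1/2 code after the CSV was re-read. Under that reading the cBMI row is the slope extrapolated to DM = 0, and 1.4012 + 0.7917 = 2.1929 recovers the BMI slope from 38.2. The sketch below shows what centering does when DM is kept as a factor. It uses simulated data as a stand-in for data/BP.csv; the simulated values and the names fm_raw, fm_ctr, b_raw, b_ctr and m are illustrative assumptions.

```r
# Simulated stand-in for data/BP.csv (hypothetical values, not the course data)
set.seed(38)
n  <- 500
BP <- data.frame(
  BMI = rnorm(n, mean = 25, sd = 3),
  DM  = factor(sample(1:2, n, replace = TRUE))  # factor, so lm() builds a DM2 dummy
)
BP$SBP <- 45 + 2.2 * BP$BMI +
  ifelse(BP$DM == "2", -15.6 + 0.8 * BP$BMI, 0) +
  rnorm(n, sd = 1.6)

# Center BMI at its sample mean
BP$cBMI <- BP$BMI - mean(BP$BMI)

fm_raw <- lm(SBP ~ BMI  + DM + BMI:DM,  data = BP)
fm_ctr <- lm(SBP ~ cBMI + DM + cBMI:DM, data = BP)
b_raw  <- coef(fm_raw)
b_ctr  <- coef(fm_ctr)
m      <- mean(BP$BMI)

# Centering leaves the BMI slope and the interaction unchanged ...
slopes_match <- all.equal(
  unname(b_ctr[c("cBMI", "cBMI:DM2")]),
  unname(b_raw[c("BMI", "BMI:DM2")])
)

# ... and moves the intercept and the DM2 effect from BMI = 0 to BMI = mean(BMI)
shift_match <- all.equal(
  unname(b_ctr[c("(Intercept)", "DM2")]),
  unname(b_raw[c("(Intercept)", "DM2")] + m * b_raw[c("BMI", "BMI:DM2")])
)

print(slopes_match)
print(shift_match)
```

With a factor DM, the centered DM2 coefficient is the estimated difference between the two groups at the mean BMI, which is usually the more interpretable quantity. The interaction estimate of 0.7917 is identical in both tables above, as expected.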
38.4 Plot
Note that the slope of SBP on BMI differs between the Non-Diabetic and Diabetic groups. The model output also indicates that this difference in slopes between the two levels of DM is statistically significant. At this stage, however, we still need to investigate the model further along with other predictors. The green dashed vertical line shows the mean BMI of the sample.
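A plot of this kind can be sketched in base R as follows. This is not the code behind the original figure; it is a minimal illustration on simulated data, and the colours, the simulated values, and the assumption that DM = 1 is Non-Diabetic and DM = 2 is Diabetic are all hypothetical.

```r
# Simulated stand-in for data/BP.csv (hypothetical values, not the course data)
set.seed(2)
n  <- 200
BP <- data.frame(
  BMI = runif(n, min = 18, max = 32),
  DM  = factor(sample(1:2, n, replace = TRUE))
)
BP$SBP <- 45 + 2.2 * BP$BMI +
  ifelse(BP$DM == "2", -15.6 + 0.8 * BP$BMI, 0) +
  rnorm(n, sd = 1.6)

fm <- lm(SBP ~ BMI + DM + BMI:DM, data = BP)
b  <- coef(fm)

# Assumed labelling: level 1 = Non-Diabetic, level 2 = Diabetic
cols <- c("steelblue", "firebrick")

# Scatter plot, coloured by DM level (a factor indexes by its integer codes)
plot(SBP ~ BMI, data = BP, col = cols[BP$DM], pch = 16,
     xlab = "BMI", ylab = "SBP")

# Fitted line for the reference level (DM = 1)
abline(a = b["(Intercept)"], b = b["BMI"], col = cols[1], lwd = 2)

# Fitted line for DM = 2: intercept and slope each shifted by the DM2 terms
abline(a = b["(Intercept)"] + b["DM2"],
       b = b["BMI"] + b["BMI:DM2"], col = cols[2], lwd = 2)

# Green dashed vertical line at the mean BMI
abline(v = mean(BP$BMI), col = "green", lty = 2)

legend("topleft", legend = c("Non-Diabetic (DM = 1)", "Diabetic (DM = 2)"),
       col = cols, pch = 16, bty = "n")
```

The vertical distance between the two lines at the green dashed line is the group difference at the mean BMI, which is what the DM coefficient estimates once BMI is centered and DM is treated as a factor.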