Section 20 Multiple Linear Regression (MLR): lm implementation

20.1 Statistical Model

20.1.1 Null Model

\[ \large y_{i} = \beta_0 + \epsilon_{i} \]


20.1.2 Regression Model

\[ \large y_{i} = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \epsilon_{i} \]


20.2 Syntax

20.2.1 Null Model

```r
fm <- lm(SBP ~ 1, data = BP)
```


20.2.2 Regression Model

```r
fm <- lm(SBP ~ 1 + BMI + Age, data = BP)  # intercept written explicitly

fm <- lm(SBP ~ BMI + Age, data = BP)      # equivalent: the intercept is implicit
```
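The `BP` data frame used in the syntax above is not a built-in dataset; as a hypothetical worked example, we can simulate one with the same variable names (`SBP`, `BMI`, `Age`) and fit both models:

```r
# Hypothetical example: simulate a BP data frame with the variable
# names used above (the data themselves are made up).
set.seed(42)
n  <- 100
BP <- data.frame(BMI = rnorm(n, mean = 25, sd = 4),
                 Age = rnorm(n, mean = 50, sd = 10))
BP$SBP <- 100 + 1.5 * BP$BMI + 0.5 * BP$Age + rnorm(n, sd = 8)

fm0 <- lm(SBP ~ 1, data = BP)          # null (intercept-only) model
fm  <- lm(SBP ~ BMI + Age, data = BP)  # regression model
coef(fm)                               # estimates of beta_0, beta_1, beta_2
```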


20.3 Assumptions

  • \(y\) is related to \(x_1, x_2\) by the multiple linear regression model:

\[ \large y_{i} = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \epsilon_{i}, \space i=1,...,n\] \[ \large E(y \mid X_1=x_{1i}, X_2=x_{2i}) = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} \]

  • The errors \(\epsilon_1, \epsilon_2, ..., \epsilon_n\) are independent of each other.

  • The errors \(\epsilon_1, \epsilon_2, ..., \epsilon_n\) have a common variance \(\sigma^2\).

  • The errors are normally distributed with a mean of 0 and variance \(\sigma^2\), that is:

\[ \large \epsilon \sim N(0,\sigma^2) \]
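The assumptions above are usually checked through residual diagnostics on the fitted object. A minimal sketch, using the built-in `mtcars` data as a stand-in since `BP` is not shipped with R:

```r
# Residual diagnostics for a fitted lm object (mtcars as a stand-in dataset)
fm <- lm(mpg ~ wt + hp, data = mtcars)

par(mfrow = c(2, 2))
plot(fm)                 # residuals vs fitted, Q-Q, scale-location, leverage plots
shapiro.test(resid(fm))  # formal test of the normality-of-errors assumption
```

The residuals-vs-fitted plot speaks to the common-variance assumption, and the Q-Q plot to normality.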


20.4 Hypothesis

Intercept

\[ \large H_0: \beta_0 = 0 \] \[ \large H_A: \beta_0 \ne 0 \]


Regression coefficient for the k-th predictor

\[ \large H_0: \beta_k = 0 \]

\[ \large H_A: \beta_k \ne 0 \]
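R reports a t-test of each of these hypotheses in the coefficient table. A sketch, again using `mtcars` as a stand-in dataset:

```r
# Per-coefficient t-tests of H_0: beta_k = 0 (mtcars as a stand-in dataset)
fm <- lm(mpg ~ wt + hp, data = mtcars)
summary(fm)$coefficients  # columns: Estimate, Std. Error, t value, Pr(>|t|)
confint(fm)               # 95% CIs; an interval excluding 0 rejects H_0 at alpha = 0.05
```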


20.5 Investigating the fitted lm object

```r
anova(fm)

summary(fm)
```
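The two calls answer different questions: `anova()` returns a sequential (Type I) sums-of-squares table, so the result depends on the order of terms in the formula, while `summary()` gives the coefficient t-tests, residual standard error, \(R^2\), and the overall F-test. A sketch using `mtcars` as a stand-in dataset:

```r
fm <- lm(mpg ~ wt + hp, data = mtcars)  # mtcars as a stand-in for BP
anova(fm)    # sequential (Type I) SS: one row per term, order-dependent
summary(fm)  # coefficient table, residual SE, R-squared, overall F-test
```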