Section 26 LM: ANOVA

26.1 Analysis of Variance Table


\[ \large fm \leftarrow lm(SBP \sim BMI, \space data=BP) \]

\[ \large anova(fm) \]


Source       Df     Sum Sq      Mean Sq    F value   Pr(>F)
BMI           1   15103.04   15103.0386   1988.629        0
Residuals   498    3782.16       7.5947         NA       NA
Total       499   18885.20           NA         NA       NA
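R's `anova()` prints only the term and Residuals rows, so the Total row above has to be added separately. The following is a minimal sketch of one way to do that. It assumes a simulated stand-in for the `BP` data (the real data set is not shown here), so its numbers will differ from the table above. The simulation coefficients and the `total` helper object are hypothetical.

```r
# Hypothetical stand-in for the BP data set (the real data are not shown here).
set.seed(1)
n  <- 500
BP <- data.frame(BMI = rnorm(n, mean = 25, sd = 4))
BP$SBP <- 90 + 1.4 * BP$BMI + rnorm(n, sd = 2.75)

fm  <- lm(SBP ~ BMI, data = BP)
tab <- as.data.frame(anova(fm))   # rows: BMI, Residuals

# anova() has no Total row; build one by summing the Df and Sum Sq columns.
total <- data.frame(Df = sum(tab$Df), `Sum Sq` = sum(tab$`Sum Sq`),
                    `Mean Sq` = NA, `F value` = NA, `Pr(>F)` = NA,
                    check.names = FALSE, row.names = "Total")
tab <- rbind(tab, total)
print(tab)
```

Converting to a plain data frame makes the empty cells print as `NA`, as in the table above.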

26.2 Explanation


Degrees of freedom (df)

\(\large n\) = Total number of observations

Regression df = BMI df = \(\large 1\)

Residual df = \(\large n - 1 - 1 = n - 2\) (one df for the intercept and one for the slope)

Total df = Regression df + Residual df = \(\large n - 1\)
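As a check against the table in 26.1, a Total df of 499 implies \(\large n = 500\) observations, so:

\[ \large \text{Regression df} = 1, \qquad \text{Residual df} = 500 - 2 = 498, \qquad \text{Total df} = 1 + 498 = 499 \]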


Total Sum of Squares (TSS)

\[ \large TSS = \sum\limits_{i=1}^{n} (y_i-\bar y)^2 = S_{yy}\]


Sum of Squares due to Regression (SSb)

\[ \large SSb = \hat\beta_1\sum\limits_{i=1}^{n} (x_i-\bar x)(y_i-\bar y) = \hat\beta_1S_{xy}\]


Residual Sum of Squares (RSS)

\[ \large RSS = TSS - SSb = S_{yy} - \hat\beta_1S_{xy} \]
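The three sums of squares can be computed directly from their definitions and compared with the `Sum Sq` column of `anova()`. This is a minimal sketch on simulated data. The vectors `x` and `y` are hypothetical stand-ins for BMI and SBP, and the slope is taken as the usual least-squares estimate \(\hat\beta_1 = S_{xy}/S_{xx}\).

```r
set.seed(1)
n <- 500
x <- rnorm(n, mean = 25, sd = 4)          # hypothetical BMI values
y <- 90 + 1.4 * x + rnorm(n, sd = 2.75)   # hypothetical SBP values

Sxx <- sum((x - mean(x))^2)
Sxy <- sum((x - mean(x)) * (y - mean(y)))
Syy <- sum((y - mean(y))^2)

b1  <- Sxy / Sxx     # least-squares slope estimate
TSS <- Syy           # total sum of squares
SSb <- b1 * Sxy      # sum of squares due to regression
RSS <- TSS - SSb     # residual sum of squares

# Compare with the Sum Sq column from anova()
fm <- lm(y ~ x)
ss <- anova(fm)[["Sum Sq"]]   # c(regression SS, residual SS)
all.equal(c(SSb, RSS), ss)
```

The same identity holds in the table in 26.1: 18885.20 - 15103.04 = 3782.16.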



Mean Squares

Mean square = Sum of squares / degrees of freedom

\(\large MS = SS / df\)
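Applied to the table in 26.1, this gives:

\[ \large \text{Regression MS} = \frac{15103.04}{1} = 15103.04, \qquad \text{Residual MS} = \frac{3782.16}{498} \approx 7.5947 \]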


F-value (Variance Ratio)

F value = Regression MS / Residual MS
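With the mean squares from the table in 26.1:

\[ \large F = \frac{15103.0386}{7.5947} \approx 1988.6 \]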


Pr(>F)

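P-value: the probability of obtaining a variance ratio at least as large as the one observed, under the null hypothesis that the slope coefficient is zero (\(\large \beta_1 = 0\)).

Under the null hypothesis, the variance ratio follows an F distribution with \(\large 1\) and \(\large n - 2\) degrees of freedom (the regression and residual df).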
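The p-value can be recovered from the F value and the two degrees of freedom with `pf()`. This sketch takes the values from the table in 26.1. The variable names are illustrative only.

```r
F_value <- 1988.629   # F value from the ANOVA table
df1 <- 1              # regression df
df2 <- 498            # residual df

# Upper-tail probability P(F >= F_value) under the F(df1, df2) distribution
p_value <- pf(F_value, df1, df2, lower.tail = FALSE)
p_value
```

The result is far below any ordinary display precision, which would be consistent with the table showing `0` in the Pr(>F) column after rounding. It should be read as "very small", not as exactly zero.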


Error Variance = Residual Mean Square

\(\large \hat\sigma^2 = Residual \space MS \space = MSE\)
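The equality between the error variance estimate and the residual mean square can be checked in R. This is a minimal sketch on simulated data, since the real `BP` data are not shown here. The simulation settings are hypothetical. `sigma()` returns the residual standard error of the fitted model, so its square should match the residual mean square.

```r
set.seed(1)
n  <- 500
BP <- data.frame(BMI = rnorm(n, mean = 25, sd = 4))   # hypothetical data
BP$SBP <- 90 + 1.4 * BP$BMI + rnorm(n, sd = 2.75)

fm <- lm(SBP ~ BMI, data = BP)

mse          <- anova(fm)["Residuals", "Mean Sq"]      # residual mean square
rss_over_df  <- sum(resid(fm)^2) / df.residual(fm)     # RSS / (n - 2)
sigma_hat_sq <- sigma(fm)^2                            # squared residual standard error

all.equal(mse, rss_over_df)
all.equal(mse, sigma_hat_sq)
```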