Section 29 ANOVA Table: Simple R implementation

29.1 Analysis of Variance Table

\[ \large fm \leftarrow lm(SBP \sim Group, \space data=BP) \]

\[ \large anova(fm) \]

	Df	Sum Sq	Mean Sq	F value	Pr(>F)
Group	3	1521.638	507.2128	7.4983	5e-04
Residuals	36	2435.184	67.6440	NA	NA
Total	39	3956.822	NA	NA	NA
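The data frame `BP` is not shown in this section, so the two calls above cannot be run as they stand. The sketch below simulates a stand-in: the group labels, means, standard deviation and seed are all hypothetical, chosen only to give four groups of ten observations (consistent with the Df column above). Note that `anova()` returns only the `Group` and `Residuals` rows. The `Total` row here is appended by summing the columns, which is an assumption about how the table above was produced.

```r
# Hypothetical stand-in for BP: 4 groups x 10 observations (simulated values)
set.seed(1)
BP <- data.frame(
  Group = factor(rep(c("A", "B", "C", "D"), each = 10)),
  SBP   = rnorm(40, mean = rep(c(120, 125, 130, 135), each = 10), sd = 8)
)

fm  <- lm(SBP ~ Group, data = BP)
tab <- as.data.frame(anova(fm))   # rows: Group, Residuals

# Append a Total row: Df and Sum Sq add up; the other columns stay NA
total <- data.frame(Df = sum(tab$Df), `Sum Sq` = sum(tab$`Sum Sq`),
                    `Mean Sq` = NA, `F value` = NA, `Pr(>F)` = NA,
                    check.names = FALSE, row.names = "Total")
tab <- rbind(tab, total)
print(tab)
```

The numbers will differ from the table above because the data are simulated; only the layout is meant to match.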

29.2 Explanation

Degrees of freedom (df)

\(\large n\) = observations per group (the design is assumed to be balanced, so every group has the same \(\large n\))

\(\large g\) = number of groups

Group df = \(\large (g - 1)\)

Residual df = \(\large g(n - 1)\)

Total df = Group df + Residual df = \(\large gn - 1\)

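g <- nlevels(BP$Group)      # number of groups (Group must be a factor)

n <- nrow(BP)/g             # observations per group; valid only for a balanced design

df.g <- g - 1               # group (treatment) df

df.error <- g * (n - 1)     # residual df

df.total <- g * n - 1       # total df = df.g + df.error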
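As a check against the table in 29.1: its Df column (3 and 36) is consistent with g = 4 groups of n = 10 observations, assuming a balanced design. These two values are inferred from the table rather than stated in the text.

```r
g <- 4    # inferred from Group Df = 3 = g - 1
n <- 10   # inferred from Residual Df = 36 = g * (n - 1)

df.g     <- g - 1         # 3
df.error <- g * (n - 1)   # 36
df.total <- g * n - 1     # 39

# The group and residual df partition the total df
stopifnot(df.g + df.error == df.total)
c(Group = df.g, Residuals = df.error, Total = df.total)
```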

Sum of Squares due to Treatment (Group)

For each group calculate: n * (Group mean - overall mean)²
Add the values for the different groups together

\[ \large SST = n\sum\limits_{i=1}^{g} (\bar{y}_i-\bar{y})^2 \]

y.bar <- mean(BP$SBP, na.rm = TRUE)

yi.bar <- tapply(BP$SBP, INDEX = BP$Group, FUN = mean, na.rm = TRUE)

SST <- n * sum((yi.bar - y.bar)^2)
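To see the calculation on numbers small enough to check by hand, here is a minimal sketch on a toy data set. The data frame `toy` and its values are hypothetical and unrelated to `BP`.

```r
# Toy data (hypothetical): 3 groups of 3 observations
toy <- data.frame(
  Group = factor(rep(c("A", "B", "C"), each = 3)),
  y     = c(1, 2, 3,  4, 5, 6,  7, 8, 9)
)

n      <- 3
y.bar  <- mean(toy$y)                       # overall mean: 5
yi.bar <- tapply(toy$y, toy$Group, mean)    # group means: 2, 5, 8

# n * squared deviation of each group mean from the overall mean, summed
SST <- n * sum((yi.bar - y.bar)^2)          # 3 * (9 + 0 + 9) = 54
SST
```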

Sum of Squares due to Error

Residual Sum of Squares

For each observation calculate:

(Observed value - group mean)²
Add the values for the different observations together

\[ \large SSE = \sum\limits_{i=1}^{g} \sum\limits_{j=1}^{n} (y_{ij}-\bar{y}_i)^2 \]

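Total Sum of Squares = SS due to Treatment + SS due to Error, so the error sum of squares can also be obtained by subtracting the treatment sum of squares from the total: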

yij <- BP$SBP                 # all observations

TSS <- sum((yij - y.bar)^2)   # total SS: squared deviations from the overall mean

SSE <- TSS - SST              # error SS by subtraction
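The code above obtains SSE by subtraction rather than from the double sum in the formula. A sketch of the direct route, using `ave()` to attach each observation's own group mean, is shown here on a hypothetical toy data set (not `BP`) so that the two routes can be compared.

```r
# Toy data (hypothetical): 3 groups of 3 observations
toy <- data.frame(
  Group = factor(rep(c("A", "B", "C"), each = 3)),
  y     = c(1, 2, 3,  4, 5, 6,  7, 8, 9)
)

# Direct route: (observed value - group mean)^2, summed over all observations
group.mean <- ave(toy$y, toy$Group)          # ave() uses mean by default
SSE.direct <- sum((toy$y - group.mean)^2)    # 3 * (1 + 0 + 1) = 6

# Subtraction route, as in the text
TSS <- sum((toy$y - mean(toy$y))^2)                                # 60
SST <- 3 * sum((tapply(toy$y, toy$Group, mean) - mean(toy$y))^2)   # 54

stopifnot(isTRUE(all.equal(SSE.direct, TSS - SST)))
SSE.direct
```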

Mean Squares

Mean square = Sum of squares / degrees of freedom

\(\large MS = SS / df\)

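MST <- SST/df.g        # treatment (group) mean square

MSE <- SSE/df.error    # error (residual) mean square

MS <- TSS/df.total     # total mean square, i.e. the sample variance of SBP; anova() does not report it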
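As a worked check, the Mean Sq column of the table in 29.1 can be reproduced from its Sum Sq and Df columns. The inputs below are copied from that table, so the results agree only up to the rounding of the printed values.

```r
# Values copied from the table in 29.1
SST <- 1521.638; df.g     <- 3
SSE <- 2435.184; df.error <- 36

MST <- SST / df.g        # about 507.213
MSE <- SSE / df.error    # about 67.644

round(c(MST = MST, MSE = MSE), 3)
```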

F-value (Variance Ratio)

F value = Treatment MS / Residual MS

Pr(>F)

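P-value: the probability of obtaining a variance ratio at least as large as the one observed, assuming the null hypothesis that the treatment means are all equal is true.

Under the null hypothesis the variance ratio has an F distribution with \(\large g - 1\) and \(\large g(n - 1)\) degrees of freedom.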

F.stat <- MST/MSE

pF <- pf(q = F.stat, df1 = df.g, df2 = df.error, lower.tail = FALSE)
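The same two steps can be checked against the table in 29.1 by plugging in its printed mean squares and degrees of freedom. Because the inputs are rounded, the result should be close to, but not necessarily identical to, the F value and Pr(>F) shown there.

```r
# Values copied from the table in 29.1
MST <- 507.2128; MSE <- 67.6440
df.g <- 3; df.error <- 36

F.stat <- MST / MSE    # about 7.498

# Upper-tail area of the F(3, 36) distribution beyond the observed ratio
pF <- pf(q = F.stat, df1 = df.g, df2 = df.error, lower.tail = FALSE)

round(F.stat, 4)
signif(pF, 1)
```

Setting `lower.tail = FALSE` returns the area to the right of `F.stat`, which is the quantity the P-value definition above asks for.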

DF <- data.frame(df = c(df.g, df.error, df.total),
                 SS = c(SST, SSE, TSS),
                 MS = c(MST, MSE, MS),
                 Fstat = c(F.stat, NA, NA),
                 Prob = c(pF, NA, NA))

row.names(DF) <- c("Group", "Error", "Total")

DF

anova(fm)
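Printing `DF` and `anova(fm)` one after the other allows a visual comparison. A programmatic comparison is sketched below with `all.equal()`. Since `BP` is not shown in this section, the sketch runs on simulated stand-in data (hypothetical labels, means and seed), so it checks the method rather than the numbers in 29.1.

```r
# Hypothetical stand-in for BP: balanced, 4 groups x 10 observations
set.seed(42)
BP <- data.frame(
  Group = factor(rep(c("A", "B", "C", "D"), each = 10)),
  SBP   = rnorm(40, mean = rep(c(120, 125, 130, 135), each = 10), sd = 8)
)
ref <- anova(lm(SBP ~ Group, data = BP))   # R's own table

# By-hand quantities, following the steps in 29.2
g <- nlevels(BP$Group); n <- nrow(BP) / g
y.bar  <- mean(BP$SBP)
yi.bar <- tapply(BP$SBP, BP$Group, mean)
SST    <- n * sum((yi.bar - y.bar)^2)
SSE    <- sum((BP$SBP - y.bar)^2) - SST
F.stat <- (SST / (g - 1)) / (SSE / (g * (n - 1)))
pF     <- pf(F.stat, g - 1, g * (n - 1), lower.tail = FALSE)

# Each comparison is TRUE when the values agree to numerical tolerance
stopifnot(
  isTRUE(all.equal(unname(ref$`Sum Sq`), c(SST, SSE))),
  isTRUE(all.equal(unname(ref$`F value`[1]), F.stat)),
  isTRUE(all.equal(unname(ref$`Pr(>F)`[1]), pF))
)
```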