Section 44 Statistical Power

Figure: A simple representation of the distributions under the Null (\(H_0\)) and Alternative (\(H_1\)) Hypotheses when testing a sample mean against the hypothesised population mean

Type I error (False Positive) and Type II error (False Negative)
The Statistical Power of any test is defined as the probability of making the correct decision (e.g. detecting a difference) when the assumption being tested (e.g. no difference) is false.
The Statistical Power is equivalent to the probability of NOT making a Type II error (False Negative).
Equivalently, the Statistical Power is the probability of rejecting the null hypothesis \(\large H_0\) when the alternative hypothesis \(\large H_1\) is true.
If we repeat an experiment many times, the Statistical Power is the expected proportion of experiments for which:
- the P-value < 0.05
- equivalently, the 95% confidence interval for the parameter excludes the null value (e.g. a difference of zero)
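
The repeated-experiment view of power can be sketched with a small simulation. This is a minimal illustration, not a prescribed method: the group size, true difference, standard deviation, number of simulations and seed are all assumed values chosen for the example.

```r
# Estimate power by simulation: the share of repeated experiments with p < 0.05.
# All numeric settings below are illustrative assumptions.
set.seed(1)            # arbitrary seed, for reproducibility only
n_sims  <- 2000        # number of simulated experiments
n_group <- 12          # observations per group
delta   <- 5           # true difference between the group means
sigma   <- 4           # common standard deviation

p_values <- replicate(n_sims, {
  a <- rnorm(n_group, mean = 0,     sd = sigma)
  b <- rnorm(n_group, mean = delta, sd = sigma)
  t.test(a, b, var.equal = TRUE)$p.value
})

# Proportion of simulated experiments that reject H0 at the 5% level
simulated_power <- mean(p_values < 0.05)

# Analytic power for the same settings, for comparison
analytic_power <- power.t.test(n = n_group, delta = delta, sd = sigma,
                               sig.level = 0.05)$power

c(simulated = simulated_power, analytic = analytic_power)
```

The two numbers should agree up to simulation noise, which shrinks as `n_sims` grows.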
Statistical Power depends on:
- Sample size
- Variability in the measured outcome(s)
- Biologically significant difference expected from the study
- Experimental design
- Acceptable Type I and Type II error rates
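
The effect of several of these factors can be explored numerically. The sketch below assumes a two-sample t-test and uses `power.t.test()` from base R; the helper `power_at()` and all numeric values are hypothetical choices for illustration, and experimental design is not covered.

```r
# Power of a two-sample t-test as each input changes (values are assumptions).
power_at <- function(n, delta, sd, sig.level = 0.05) {
  power.t.test(n = n, delta = delta, sd = sd, sig.level = sig.level)$power
}

# Larger samples increase power.
by_n <- sapply(c(5, 10, 20, 40), power_at, delta = 5, sd = 4)

# More variable outcomes decrease power.
by_sd <- sapply(c(2, 4, 8), function(s) power_at(n = 10, delta = 5, sd = s))

# Larger true differences are easier to detect.
by_delta <- sapply(c(1, 3, 5), function(d) power_at(n = 10, delta = d, sd = 4))

# A stricter significance level (smaller Type I error) lowers power.
by_alpha <- sapply(c(0.10, 0.05, 0.01),
                   function(a) power_at(n = 10, delta = 5, sd = 4, sig.level = a))

round(by_n, 3)
round(by_sd, 3)
round(by_delta, 3)
round(by_alpha, 3)
```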

Post-hoc power calculation, i.e. calculating the statistical power of an experiment after the experiment has been conducted, is NOT correct.

Components of Statistical Power and Sample Size calculation of a test:
- Sample size
- Mean difference
- Variability of measured outcomes
- Level of significance: Type I error (\(\large \alpha\))
- Power of test: 1 - Type II error (\(\large 1 - \beta\))
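
These components are linked: fixing all but one determines the remaining one. As a sketch, `power.t.test()` solves for whichever of `n`, `delta`, `sd`, `sig.level` and `power` is left as `NULL`. The numeric inputs below (15 per group, a standard deviation of 4, and so on) are assumptions for illustration.

```r
# power.t.test() solves for the one argument left as NULL.
# n, delta and power are NULL by default; sd and sig.level have defaults
# and must be set to NULL explicitly if they are the unknown.

# Power achieved with 15 observations per group
res_power <- power.t.test(n = 15, delta = 5, sd = 4, sig.level = 0.05)

# Smallest difference detectable with 15 per group at 80% power
res_delta <- power.t.test(n = 15, sd = 4, sig.level = 0.05, power = 0.80)

# Sample size per group needed for 80% power
res_n <- power.t.test(delta = 5, sd = 4, sig.level = 0.05, power = 0.80)

res_power$power
res_delta$delta
res_n$n
```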

44.1 Example

We wish to compare treatment A with treatment B, i.e. compare their means, where a two-sample t-test is appropriate.
The standard deviation for both treatments is approximately 4.
Level of significance: Type I error (\(\large \alpha\)) = 0.05, i.e. the probability of concluding there is a difference between A and B when in fact there is none is 0.05.
Mean difference: \(\large \delta\) = 5, i.e. we wish to detect a true difference of 5 units between the means of A and B.
Power of test: 1 - Type II error (\(\large 1 - \beta\)) = 0.80, i.e. we want the Power of the test to be 0.80.

power.t.test(delta = 5, sd = 4, sig.level = 0.05, power = 0.80)


     Two-sample t test power calculation 

              n = 11.09423
          delta = 5
             sd = 4
      sig.level = 0.05
          power = 0.8
    alternative = two.sided

NOTE: n is number in *each* group
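
As a rough cross-check, the familiar normal-approximation formula \(n \approx 2\,(z_{1-\alpha/2} + z_{1-\beta})^2\,\sigma^2 / \delta^2\) can be evaluated directly. This is only a sketch: it is not the calculation `power.t.test()` performs (that function works with the noncentral t distribution), so the approximate n comes out somewhat smaller. The variable names are chosen for this example.

```r
# Normal-approximation sample size per group for a two-sided, two-sample test.
alpha <- 0.05   # Type I error
power <- 0.80   # 1 - Type II error
delta <- 5      # difference to detect
sigma <- 4      # common standard deviation

z_alpha <- qnorm(1 - alpha / 2)   # critical value for a two-sided test
z_beta  <- qnorm(power)           # quantile corresponding to the target power

n_approx <- 2 * ((z_alpha + z_beta) * sigma / delta)^2

# Value from power.t.test() for the same inputs
n_exact <- power.t.test(delta = delta, sd = sigma,
                        sig.level = alpha, power = power)$n

c(approximate = n_approx, t_based = n_exact)
```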

Conclusion

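Because n = 11.09 must be rounded up to a whole number, we need n = 12 samples for each treatment (24 in total) to achieve a Power of at least 0.80 for detecting a difference of 5 units between the means of A and B. With only 11 per group the Power falls slightly below 0.80.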
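
Because the computed n is not a whole number, it can be worth checking the power actually achieved at the neighbouring whole-number group sizes. A minimal sketch using the same inputs as the example:

```r
# Achieved power at the whole-number group sizes on either side of n = 11.09
achieved <- sapply(c(11, 12), function(n) {
  power.t.test(n = n, delta = 5, sd = 4, sig.level = 0.05)$power
})
names(achieved) <- c("n = 11", "n = 12")
achieved

# Rounding the computed n up gives the smallest group size that meets the target
n_required <- ceiling(power.t.test(delta = 5, sd = 4,
                                   sig.level = 0.05, power = 0.80)$n)
n_required
```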