Section 44 Statistical Power



Figure: A simple representation of the distributions under the Null (\(H_0\)) and Alternative (\(H_1\)) Hypotheses for testing one sample mean against the hypothesised population mean


  • Type I error (False Positive): rejecting \(H_0\) when it is true. Type II error (False Negative): failing to reject \(H_0\) when it is false.

  • The Statistical Power of any test is defined as the probability of making the correct decision (e.g. detecting a difference) when the assumption being tested (e.g. no difference) is false.

  • The Statistical Power is equivalent to the probability of NOT making a Type II error (False Negative), i.e. \(\large 1 - \beta\).

  • Or, the Statistical Power is the probability of rejecting the null hypothesis \(\large H_0\) when the alternative hypothesis \(\large H_1\) is true.

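  • If we repeat an experiment many times when a true difference exists, the Statistical Power is the expected proportion of experiments for which:
    • the P-value < 0.05 (or whichever level of significance was chosen)
    • the 95% confidence interval for the parameter is sufficiently narrow to exclude the null value (e.g. zero difference)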
  • Statistical Power depends on
    • Sample size
    • Variability in the measured outcome(s)
    • Biologically significant difference expected from the study
    • Experimental design
    • Acceptable Type I and Type II error rates
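
The "repeated experiments" view of power, and its dependence on sample size and variability, can be illustrated by simulation. The following is a minimal sketch only: the helper `simulate_power` and all the numbers passed to it (group sizes, difference, standard deviations, number of simulations) are illustrative assumptions, and it assumes normally distributed outcomes compared with an equal-variance two-sample t-test.

```r
# Estimate power as the proportion of simulated experiments with p < alpha.
# simulate_power() is a hypothetical helper written for this illustration.
simulate_power <- function(n, delta, sd, alpha = 0.05, n_sims = 2000) {
  p_values <- replicate(n_sims, {
    a <- rnorm(n, mean = 0, sd = sd)      # group A: no shift
    b <- rnorm(n, mean = delta, sd = sd)  # group B: true difference of delta
    t.test(a, b, var.equal = TRUE)$p.value
  })
  mean(p_values < alpha)
}

set.seed(1)  # for reproducibility of the simulation
power_small <- simulate_power(n = 5,  delta = 5, sd = 4)  # few samples
power_large <- simulate_power(n = 20, delta = 5, sd = 4)  # more samples
power_noisy <- simulate_power(n = 20, delta = 5, sd = 8)  # more variability
print(c(n5 = power_small, n20 = power_large, n20_sd8 = power_noisy))
```

Under these assumptions, the estimated power should rise with the larger sample size and fall when the outcome is more variable.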


  • Post-hoc power calculation, i.e. calculation of the statistical power of an experiment after the experiment is conducted, is NOT correct.


  • Components of Statistical Power and Sample Size calculation of a test:
    • Sample size
    • Mean difference
    • Variability of measured outcomes
    • Level of significance: Type I error (\(\large \alpha\))
    • Power of test: 1 - Type II error (\(\large 1 - \beta\))
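
For a two-sample comparison of means, these components are tied together by a commonly used normal approximation. It is a rough guide only, assuming equal group sizes, a common standard deviation and a two-sided test; the symbols \(n\), \(\sigma\), \(\delta\) and \(z\) are notation introduced here for illustration:

\[
n \approx \frac{2\,(z_{1-\alpha/2} + z_{1-\beta})^2\,\sigma^2}{\delta^2}
\]

where \(n\) is the sample size per group, \(\delta\) the mean difference, \(\sigma\) the standard deviation of the measured outcome, and \(z_q\) the \(q\)-th quantile of the standard normal distribution. Fixing any four of the five components determines the fifth. Exact calculations based on the t distribution give a slightly larger \(n\).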

44.1 Example

  • We wish to compare treatment A with treatment B, i.e. compare their means, where a two-sample t-test is appropriate.

  • The standard deviation for both treatments is approximately 4.

  • Level of significance: Type I error (\(\large \alpha\)) = 0.05, that is, the probability of concluding there is a difference when in fact there is no difference between A and B is 0.05.

  • Mean difference: \(\large \delta\) = 5, i.e. we wish to detect a true difference of 5 units between the means of A and B.

  • Power of test: 1 - Type II error (\(\large 1 - \beta\)) = 0.80, i.e. we want a probability of 0.80 of detecting the difference if it truly exists.


power.t.test(delta = 5, sd = 4, sig.level = 0.05, power = 0.80)

     Two-sample t test power calculation 

              n = 11.09423
          delta = 5
             sd = 4
      sig.level = 0.05
          power = 0.8
    alternative = two.sided

NOTE: n is number in *each* group


Conclusion

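The calculated n = 11.09 must be rounded up to a whole number, so we need n = 12 samples for each treatment (24 in total) to achieve a Power of at least 0.80 for detecting a difference of 5 units between the means of A and B. With only 11 per treatment the Power falls slightly below 0.80.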
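
Because n must be a whole number, it can be useful to check the power actually achieved at the integers either side of the calculated value. A minimal sketch, calling the same `power.t.test` function with `n` supplied and `power` left unspecified (the variable names `power_n11` and `power_n12` are just illustrative):

```r
# Achieved power when n per group is fixed at a whole number
# (delta, sd and sig.level as in the example above)
power_n11 <- power.t.test(n = 11, delta = 5, sd = 4, sig.level = 0.05)$power
power_n12 <- power.t.test(n = 12, delta = 5, sd = 4, sig.level = 0.05)$power
print(round(c(n11 = power_n11, n12 = power_n12), 3))
```

Sample sizes are conventionally rounded up so that the target power is met rather than narrowly missed.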