Section 19 Hypothesis Testing - Two Samples: Steps
19.1 Model
SBP = Overall Mean + Sampling Variability
\[ \large y_{j} = \mu + \epsilon_{j} \]
SBP = Overall mean + Group effect + Sampling variability
\[ \large y_{ij} = \mu + \beta_{i} + \epsilon_{ij} \]
\(\large y_{ij}\) = j-th observation (replicate) in the i-th treatment
\(\large \mu\) = overall mean effect
\(\large \beta_{i}\) = effect of treatment group i
\(\large \epsilon_{ij} \sim NID(0, \sigma^2)\)
\(\large i\) = treatment index; i: 1, 2
\(\large j\) = observation index within treatment i; j: 1 to \(\large n_i\)
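A minimal R sketch of simulating data from this model, using illustrative values for \(\mu\), \(\beta_i\), \(\sigma\), and the group size (these numbers are assumptions of the sketch, not values from the notes):

```r
# Sketch: simulate SBP-like data from y_ij = mu + beta_i + e_ij
set.seed(123)
mu    <- 120          # assumed overall mean
beta  <- c(0, -8)     # assumed treatment effects, i = 1, 2
sigma <- 10           # assumed sd of e_ij ~ NID(0, sigma^2)
n     <- 30           # assumed number of replicates per treatment

y1 <- mu + beta[1] + rnorm(n, mean = 0, sd = sigma)  # treatment 1
y2 <- mu + beta[2] + rnorm(n, mean = 0, sd = sigma)  # treatment 2
```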
19.2 Assumptions
Values within each group are independent and normally distributed
Variances of the two groups are equal (Scenario 1 below; Scenario 2 relaxes this assumption)
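Continuing with the simulated vectors y1 and y2 from the sketch above (hypothetical data, not from the notes), these assumptions can be checked informally in R:

```r
# Normality within each group: Q-Q plots and Shapiro-Wilk tests
qqnorm(y1); qqline(y1)
qqnorm(y2); qqline(y2)
shapiro.test(y1)
shapiro.test(y2)

# Equality of the two variances: F test
var.test(y1, y2)
```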
19.3 Steps of Hypothesis testing
Identify the parameter of interest
Define \(\large H_O\) and \(\large H_A\)
Define a significance level \(\large \alpha\)
Calculate an estimate of the parameter
Determine an appropriate test statistic and its distribution when \(\large H_O\) is correct, then calculate the value of the test statistic from the sample
Obtain the probability under the distribution of the test statistic
Compare the observed probability (p-value) with \(\large \alpha\) and conclude
19.4 Two Samples, Unknown Variance
Inference for a difference in means of two Normal distributions when variances are unknown
19.5 Steps of Hypothesis testing: Details
- Identify the parameter of interest: the difference of the population means, \(\large \mu_1 - \mu_2\)
- Define \(\large H_O\) and \(\large H_A\)
\[\large H_O: \mu_1 = \mu_2\]
\[\large H_A: \mu_1 \ne \mu_2\]
- Define \(\large \alpha\)
\[\large \alpha = 0.05\]
- Calculate an estimate of the parameter
Sample Means: \[ \large \bar{x_1} = \frac{1}{n_1}\sum\limits_{j=1}^{n_1} x_{1j} \] \[ \large \bar{x_2} = \frac{1}{n_2}\sum\limits_{j=1}^{n_2} x_{2j} \]
Standard error of the difference in means:
Scenario 1: Variances of the TWO samples are equal (pooled sample standard deviation)
\[ \large s = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}\]
and the standard error of \(\large \bar{x_1}-\bar{x_2}\) is \(\large s\sqrt{1/n_1+1/n_2}\).
Scenario 2: Variances of the TWO samples are NOT equal (Welch)
The standard error of \(\large \bar{x_1}-\bar{x_2}\) is used directly:
\[ \large SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}\]
Where: \(\large s_1^2\) and \(\large s_2^2\) are the sample variances of Sample 1 and Sample 2, respectively.
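A short R sketch of these estimates, continuing with the hypothetical vectors y1 and y2 from the earlier sketch:

```r
# Sample means, variances, and sizes
xbar1 <- mean(y1); xbar2 <- mean(y2)
s1sq  <- var(y1);  s2sq  <- var(y2)
n1    <- length(y1); n2  <- length(y2)

# Scenario 1: pooled standard deviation (equal variances)
s_pooled <- sqrt(((n1 - 1) * s1sq + (n2 - 1) * s2sq) / (n1 + n2 - 2))

# Scenario 2: standard error of the difference (unequal variances)
se_welch <- sqrt(s1sq / n1 + s2sq / n2)
```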
- Determine the test statistic and its distribution when \(\large H_O\) is correct, then calculate the value of the test statistic from the sample.
\[ \large t_{Cal} = \frac{(\bar{x_1}-\bar{x_2}) - (\mu_1 - \mu_2)} {s\sqrt{1/n_1+1/n_2}} \]
Under \(\large H_O\), \(\large \mu_1 - \mu_2 = 0\), so
\[ \large t_{Cal} = \frac{\bar{x_1}-\bar{x_2}} {s\sqrt{1/n_1+1/n_2}} \]
For Scenario 2 the denominator is the Welch standard error, \(\large \sqrt{s_1^2/n_1 + s_2^2/n_2}\).
- Note
  - The numerator is the difference between the Observed Difference and the Expected Difference under \(\large H_O\)
  - The test statistic represents the ratio of signal to error
  - The test statistic is centred and scaled
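Continuing the same sketch, the calculated test statistic under \(\large H_O\) for each scenario:

```r
# Scenario 1: pooled-variance t statistic
t_cal_pooled <- (xbar1 - xbar2) / (s_pooled * sqrt(1 / n1 + 1 / n2))

# Scenario 2: Welch t statistic (standard error used directly)
t_cal_welch  <- (xbar1 - xbar2) / se_welch
```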
Distribution of the test statistic
Scenario 1: Variances of TWO samples are equal
\[\large t \hspace{2mm} distribution \hspace{2mm} with \hspace{2mm} (n_1+n_2-2) \hspace{2mm} df \]
Scenario 2: Variances of TWO samples are NOT equal
The degrees of freedom are computed from the data (Welch-Satterthwaite approximation).
\[\large degrees \hspace{2mm} of \hspace{2mm} freedom = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2} {\frac{(s_1^2/n_1)^2}{n_1-1} + \frac{(s_2^2/n_2)^2}{n_2-1}} \]
\[\large t \hspace{2mm} distribution \hspace{2mm} with \hspace{2mm} computed \hspace{2mm} df \]
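Continuing the sketch, the degrees of freedom for the two scenarios:

```r
# Scenario 1: exact degrees of freedom
df_pooled <- n1 + n2 - 2

# Scenario 2: Welch-Satterthwaite approximation
df_welch <- (s1sq / n1 + s2sq / n2)^2 /
            ((s1sq / n1)^2 / (n1 - 1) + (s2sq / n2)^2 / (n2 - 1))
```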
- Obtain the probability under the distribution of the test statistic (two-tailed probability)
\[\large 2*pt(q = |t_{Cal}|, \space df, \space lower.tail=FALSE)\]
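Continuing the sketch, the corresponding two-tailed p-values:

```r
p_pooled <- 2 * pt(q = abs(t_cal_pooled), df = df_pooled, lower.tail = FALSE)
p_welch  <- 2 * pt(q = abs(t_cal_welch),  df = df_welch,  lower.tail = FALSE)
```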
- Compare the observed probability (p-value) with \(\large \alpha\) and conclude
Find the probability that \(\large t\) is greater than or equal to \(\large |t_{Cal}|\) from the \(\large t\) distribution with \(\large (n_1+n_2-2)\) degrees of freedom for Scenario 1, or with the computed (Welch-Satterthwaite) degrees of freedom for Scenario 2 (round down to an integer if using printed \(\large t\) tables), and multiply it by 2 for the two-tailed test. Reject \(\large H_O\) if this p-value is less than \(\large \alpha\); otherwise do not reject \(\large H_O\).
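Continuing the sketch, the decision at \(\large \alpha = 0.05\) can be cross-checked against R's built-in t.test(), which covers both scenarios:

```r
alpha <- 0.05
p_pooled < alpha   # TRUE -> reject H_O under the equal-variance test
p_welch  < alpha   # TRUE -> reject H_O under the Welch test

# Built-in versions of the two scenarios (should match the manual values)
t.test(y1, y2, var.equal = TRUE)    # Scenario 1: pooled t test
t.test(y1, y2, var.equal = FALSE)   # Scenario 2: Welch t test
```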