In Chapter 4, we defined the t statistic to be
$$t = \frac{\text{Difference of sample means}}{\text{Standard error of difference of sample means}}$$
then computed its value for the data observed in an experiment. Next, we compared the result with the value tα that defined the most extreme 100α percent of the possible values of t that would occur (in both tails) if the two samples were drawn from a single population. If the magnitude of the observed value of t exceeded tα (given in Table 4-1), we reported a “statistically significant” difference, with P < α. As Figure 4-4 showed, in this case the distribution of possible values of t has a mean of zero and is symmetric about zero.
On the other hand, if the two samples are drawn from populations with different means, the distribution of values of t associated with all possible experiments involving two samples of a given size is not centered on zero; it does not follow the t distribution. As Figures 6-3 and 6-5 showed, the actual distribution of possible values of t has a nonzero mean that depends on the size of the treatment effect. It is possible to revise the definition of t so that it will be distributed according to the t distribution in Figure 4-4 regardless of whether or not the treatment actually has an effect. This modified definition of t is
$$t = \frac{\text{Difference of sample means} - \text{True difference in population means}}{\text{Standard error of difference of sample means}}$$
Notice that if the hypothesis of no treatment effect is correct, the difference in population means is zero and this definition of t reduces to the one we used before. The equivalent mathematical statement is
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}}$$
In Chapter 4 we computed t from the observations, then compared it with the critical value for a “big” value of t with ν = n1 + n2 − 2 degrees of freedom to obtain a P value. Now, however, we cannot follow this approach since we do not know all the terms on the right side of the equation. Specifically, we do not know the true difference in mean values of the two populations from which the samples were drawn, μ1 − μ2. We can, however, use this equation to estimate the size of the treatment effect, μ1 − μ2.
Instead of using the equation to determine t, we will select an appropriate value of t and use the equation to estimate μ1 − μ2. The only problem is that of selecting an appropriate value for t.
By definition, 100α percent of all possible values of t are more negative than −tα or more positive than +tα. For example, only 5% of all possible t values will fall outside the interval between −t.05 and +t.05, where t.05 is the critical value of t that defines the most extreme 5% of the t distribution (tabulated in Table 4-1). Therefore, 100(1 − α) percent of all possible values of t fall between −tα and +tα. For example, 95% of all possible values of t will fall between −t.05 and +t.05.
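The claim that 95% of all possible values of t fall between −t.05 and +t.05 can be checked by simulation, including for the modified definition of t when the treatment has an effect. The following Python sketch is illustrative only. The population means, standard deviation, sample size, and function names are assumptions chosen for the example. The standard error is computed from a pooled variance estimate, which is assumed here to match the Chapter 4 definition. The value 2.101 is the standard two-tailed 5% critical value of t for ν = 10 + 10 − 2 = 18 degrees of freedom.

```python
import math
import random


def pooled_se(x1, x2):
    """Standard error of the difference of two sample means.

    Assumption: the two sample variances are pooled, weighted by their
    degrees of freedom.
    """
    n1, n2 = len(x1), len(x2)
    m1, m2 = sum(x1) / n1, sum(x2) / n2
    ss1 = sum((v - m1) ** 2 for v in x1)  # sum of squared deviations, sample 1
    ss2 = sum((v - m2) ** 2 for v in x2)  # sum of squared deviations, sample 2
    s2_pooled = (ss1 + ss2) / (n1 + n2 - 2)
    return math.sqrt(s2_pooled * (1 / n1 + 1 / n2))


def simulate_modified_t(n=10, mu1=0.0, mu2=1.0, sigma=1.0,
                        trials=20000, t_crit=2.101, seed=1):
    """Draw many pairs of samples and compute the modified t for each.

    All parameter values are hypothetical. The populations are normal
    with different means, so the treatment has a real effect.
    Returns (mean of the t values, fraction lying between -t_crit and +t_crit).
    """
    rng = random.Random(seed)
    true_diff = mu1 - mu2  # known here only because we built the populations
    t_values = []
    for _ in range(trials):
        x1 = [rng.gauss(mu1, sigma) for _ in range(n)]
        x2 = [rng.gauss(mu2, sigma) for _ in range(n)]
        observed_diff = sum(x1) / n - sum(x2) / n
        t_values.append((observed_diff - true_diff) / pooled_se(x1, x2))
    mean_t = sum(t_values) / trials
    fraction_inside = sum(-t_crit < t < t_crit for t in t_values) / trials
    return mean_t, fraction_inside


if __name__ == "__main__":
    mean_t, fraction_inside = simulate_modified_t()
    print(f"mean of t: {mean_t:.3f}")
    print(f"fraction between -t.05 and +t.05: {fraction_inside:.3f}")
```

With these settings, the mean of the simulated t values should come out close to zero and roughly 95% of them should fall between −2.101 and +2.101, even though the two population means differ.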
Every different pair of random samples we draw in our experiment will be associated with different values of X¯1 − X¯2 and sX¯1−X¯2, and 100(1 − α) percent of all possible experiments involving samples of a given size will yield values of t that fall between −tα and +tα. Therefore, for 100(1 − α) percent of all possible experiments,
$$-t_\alpha < \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}} < +t_\alpha$$
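Solve this inequality for the true difference in population means, μ1 − μ2, by multiplying through by the standard error, subtracting the observed difference in sample means, and multiplying by −1 (which reverses the direction of the inequalities):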
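$$(\bar{X}_1 - \bar{X}_2) - t_\alpha \, s_{\bar{X}_1 - \bar{X}_2} < \mu_1 - \mu_2 < (\bar{X}_1 - \bar{X}_2) + t_\alpha \, s_{\bar{X}_1 - \bar{X}_2}$$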
In other words, for 100(1 − α) percent of all possible experiments, the actual difference of the means of the two populations from which the samples were drawn will fall within tα standard errors (of the difference of the sample means) of the observed difference in the sample means. (tα has ν = n1 + n2 − 2 degrees of freedom, just as when we used the t distribution in hypothesis testing.) This range is called the 100(1 − α) percent confidence interval for the difference of the means. For example, the 95% confidence interval for the true difference of the population means is
$$(\bar{X}_1 - \bar{X}_2) - t_{.05} \, s_{\bar{X}_1 - \bar{X}_2} < \mu_1 - \mu_2 < (\bar{X}_1 - \bar{X}_2) + t_{.05} \, s_{\bar{X}_1 - \bar{X}_2}$$
This equation defines the range that will include the true difference in the means for 95% of all possible experiments that involve drawing samples from the two populations under study.
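To make the calculation concrete, the following Python sketch computes the confidence interval for the difference of two means from raw observations. It is a minimal illustration under stated assumptions, not a definitive implementation. The function name and the two small samples are invented for the example, the standard error is assumed to come from a pooled variance estimate, and the critical value is passed in by the caller after being read from a table of the t distribution such as Table 4-1. For two samples of 5 observations each, ν = 5 + 5 − 2 = 8 and the standard two-tailed 5% critical value is 2.306.

```python
import math


def confidence_interval_diff(x1, x2, t_crit):
    """Confidence interval for the difference of two population means.

    x1, x2 : lists of observations from the two samples
    t_crit : critical value of t for nu = n1 + n2 - 2 degrees of freedom
             at the desired confidence level, looked up by the caller
    Assumption: the standard error uses a pooled variance estimate.
    Returns (lower limit, upper limit).
    """
    n1, n2 = len(x1), len(x2)
    mean1, mean2 = sum(x1) / n1, sum(x2) / n2
    ss1 = sum((v - mean1) ** 2 for v in x1)  # squared deviations, sample 1
    ss2 = sum((v - mean2) ** 2 for v in x2)  # squared deviations, sample 2
    s2_pooled = (ss1 + ss2) / (n1 + n2 - 2)
    se_diff = math.sqrt(s2_pooled * (1 / n1 + 1 / n2))
    observed_diff = mean1 - mean2
    return observed_diff - t_crit * se_diff, observed_diff + t_crit * se_diff


if __name__ == "__main__":
    # Made-up data for illustration only.
    sample1 = [5, 6, 7, 8, 9]
    sample2 = [2, 3, 4, 5, 6]
    lower, upper = confidence_interval_diff(sample1, sample2, t_crit=2.306)
    print(f"95% confidence interval: {lower:.3f} to {upper:.3f}")
```

For these made-up samples the observed difference in means is 3 and the standard error of the difference is 1, so the 95% confidence interval runs from about 0.69 to about 5.31.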
Since this procedure to compute the confidence interval for the difference of two means uses the t distribution, it is subject to the same limitations as the t test. In particular, the samples must be drawn from populations that follow a normal distribution at least approximately.