In general, without changing the sample size or the type of the test of the hypothesis, a decrease in $\alpha $ causes an increase in $\beta $ , and a decrease in $\beta $ causes an increase in $\alpha $ . Both probabilities $\alpha $ and $\beta $ of the two types of errors can be decreased only by increasing the sample size or, in some way, constructing a better test of the hypothesis.
If n =100 and we desire a test with significance level $\alpha $ =0.05, then $\alpha =P\left(\overline{X}\ge c;\mu =60\right)=0.05$ means, since $\overline{X}$ is $\text{N(}\mu \text{,100/100=1)}$ ,
$$P\left(\frac{\overline{X}-60}{1}\ge \frac{c-60}{1};\mu =60\right)=0.05$$ and $c-60=1.645$ . Thus c =61.645. The power function is
$$K\left(\mu \right)=P\left(\overline{X}\ge 61.645;\mu \right)=P\left(\frac{\overline{X}-\mu}{1}\ge \frac{61.645-\mu}{1};\mu \right)=1-\Phi \left(61.645-\mu \right).$$
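As a numerical check of the critical value and power function just derived, here is a minimal sketch using only Python's standard library. The function names `critical_value` and `power` are illustrative choices, not part of the original text. The sketch assumes the setup above: $\sigma =10$, n =100, and a right-tailed critical region $\overline{x}\ge c$.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution, N(0, 1)

def critical_value(mu0, sigma, n, alpha):
    """Cutoff c such that P(Xbar >= c; mu = mu0) = alpha (right-tailed test)."""
    return mu0 + Z.inv_cdf(1 - alpha) * sigma / n ** 0.5

def power(mu, c, sigma, n):
    """K(mu) = P(Xbar >= c; mu) = 1 - Phi((c - mu) / (sigma / sqrt(n)))."""
    return 1 - Z.cdf((c - mu) / (sigma / n ** 0.5))

c = critical_value(mu0=60, sigma=10, n=100, alpha=0.05)
alpha_check = power(60, c, sigma=10, n=100)    # should reproduce alpha
beta_at_65 = 1 - power(65, c, sigma=10, n=100)  # type II error probability at mu = 65

print(f"c = {c:.3f}, alpha = {alpha_check:.4f}, beta(65) = {beta_at_65:.6f}")
```

The computed cutoff agrees with 61.645 to three decimal places, and the type II error probability at $\mu $ =65 is well below 0.001.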
In particular, this means that $\beta $ at $\mu $ =65 is $$\beta =1-K\left(65\right)=\Phi \left(61.645-65\right)=\Phi \left(-3.355\right)\approx 0;$$ so, with n =100, both $\alpha $ and $\beta $ have decreased from their respective original values of 0.1587 and 0.0668 when n =25. Rather than guess at the value of n , we can let the desired power function determine the sample size. Let us use a critical region of the form $\overline{x}\ge c$ . Further, suppose that we want $\alpha $ =0.025 and, when $\mu $ =65, $\beta $ =0.05. Thus, since $\overline{X}$ is $\text{N(}\mu \text{,100/n)}$ ,
$$0.025=P\left(\overline{X}\ge c;\mu =60\right)=1-\Phi \left(\frac{c-60}{10/\sqrt{n}}\right)$$ and $$0.05=1-P\left(\overline{X}\ge c;\mu =65\right)=\Phi \left(\frac{c-65}{10/\sqrt{n}}\right).$$
That is, $\frac{c-60}{10/\sqrt{n}}=1.96$ and $\frac{c-65}{10/\sqrt{n}}=-1.645$ .
Solving these equations simultaneously for c and $10/\sqrt{n}$ , we obtain $$c=60+1.96\frac{5}{3.605}=62.718;$$ $$\frac{10}{\sqrt{n}}=\frac{5}{3.605}.$$
Thus, $\sqrt{n}=7.21$ and $n=51.98$ . Since n must be an integer, we would use n =52 and obtain $\alpha $ =0.025 and $\beta $ =0.05, approximately.
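The two simultaneous equations can also be solved in closed form: subtracting them gives $10/\sqrt{n}=\left(65-60\right)/\left({z}_{\alpha}+{z}_{\beta}\right)$, where ${z}_{\alpha}$ and ${z}_{\beta}$ are the upper-tail standard normal quantiles (1.96 and 1.645 here). The sketch below carries out that calculation; the function name `sample_size` and its parameter names are assumptions made for illustration.

```python
from math import ceil
from statistics import NormalDist

def sample_size(mu0, mu1, sigma, alpha, beta):
    """Solve (c - mu0)/se = z_alpha and (c - mu1)/se = -z_beta for se and c,
    where se = sigma / sqrt(n) and the critical region is xbar >= c."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)  # 1.96 for alpha = 0.025
    z_beta = z.inv_cdf(1 - beta)    # 1.645 for beta = 0.05
    se = (mu1 - mu0) / (z_alpha + z_beta)  # required standard error of Xbar
    n_exact = (sigma / se) ** 2            # non-integer solution for n
    c = mu0 + z_alpha * se                 # matching critical value
    return n_exact, ceil(n_exact), c

n_exact, n, c = sample_size(mu0=60, mu1=65, sigma=10, alpha=0.025, beta=0.05)
print(f"n (exact) = {n_exact:.2f}, n (rounded up) = {n}, c = {c:.3f}")
```

Rounding n up rather than to the nearest integer is a deliberate choice in this sketch: it ensures the standard error is no larger than the one required, so neither error probability exceeds its target.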
For a number of years there has been another value associated with a statistical test, and most statistical computer programs automatically print this out; it is called the probability value or, for brevity, p -value . The p -value associated with a test is the probability that we obtain the observed value of the test statistic or a value that is more extreme in the direction of the alternative hypothesis, calculated when ${\text{H}}_{\text{0}}$ is true. Rather than select the critical region ahead of time, the p -value of a test can be reported and the reader then makes a decision.
Say we are testing ${\text{H}}_{\text{0}}\text{:}\mu \text{=60}$ against ${\text{H}}_{\text{1}}\text{:}\mu >\text{60}$ with a sample mean $\overline{X}$ based on n =52 observations. Suppose that we obtain the observed sample mean of $\overline{x}=62.75$ . If we compute the probability of obtaining an $\overline{x}$ of 62.75 or greater when $\mu $ =60, then we obtain the p -value associated with $\overline{x}=62.75$ . That is,
$$\begin{array}{l}\text{p-value}=P\left(\overline{X}\ge 62.75;\mu =60\right)=P\left(\frac{\overline{X}-60}{10/\sqrt{52}}\ge \frac{62.75-60}{10/\sqrt{52}};\mu =60\right)\\ =1-\Phi \left(\frac{62.75-60}{10/\sqrt{52}}\right)=1-\Phi \left(1.983\right)=0.0237.\end{array}$$
If this p -value is small, we tend to reject the hypothesis ${\text{H}}_{\text{0}}\text{:}\mu \text{=60}$ . For example, rejection of ${\text{H}}_{\text{0}}\text{:}\mu \text{=60}$ if the p -value is less than or equal to 0.025 is exactly the same as rejection if $\overline{x}\ge 62.718$ . That is, $\overline{x}=62.718$ has a p -value of 0.025. To help keep the definition of p -value in mind, we note that it can be thought of as that tail-end probability , under ${\text{H}}_{\text{0}}$ , of the distribution of the statistic, here $\overline{X}$ , beyond the observed value of the statistic. See Figure 1 for the p -value associated with $\overline{x}=62.75$ .
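The p -value calculation for this right-tailed test can be reproduced with a few lines of Python. This is a sketch under the assumptions of the example ( $\sigma =10$ , n =52, ${\text{H}}_{\text{0}}\text{:}\mu \text{=60}$ ); the function name `p_value_right` is hypothetical.

```python
from statistics import NormalDist

def p_value_right(xbar, mu0, sigma, n):
    """P(Xbar >= xbar; mu = mu0) for a normal sample mean with known sigma."""
    z = (xbar - mu0) / (sigma / n ** 0.5)  # standardized test statistic
    return 1 - NormalDist().cdf(z)         # upper-tail probability under H0

p_observed = p_value_right(62.75, mu0=60, sigma=10, n=52)
p_at_cutoff = p_value_right(62.718, mu0=60, sigma=10, n=52)

print(f"p-value at 62.75  = {p_observed:.4f}")
print(f"p-value at 62.718 = {p_at_cutoff:.4f}")
```

The second call illustrates the equivalence stated above: the critical value 62.718 corresponds to a p -value of about 0.025, so any larger sample mean gives a smaller p -value.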
Suppose that in the past, a golfer’s scores have been (approximately) normally distributed with mean $\mu $ =90 and ${\sigma}^{2}$ =9. After taking some lessons, the golfer has reason to believe that the mean $\mu $ has decreased. (We assume that ${\sigma}^{2}$ is still about 9.) To test the null hypothesis ${\text{H}}_{\text{0}}\text{:}\mu \text{=90}$ against the alternative hypothesis ${\text{H}}_{\text{1}}\text{:}\mu <\text{90}$ , the golfer plays 16 games, computing the sample mean $\overline{x}$ . If $\overline{x}$ is small, say $\overline{x}\le c$ , then ${H}_{0}$ is rejected and ${H}_{1}$ accepted; that is, it seems as if the mean $\mu $ has actually decreased after the lessons. If c =88.5, then the power function of the test is
$$K\left(\mu \right)=P\left(\overline{X}\le 88.5;\mu \right)=P\left(\frac{\overline{X}-\mu}{3/4}\le \frac{88.5-\mu}{3/4};\mu \right)=\Phi \left(\frac{88.5-\mu}{3/4}\right).$$
Here 3/4 is the standard deviation of $\overline{X}$ , because 9/16 is the variance of $\overline{X}$ . In particular, $$\alpha =K\left(90\right)=\Phi \left(-2\right)=1-0.9772=0.0228.$$
If, in fact, the true mean is equal to $\mu $ =88 after the lessons, the power is $K\left(88\right)=\Phi \left(2/3\right)=0.7475$ . If $\mu $ =87, then $K\left(87\right)=\Phi \left(2\right)=0.9772$ . An observed sample mean of $\overline{x}=88.25$ has a
$$\text{p-value}=P\left(\overline{X}\le 88.25;\mu =90\right)=\Phi \left(\frac{88.25-90}{3/4}\right)=\Phi \left(-\frac{7}{3}\right)=0.0098,$$
and this would lead to a rejection at $\alpha $ =0.0228 (or even $\alpha $ =0.01).
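The numbers in the golfer example can be verified the same way. Because the alternative is that the mean has decreased, both the power and the p -value are lower-tail probabilities. The names `power_left` and `p_value_left` below are illustrative assumptions, and the sketch takes $\sigma =3$ , n =16, and c =88.5 from the example.

```python
from statistics import NormalDist

SE = 3 / 16 ** 0.5  # standard deviation of Xbar: sigma / sqrt(n) = 3/4

def power_left(mu, c=88.5, se=SE):
    """K(mu) = P(Xbar <= c; mu) for the left-tailed test."""
    return NormalDist().cdf((c - mu) / se)

def p_value_left(xbar, mu0=90, se=SE):
    """P(Xbar <= xbar; mu = mu0), the lower-tail probability under H0."""
    return NormalDist().cdf((xbar - mu0) / se)

alpha = power_left(90)      # significance level, K(90)
power_88 = power_left(88)   # power if the true mean is 88
power_87 = power_left(87)   # power if the true mean is 87
p = p_value_left(88.25)     # p-value for the observed mean 88.25

print(f"alpha = {alpha:.4f}, K(88) = {power_88:.4f}, K(87) = {power_87:.4f}, p = {p:.4f}")
```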