Neyman-Pearson criterion
$\max\{{P}_{D}\},\ \text{such that}\ {P}_{F}\le \alpha$
The maximization is over all decision rules (equivalently, over
all decision regions
${R}_{0}$ ,
${R}_{1}$ ).
Using different terminology, the Neyman-Pearson criterion selects the
most powerful test of size (not exceeding)
$\alpha$ .
Fortunately, the above optimization problem has
an explicit solution. This is given by the celebrated
Neyman-Pearson lemma , which we now state. To ease
the exposition, our initial statement of this result applies only to continuous random variables, and places a technical
condition on the densities. A more general statement is given later in the module.
Neyman-Pearson lemma: initial statement
Consider the test
$${H}_{0}: x\sim {f}_{0}(x)$$
$${H}_{1}: x\sim {f}_{1}(x)$$ where
${f}_{i}(x)$ is a density. Define
$\Lambda(x)=\frac{{f}_{1}(x)}{{f}_{0}(x)}$ , and assume that
$\Lambda(x)$ satisfies the condition that for each
$\eta \in \mathbb{R}$ ,
$\Lambda(x)$ takes on the value
$\eta$ with probability
zero under hypothesis
${H}_{0}$ . The solution to the Neyman-Pearson optimization problem
stated above
is given by
$$\Lambda(x)=\frac{{f}_{1}(x)}{{f}_{0}(x)}\underset{{H}_{0}}{\overset{{H}_{1}}{\gtrless}}\eta$$ where
$\eta$ is such that
$${P}_{F}=\int_{\{x:\Lambda(x)>\eta\}} {f}_{0}(x)\,d x=\alpha$$ If
$\alpha=0$ , then
$\eta=\infty$ . The optimal test is unique up to a set of
probability zero under
${H}_{0}$ and
${H}_{1}$ .
The optimal decision rule is called the
likelihood ratio
test .
$\Lambda(x)$ is the
likelihood ratio , and
$\eta$ is a
threshold . Observe that neither the likelihood
ratio nor the threshold depends on the
a priori probabilities
$\Pr({H}_{i})$ . They depend only on the conditional densities
${f}_{i}$ and the size constraint
$\alpha$ . The threshold can often
be solved for as a function of
$\alpha$ , as the next example
shows.
Continuing with the earlier example of testing
$\mathcal{N}(0,1)$ against $\mathcal{N}(1,1)$ ,
suppose we wish to design a Neyman-Pearson decision rule with size constraint
$\alpha$ . We have
$\Lambda(x)=\frac{\frac{1}{\sqrt{2\pi}}e^{-\frac{(x-1)^{2}}{2}}}{\frac{1}{\sqrt{2\pi}}e^{-\frac{x^{2}}{2}}}=e^{x-\frac{1}{2}}$
By taking the natural logarithm of both sides of the LRT and
rearranging terms, the decision rule is not changed, and we obtain
$$x\underset{{H}_{0}}{\overset{{H}_{1}}{\gtrless}}\ln \eta+\frac{1}{2}\equiv \gamma$$ Thus, the optimal rule is in fact a thresholding rule like the one
considered in that example. The false-alarm
probability was seen to be
$${P}_{F}=Q(\gamma)$$ Thus, we may express the value of
$\gamma$ required by the
Neyman-Pearson lemma in terms of
$\alpha$ :
$$\gamma=Q^{-1}(\alpha)$$
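The threshold calculation in this example can be checked numerically. The following is a minimal sketch, assuming the two hypotheses are $\mathcal{N}(0,1)$ and $\mathcal{N}(1,1)$ as in the likelihood ratio above, with $\alpha$ the size constraint, $\gamma$ the threshold on $x$, and $\eta$ the threshold on the likelihood ratio. The function names are illustrative and not part of the module; the Q-function is built from Python's standard `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def q_function(x):
    """Upper-tail probability Q(x) = Pr(Z > x) for Z ~ N(0, 1)."""
    return 1.0 - STD_NORMAL.cdf(x)


def q_inverse(p):
    """Inverse of Q: the x satisfying Q(x) = p, valid for 0 < p < 1."""
    return STD_NORMAL.inv_cdf(1.0 - p)


def np_threshold_unit_shift(alpha):
    """Return (gamma, eta) for testing N(0,1) against N(1,1) at size alpha.

    gamma is the threshold applied to x; eta is the equivalent threshold
    on the likelihood ratio, using gamma = ln(eta) + 1/2.
    """
    gamma = q_inverse(alpha)
    eta = math.exp(gamma - 0.5)
    return gamma, eta


if __name__ == "__main__":
    alpha = 0.05  # illustrative size constraint
    gamma, eta = np_threshold_unit_shift(alpha)
    print("gamma:", gamma)
    print("eta:", eta)
    print("achieved P_F:", q_function(gamma))
    # Under H1 the observation is N(1, 1), so the power is Q(gamma - 1).
    print("P_D:", q_function(gamma - 1.0))
```

Working on the scale of $x$ rather than $\Lambda(x)$ keeps the computation to a single inverse normal CDF evaluation.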
Sufficient statistics and monotonic transformations
For hypothesis testing involving multiple or
vector-valued data, direct evaluation of the size (
${P}_{F}$ ) and power
(
${P}_{D}$ )
of a Neyman-Pearson decision rule would require integration over multi-dimensional, and potentially complicated, decision
regions. In many cases, however, this can be avoided by simplifying the LRT to a test of the form
$$t\underset{{H}_{0}}{\overset{{H}_{1}}{\gtrless}}\gamma$$ where the test statistic
$t=T(x)$ is a
sufficient
statistic for the data. Such a simplified form is
arrived at by modifying both sides of the LRT with monotonically increasing transformations, and by algebraic
simplifications. Since the modifications do not change the decision rule, we may calculate
${P}_{F}$ and
${P}_{D}$ in terms of the sufficient statistic. For
example, the false-alarm probability may be written
${P}_{F}=\Pr(\text{declare } {H}_{1})=\int_{\gamma}^{\infty} {f}_{0}(t)\,d t$
where
${f}_{0}(t)$ denotes the density of
$t$ under
${H}_{0}$ . Since
$t$ is typically of lower dimension than
$x$ , evaluation of
${P}_{F}$ and
${P}_{D}$ can be greatly simplified. The key is being able to reduce the LRT to a threshold test involving a sufficient statistic
for which we know the distribution .
Common variances, uncommon means
Let's design a Neyman-Pearson decision rule
of size
$\alpha$ for the
problem
$${H}_{0}: x\sim \mathcal{N}(0, \sigma^{2}I)$$
$${H}_{1}: x\sim \mathcal{N}(\mu 1, \sigma^{2}I)$$ where
$\mu> 0$ ,
$\sigma^{2}> 0$ are known,
$0=\left(\begin{array}{c}0\\ \vdots \\ 0\end{array}\right)$ ,
$1=\left(\begin{array}{c}1\\ \vdots \\ 1\end{array}\right)$ are
$N$ -dimensional
vectors, and
$I$ is the
$N\times N$ identity
matrix. The likelihood ratio is
$\Lambda(x)=\frac{\prod_{n=1}^{N} \frac{1}{\sqrt{2\pi \sigma^{2}}}e^{-\frac{({x}_{n}-\mu)^{2}}{2\sigma^{2}}}}{\prod_{n=1}^{N} \frac{1}{\sqrt{2\pi \sigma^{2}}}e^{-\frac{{x}_{n}^{2}}{2\sigma^{2}}}}=\frac{e^{-\sum_{n=1}^{N} \frac{({x}_{n}-\mu)^{2}}{2\sigma^{2}}}}{e^{-\sum_{n=1}^{N} \frac{{x}_{n}^{2}}{2\sigma^{2}}}}=e^{\frac{1}{2\sigma^{2}}\sum_{n=1}^{N} (2{x}_{n}\mu-\mu^{2})}=e^{\frac{1}{\sigma^{2}}\left(-\frac{N\mu^{2}}{2}+\mu\sum_{n=1}^{N} {x}_{n}\right)}$
To simplify the test further we may apply the natural
logarithm and rearrange terms to obtain
$$t\equiv \sum_{n=1}^{N} {x}_{n}\underset{{H}_{0}}{\overset{{H}_{1}}{\gtrless}}\frac{\sigma^{2}}{\mu}\ln \eta+\frac{N\mu}{2}\equiv \gamma$$
We have used the assumption
$\mu> 0$ . If
$\mu< 0$ , then division by
$\mu$ is not a
monotonically increasing operation, and the inequalities would be reversed.
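As a sanity check on the algebra, the log-likelihood ratio can be computed both directly from the two Gaussian densities and from the sum $t$ alone. The sketch below assumes a common mean $\mu > 0$ under the alternative, variance $\sigma^2$, and likelihood-ratio threshold $\eta$; the function names and sample values are illustrative only.

```python
import math


def log_likelihood_ratio(x, mu, sigma2):
    """ln Lambda(x), computed directly from the two Gaussian log-densities.

    The normalizing constants are identical under both hypotheses and cancel.
    """
    log_f1 = sum(-((xn - mu) ** 2) / (2.0 * sigma2) for xn in x)
    log_f0 = sum(-(xn ** 2) / (2.0 * sigma2) for xn in x)
    return log_f1 - log_f0


def log_lr_from_statistic(t, n, mu, sigma2):
    """ln Lambda expressed through t = sum of the samples: (mu*t - n*mu^2/2) / sigma^2."""
    return (mu * t - n * mu ** 2 / 2.0) / sigma2


def gamma_from_eta(eta, n, mu, sigma2):
    """Threshold on t equivalent to threshold eta on Lambda (requires mu > 0)."""
    return sigma2 / mu * math.log(eta) + n * mu / 2.0


if __name__ == "__main__":
    x = [0.3, -1.2, 2.0, 0.7]   # illustrative data
    mu, sigma2, eta = 0.5, 2.0, 1.0
    t = sum(x)
    print(log_likelihood_ratio(x, mu, sigma2), log_lr_from_statistic(t, len(x), mu, sigma2))
    # Both forms of the test must reach the same decision.
    print(log_likelihood_ratio(x, mu, sigma2) > math.log(eta),
          t > gamma_from_eta(eta, len(x), mu, sigma2))
```

Because the data enter the likelihood ratio only through $t$, the $N$-dimensional test collapses to a scalar comparison.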
The test statistic
$t$ is
sufficient for the unknown
mean. To set the threshold
$\gamma$ , we write the
false-alarm probability (size) as
$${P}_{F}=\Pr(t> \gamma)=\int_{\gamma}^{\infty} {f}_{0}(t)\,d t$$ To evaluate
${P}_{F}$ , we need to know the density of
$t$ under
${H}_{0}$ . Fortunately,
$t$ is the sum of independent normal variates, so it is again normally
distributed. In particular, we have
$t=Ax$ , where
$A=1^T$ , so
$$t\sim \mathcal{N}(A0, A\sigma^{2}IA^T)=\mathcal{N}(0, N\sigma^{2})$$ under
${H}_{0}$ . Therefore, we may write
${P}_{F}$ in terms of the
Q-function as
$${P}_{F}=Q\left(\frac{\gamma}{\sqrt{N}\sigma}\right)$$ The threshold is thus determined by
$$\gamma=\sqrt{N}\sigma Q^{-1}(\alpha)$$ Under
${H}_{1}$ , we have
$$t\sim \mathcal{N}(A\mu 1, A\sigma^{2}IA^T)=\mathcal{N}(N\mu, N\sigma^{2})$$ and so the detection probability (power) is
$${P}_{D}=\Pr(t> \gamma)=Q\left(\frac{\gamma-N\mu}{\sqrt{N}\sigma}\right)$$ Writing
${P}_{D}$ as a function of
${P}_{F}$ , the ROC curve is given by
$${P}_{D}=Q\left(Q^{-1}({P}_{F})-\frac{\sqrt{N}\mu}{\sigma}\right)$$ The quantity
$\frac{\sqrt{N}\mu}{\sigma}$ is called the
signal-to-noise
ratio . As its name suggests, a larger SNR
corresponds to improved performance of the Neyman-Pearson decision rule.
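The ROC relation can be evaluated directly to see how the power grows with the SNR. This is a small sketch, assuming the SNR is $\sqrt{N}\mu/\sigma$ and that $P_D = Q(Q^{-1}(P_F) - \text{SNR})$; the helper names and the parameter values are illustrative.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def q_function(x):
    """Q(x) = Pr(Z > x) for a standard normal Z."""
    return 1.0 - STD_NORMAL.cdf(x)


def q_inverse(p):
    """Inverse Q-function; p must lie strictly between 0 and 1."""
    return STD_NORMAL.inv_cdf(1.0 - p)


def roc_point(p_f, n, mu, sigma):
    """Detection probability at false-alarm probability p_f.

    Uses P_D = Q(Q^{-1}(P_F) - sqrt(N) * mu / sigma).
    """
    snr = math.sqrt(n) * mu / sigma
    return q_function(q_inverse(p_f) - snr)


if __name__ == "__main__":
    # Trace a few ROC points for increasing sample sizes (illustrative values).
    for n in (1, 4, 16):
        points = [(p_f, roc_point(p_f, n, mu=1.0, sigma=1.0)) for p_f in (0.01, 0.05, 0.2)]
        print(n, points)
```

For a fixed $P_F$, increasing $N$ (or $\mu/\sigma$) raises the SNR and moves the ROC curve toward the upper-left corner.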
In the context of signal processing, the
foregoing problem may be viewed as the problem of detecting a constant (DC) signal in
additive white
Gaussian noise :
$${H}_{0}:{x}_{n}={w}_{n},\ n=1,\dots ,N$$
$${H}_{1}:{x}_{n}=A+{w}_{n},\ n=1,\dots ,N$$ where
$A$ is a known, fixed
amplitude, and
${w}_{n}\sim \mathcal{N}(0, \sigma^{2})$ . Here
$A$ corresponds
to the mean
$\mu$ in the
example.
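The DC-in-noise interpretation lends itself to a quick Monte Carlo check of the size and power formulas. The sketch below assumes independent noise samples with variance $\sigma^2$, a threshold $\gamma = \sqrt{N}\sigma Q^{-1}(\alpha)$ on the sum of the samples, and illustrative parameter values; the function names, trial count, and seed are arbitrary choices, not part of the module.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def theoretical_power(n, amplitude, sigma, alpha):
    """P_D = Q(Q^{-1}(alpha) - sqrt(N) * A / sigma) for the sum detector."""
    q_inv_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)
    return 1.0 - STD_NORMAL.cdf(q_inv_alpha - math.sqrt(n) * amplitude / sigma)


def simulate_dc_detector(n, amplitude, sigma, alpha, trials=20000, seed=0):
    """Estimate (P_F, P_D) of the sum detector by simulation.

    Each trial draws one noise-only record and one signal-plus-noise record
    of length n and compares the sum of each record with gamma.
    """
    rng = random.Random(seed)
    gamma = math.sqrt(n) * sigma * STD_NORMAL.inv_cdf(1.0 - alpha)
    false_alarms = 0
    detections = 0
    for _ in range(trials):
        t_noise = sum(rng.gauss(0.0, sigma) for _ in range(n))
        t_signal = sum(amplitude + rng.gauss(0.0, sigma) for _ in range(n))
        if t_noise > gamma:
            false_alarms += 1
        if t_signal > gamma:
            detections += 1
    return false_alarms / trials, detections / trials


if __name__ == "__main__":
    pf_hat, pd_hat = simulate_dc_detector(n=10, amplitude=0.5, sigma=1.0, alpha=0.1)
    print("empirical P_F:", pf_hat, "target:", 0.1)
    print("empirical P_D:", pd_hat, "theory:", theoretical_power(10, 0.5, 1.0, 0.1))
```

The empirical rates should approach the theoretical values as the number of trials grows, up to ordinary sampling fluctuation.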