
In previous chapters, we assumed we knew the mathematical form of the probability distribution for the observations under each model; some of these distributions' parameters were not known, and we developed decision rules to deal with this uncertainty. A more difficult problem occurs when the mathematical form itself is not known precisely. For example, the data may be approximately Gaussian, containing slight departures from the ideal. More radically, so little may be known about an accurate model for the data that we are only willing to assume that they are distributed symmetrically about some value. We develop model evaluation algorithms in this section that tackle both kinds of problems. Be forewarned, however, that solutions to such general models come at a price: the more specific a model can be made that accurately describes a given problem, the better the performance. In other words, the more specific the model, the more the signal processing algorithms can be tailored to fit it, with the obvious result that we enhance the performance. However, if our specific model is in error, our neatly tailored algorithms can lead us drastically astray. Thus, the best approach is to relax those aspects of the model which seem doubtful and to develop algorithms that cope well with worst-case situations should they arise ("And they usually do," echoes every person experienced in the vagaries of data). These considerations lead us to consider nonparametric variations in the probability densities compatible with our assessment of model accuracy and to derive decision rules that minimize the impact of the worst-case situation.

Worst-case probability distributions

In model evaluation problems, there are "optimally" hard problems, those in which the models are the most difficult to distinguish. The impossible problem is to distinguish models that are identical. In this situation, the conditional densities of the observed data are equal and the likelihood ratio is constant for all possible values of the observations. It is obvious that identical models are indistinguishable; this elaboration suggests that, in terms of the likelihood ratio, hard problems are those in which the likelihood ratio is constant. Thus, "hard problems" are those in which the class of conditional probability densities has a constant ratio for wide ranges of observed data values.
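To fix notation for what follows (the symbol $\Lambda$ is our choice here for the likelihood ratio), the identical-model case degenerates completely:

$$\Lambda(r) = \frac{p_{r|\mathcal{M}_1}(r)}{p_{r|\mathcal{M}_0}(r)} = 1 \quad \text{for all } r,$$

so that no observation carries any information for discriminating between the models.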

The most relevant model evaluation problem for us is the discrimination between two models that differ only in the means of statistically independent observations: the conditional densities of each observation are related as $p_{r_l|\mathcal{M}_1}(r_l) = p_{r_l|\mathcal{M}_0}(r_l - m)$. Densities that would make this model evaluation problem hard would satisfy the functional equation

$$p(x - m) = C(m)\,p(x), \quad x \ge m,$$

where $C(m)$ is a quantity depending on the mean $m$ but not on the variable $x$.
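To make explicit why such densities yield a hard problem (a small step we add here, following directly from the functional equation), note that the per-observation likelihood ratio is constant whenever the observation exceeds $m$:

$$\Lambda(x) = \frac{p_{r_l|\mathcal{M}_1}(x)}{p_{r_l|\mathcal{M}_0}(x)} = \frac{p(x - m)}{p(x)} = C(m), \quad x \ge m.$$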

The uniform density does not satisfy this equation, as the domain of the function $p$ is assumed to be infinite.
For the probability densities satisfying this equation, any observed datum having a value greater than $m$ cannot be used to distinguish the two models. If one considers only those zero-mean densities $p$ which are symmetric about the origin, then by symmetry the likelihood ratio would also be constant for $x \le 0$. Hypotheses having these densities could only be distinguished when the observations lie in the interval $[0, m]$; such model evaluation problems are hard!
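The symmetry claim deserves one line of justification (added here for completeness). For $x \le 0$, write $y = -x \ge 0$ and use $p(-u) = p(u)$ together with the functional equation evaluated at the point $m + y \ge m$:

$$\Lambda(x) = \frac{p(x - m)}{p(x)} = \frac{p(m + y)}{p(y)} = \frac{1}{C(m)}, \quad x \le 0,$$

since $p(y) = p\bigl((m + y) - m\bigr) = C(m)\,p(m + y)$.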

From the functional equation, we see that the quantity $C(m)$ must be inversely proportional to $p(m)$: substituting $x = m$ into the equation gives $p(0) = C(m)\,p(m)$, so that $C(m) = p(0)/p(m)$. Incorporating this fact into our functional equation, we find that the only solution is the exponential function:

$$p(z - m) = \frac{p(0)}{p(m)}\,p(z) \implies p(z) \propto e^{-cz}, \quad z \ge 0,$$

for some constant $c > 0$. If we insist that the density satisfying the functional equation be symmetric, the solution is the so-called Laplacian (or double-exponential) density

$$p(z) = \frac{1}{\sqrt{2\sigma^2}}\, e^{-\sqrt{2}|z|/\sigma},$$

which has variance $\sigma^2$. When this density serves as the underlying density for our hard model-testing problem, the likelihood ratio has the form (Huber 1965; Huber 1981; Poor, pp. 175-187)

$$\Lambda(r_l) = \begin{cases} e^{-\sqrt{2}m/\sigma}, & r_l \le 0 \\ e^{\sqrt{2}(2r_l - m)/\sigma}, & 0 \le r_l \le m \\ e^{\sqrt{2}m/\sigma}, & m \le r_l. \end{cases}$$

Indeed, the likelihood ratio is constant over much of the range of values of $r_l$, implying that the two models are very similar over those ranges. This worst-case result will appear repeatedly as we embark on searching for the model evaluation rules that minimize the effect of modeling errors on performance.
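As a quick numerical sanity check (our addition, not part of the original text), the following Python sketch evaluates the likelihood ratio $p(r - m)/p(r)$ for the Laplacian density directly and compares it against the piecewise closed form above; the values of $m$ and $\sigma$ are arbitrary illustrative choices.

import numpy as np

def laplacian_pdf(z, sigma=1.0):
    """Laplacian (double-exponential) density with variance sigma**2."""
    return np.exp(-np.sqrt(2.0) * np.abs(z) / sigma) / np.sqrt(2.0 * sigma**2)

def likelihood_ratio(r, m, sigma=1.0):
    """Likelihood ratio computed directly from its definition, p(r - m) / p(r)."""
    return laplacian_pdf(r - m, sigma) / laplacian_pdf(r, sigma)

def likelihood_ratio_piecewise(r, m, sigma=1.0):
    """The closed-form piecewise expression derived in the text."""
    c = np.sqrt(2.0) / sigma
    return np.where(r <= 0.0, np.exp(-c * m),
                    np.where(r >= m, np.exp(c * m),
                             np.exp(c * (2.0 * r - m))))

m, sigma = 1.0, 1.0
r = np.linspace(-3.0, 4.0, 1001)

# The direct ratio and the piecewise formula should agree everywhere.
assert np.allclose(likelihood_ratio(r, m, sigma),
                   likelihood_ratio_piecewise(r, m, sigma))

print("plateau for r <= 0:", likelihood_ratio(-2.0, m, sigma))  # exp(-sqrt(2) m / sigma)
print("plateau for r >= m:", likelihood_ratio(3.0, m, sigma))   # exp(+sqrt(2) m / sigma)

Running the check confirms the two constant plateaus $e^{\mp\sqrt{2}m/\sigma}$ for $r_l \le 0$ and $r_l \ge m$: outside the interval $[0, m]$, the data cannot help discriminate the models.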

Source: OpenStax, Statistical signal processing. OpenStax CNX. Dec 05, 2011. Download for free at http://cnx.org/content/col11382/1.1
