# Basic elements of statistical decision theory and statistical  (Page 2/5)

In decision-making problems, we know the value of the observation $X$ but do not know the value of $y$ . Therefore, it is appealing to consider the conditional density or pmf as a function of the unknown value $y$ , with $X$ fixed at its observed value. The resulting function is called the likelihood function. As the name suggests, values of $y$ where the likelihood function is largest are intuitively reasonable indicators of the true value of the unknown quantity, which we will denote by ${y}^{*}$ . The rationale is that these values produce conditional densities or pmfs that place high probability on the observation $X=x$ .

The Maximum Likelihood Estimator (MLE) is defined to be the value of $y$ that maximizes the likelihood function; i.e., in the continuous case

$\hat{y}(X) = \arg\max_{y}\; p_{X|Y}(X|y)$

with an analogous definition for the discrete case, obtained by replacing the conditional density with the conditional pmf. The decision rule $\hat{y}(X)$ is called an “estimator,” a term commonly used in decision problems involving a continuous parameter. Note that maximizing the likelihood function is equivalent to minimizing the negative log-likelihood function (since the logarithm is a monotonically increasing transformation). Now let ${y}^{*}$ denote the true value of $Y$ . Then we can view the negative log-likelihood as a loss function

$\ell_{L}(y, y^{*}) = -\log p_{X|Y}(X|y)$

where the dependence on ${y}^{*}$ on the left-hand side enters the right-hand side through the observation $X$ , which is distributed according to $p_{X|Y}(x|y^{*})$ . An interesting special case of the MLE results when the conditional density $p_{X|Y}(X|y)$ is a Gaussian with mean $y$ and fixed variance, in which case the negative log-likelihood corresponds, up to a positive scale factor and an additive constant, to a squared error loss function.
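As a rough illustration of the MLE and of the Gaussian special case, the sketch below assumes that the observation consists of i.i.d. samples from a Gaussian with unknown mean $y$ and known standard deviation `sigma`. The i.i.d. model, the grid search, and all names (`neg_log_likelihood`, `mle_grid`, `y_star`) are assumptions made for this example, not part of the original text. Under this model the negative log-likelihood is a sum of squared errors plus a constant, so its minimizer should agree with the sample mean up to the grid resolution.

```python
import math
import random

def neg_log_likelihood(y, xs, sigma=1.0):
    """Negative log-likelihood of i.i.d. N(y, sigma^2) observations xs."""
    n = len(xs)
    const = 0.5 * n * math.log(2.0 * math.pi * sigma ** 2)
    # Apart from the constant, this is a scaled sum of squared errors.
    return const + sum((x - y) ** 2 for x in xs) / (2.0 * sigma ** 2)

def mle_grid(xs, lo, hi, steps=2001, sigma=1.0):
    """Return the grid point in [lo, hi] that minimizes the negative log-likelihood."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda y: neg_log_likelihood(y, xs, sigma))

random.seed(0)
y_star = 1.5  # hypothetical true value, chosen only for this example
xs = [random.gauss(y_star, 1.0) for _ in range(200)]

y_hat = mle_grid(xs, -5.0, 5.0)
sample_mean = sum(xs) / len(xs)
print("grid-search MLE:", y_hat)
print("sample mean:    ", sample_mean)
```

A grid search is used only to mirror the arg max definition directly; for this Gaussian model the sample mean gives the same answer in closed form.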

Now let us consider the expectation of this loss with respect to the conditional distribution $p_{X|Y}(x|y^{*})$ , that is, the distribution of the observation under the true value:

$-E\left[\log p_{X|Y}(X|y)\right] = \int \log\left(\frac{1}{p_{X|Y}(x|y)}\right) p_{X|Y}(x|y^{*})\,dx$

The true value ${y}^{*}$ minimizes the expected negative log-likelihood (or, equivalently, maximizes the expected log-likelihood). To see this, compare the expected log-likelihood of ${y}^{*}$ with that of any other value $y$ :

$\begin{array}{rcl} E\left[\log p_{X|Y}(X|y^{*}) - \log p_{X|Y}(X|y)\right] & = & E\left[\log\left(\frac{p_{X|Y}(X|y^{*})}{p_{X|Y}(X|y)}\right)\right] \\ & = & \int \log\left(\frac{p_{X|Y}(x|y^{*})}{p_{X|Y}(x|y)}\right) p_{X|Y}(x|y^{*})\,dx \\ & = & \text{KL}\left(p_{X|Y}(x|y^{*}),\, p_{X|Y}(x|y)\right) \end{array}$

The quantity $\text{KL}\left(p_{X|Y}(x|y^{*}),\, p_{X|Y}(x|y)\right)$ is called the Kullback-Leibler (KL) divergence between the conditional density functions $p_{X|Y}(x|y^{*})$ and $p_{X|Y}(x|y)$ . The KL divergence is non-negative, and it is zero if and only if the two densities are equal [link] . Hence the expected log-likelihood of ${y}^{*}$ is never smaller than that of any other $y$ , and the KL divergence acts as a sort of risk function in the context of Maximum Likelihood Estimation.
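The non-negativity of the KL divergence can be checked numerically in a simple case. The sketch below assumes Gaussian conditional densities with mean $y$ and a common known standard deviation `sigma`, and approximates the integral with the trapezoidal rule; the model, the integration range, and the function names are choices made for this example only. For this model the divergence has the closed form $(y^{*}-y)^{2}/(2\sigma^{2})$, which gives a reference value for comparison.

```python
import math

def gaussian_log_pdf(x, mean, sigma=1.0):
    """log of the N(mean, sigma^2) density at x."""
    return -0.5 * ((x - mean) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))

def kl_divergence(y_star, y, sigma=1.0, half_width=12.0, steps=20001):
    """Approximate KL(p(.|y_star), p(.|y)) with the trapezoidal rule."""
    lo = y_star - half_width * sigma
    hi = y_star + half_width * sigma
    h = (hi - lo) / (steps - 1)
    total = 0.0
    for i in range(steps):
        x = lo + i * h
        log_p = gaussian_log_pdf(x, y_star, sigma)
        log_q = gaussian_log_pdf(x, y, sigma)
        weight = 0.5 if i in (0, steps - 1) else 1.0
        # Integrand: log(p/q) weighted by the density under the true value.
        total += weight * math.exp(log_p) * (log_p - log_q)
    return total * h

y_star = 0.0  # hypothetical true value
for y in (0.0, 0.5, 1.0, 2.0):
    closed_form = (y_star - y) ** 2 / 2.0
    print(y, kl_divergence(y_star, y), closed_form)
```

The log ratio is computed as a difference of log densities rather than as the log of a quotient, which avoids dividing by a density that underflows far from its mean.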

## The Cramer-Rao lower bound

The MLE is based on finding the value for $Y$ that maximizes the likelihood function. Intuitively, the more distinct the maximum point is, say a well-isolated peak in the likelihood function, the easier it will be to distinguish the MLE from alternative decisions. Consider the case in which $Y$ is a scalar quantity. The “peakiness” of the log-likelihood function can be gauged by examining its curvature, $-\frac{\partial^{2} \log p_{X|Y}(x|y)}{\partial y^{2}}$ , at the point of maximum likelihood. The higher the curvature, the more peaked the likelihood function is at the maximum point. Of course, we hope that the MLE will be a good predictor (decision) for the unknown true value ${y}^{*}$ . So, rather than looking at the curvature of the log-likelihood function at the maximum likelihood point, a more appropriate measure of how easy it will be to distinguish ${y}^{*}$ from the alternatives is the expected curvature of the log-likelihood function evaluated at the value ${y}^{*}$ . The expectation is taken over all possible observations with respect to the conditional density $p_{X|Y}(x|y^{*})$ . This quantity, denoted $I(y^{*}) = E\left[-\frac{\partial^{2} \log p_{X|Y}(X|y)}{\partial y^{2}}\right]\Big|_{y=y^{*}}$ , is called the Fisher Information (FI). In fact, the FI provides us with an important performance bound known as the Cramer-Rao Lower Bound (CRLB).
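The expected curvature that defines the Fisher Information can be estimated by simulation. The sketch below assumes, purely for illustration, that $X$ is Gaussian with mean zero and unknown standard deviation $y$ , so that the curvature genuinely depends on the observation. It approximates the second derivative by central differences and averages over draws from $p_{X|Y}(x|y^{*})$ . The model, step size `h`, sample size, and function names are all assumptions of this example. For this model the exact value is $2/{y^{*}}^{2}$ , which is used as a reference.

```python
import math
import random

def log_density(x, y):
    """log p(x | y) for X ~ N(0, y^2), where y > 0 is the unknown standard deviation."""
    return -math.log(y) - 0.5 * math.log(2.0 * math.pi) - x * x / (2.0 * y * y)

def curvature(x, y, h=1e-3):
    """Negative second derivative of log p(x | y) with respect to y (central differences)."""
    second = (log_density(x, y + h) - 2.0 * log_density(x, y) + log_density(x, y - h)) / (h * h)
    return -second

def fisher_information_mc(y_star, n=50000, seed=0):
    """Monte Carlo average of the curvature at y_star over draws from p(x | y_star)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, y_star)
        total += curvature(x, y_star)
    return total / n

y_star = 2.0  # hypothetical true value
estimate = fisher_information_mc(y_star)
exact = 2.0 / y_star ** 2
print("Monte Carlo estimate:", estimate)
print("closed form 2/y*^2:  ", exact)
```

Individual draws can have negative curvature at ${y}^{*}$ in this model; only the average over observations is guaranteed to be non-negative, which is why the expectation, not the curvature for one observation, is the quantity of interest.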
