Random variables whose spaces are not composed of a countable number of points but are intervals or a union of intervals are said to be of the continuous type . Recall that the relative frequency histogram $h\left(x\right)$ associated with $n$ observations of a random variable of that type is a nonnegative function defined so that the total area between its graph and the $x$ axis equals one. In addition, $h\left(x\right)$ is constructed so that the integral
$$\underset{a}{\overset{b}{\int}}h\left(x\right)dx$$
is an estimate of the probability $P\left(a<X<b\right)$ , where the interval $\left(a,b\right)$ is a subset of the space $R$ of the random variable $X$ .
Let us now consider what happens to the function $h\left(x\right)$ in the limit, as $n$ increases without bound and as the lengths of the class intervals decrease to zero. It is to be hoped that $h\left(x\right)$ will become closer and closer to some function, say $f\left(x\right)$ , that gives the true probabilities , such as $P\left(a<X<b\right)$ , through the integral
$$P\left(a<X<b\right)={\displaystyle \underset{a}{\overset{b}{\int}}f\left(x\right)dx.}$$
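This limiting behavior can be illustrated with a short simulation. The sketch below is purely illustrative (the density $f(x)=2x$ on $(0,1)$ is an assumed example, not from the discussion above): it builds a relative frequency histogram from $n$ simulated observations and compares the bar heights with the density at the bin midpoints. As $n$ grows and the class intervals shrink, the agreement improves.

```python
import random

# Simulate n observations from the density f(x) = 2x on (0, 1).
# If U is uniform on (0, 1), then sqrt(U) has this density, since
# P(sqrt(U) <= x) = P(U <= x^2) = x^2, whose derivative is 2x.
random.seed(1)
n = 100_000
sample = [random.random() ** 0.5 for _ in range(n)]

# Relative frequency histogram h(x) with k class intervals on (0, 1);
# heights are scaled so the total area under h(x) equals one.
k = 20
counts = [0] * k
for x in sample:
    counts[min(int(x * k), k - 1)] += 1
heights = [c / n * k for c in counts]  # each bar has area count/n

# h(x) should be close to f(x) = 2x at each bin midpoint.
for i in (2, 10, 17):
    mid = (i + 0.5) / k
    print(f"h({mid:.3f}) = {heights[i]:.3f},  f({mid:.3f}) = {2 * mid:.3f}")
```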
Let the random variable X be the distance in feet between bad records on a used computer tape. Suppose that a reasonable probability model for X is given by the p.d.f.
$$f\left(x\right)=\frac{1}{40}{e}^{-x/40},\quad 0\le x<\infty .$$
The probability that the distance between bad records is greater than 40 feet is
$$P\left(X>40\right)={\displaystyle \underset{40}{\overset{\infty}{\int}}\frac{1}{40}{e}^{-x/40}dx}={e}^{-1}\approx 0.368.$$
The p.d.f. and the probability of interest are depicted in Figure 1 .
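The tail probability above can also be checked numerically. The sketch below (an illustrative check, not part of the original example) evaluates $P(X>40)$ both in closed form and by a midpoint Riemann sum over a long finite interval:

```python
import math

# P.d.f. from Example 1: f(x) = (1/40) * exp(-x/40), 0 <= x < infinity
def f(x):
    return math.exp(-x / 40) / 40

# Closed form: the integral from 40 to infinity equals e^(-1).
exact = math.exp(-1)

# Midpoint Riemann sum over [40, 2000]; the tail beyond 2000 feet
# contributes about e^(-50), which is negligible.
dx = 0.01
approx = sum(f(40 + (i + 0.5) * dx) * dx for i in range(int((2000 - 40) / dx)))

print(f"closed form: {exact:.4f}")   # 0.3679, which rounds to 0.368
print(f"Riemann sum: {approx:.4f}")
```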
To avoid repeated references to the space $R$ of the random variable $X$ , we shall adopt the same convention when describing probability density functions of the continuous type as was used in the discrete case.
Let us extend the definition of the p.d.f. $f\left(x\right)$ to the entire set of real numbers by letting it equal zero when $x$ does not belong to $R$ . For example,
$$f\left(x\right)=\{\begin{array}{ll}\frac{1}{40}{e}^{-x/40},& 0\le x<\infty ,\\ 0,& \text{elsewhere},\end{array}$$
has the properties of a p.d.f. of a continuous-type random variable $X$ having support $\left\{x:0\le x<\infty \right\}$ . It will always be understood that $f(x)=0$ when $x$ does not belong to $R$ , even when this is not explicitly written out.
Continuing with Example 1:
If the p.d.f. of X is
$$f\left(x\right)=\{\begin{array}{ll}0,& -\infty <x<0,\\ \frac{1}{40}{e}^{-x/40},& 0\le x<\infty ,\end{array}$$
then the distribution function of $X$ is $F\left(x\right)=0$ for $x\le 0$ and, for $x>0$ ,
$$F\left(x\right)={\displaystyle \underset{-\infty}{\overset{x}{\int}}f\left(t\right)dt={\displaystyle \underset{0}{\overset{x}{\int}}\frac{1}{40}{e}^{-t/40}dt=-{e}^{-t/40}{|}_{0}^{x}=1-{e}^{-x/40}}.}$$
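Once $F$ is available, interval probabilities no longer require a fresh integration each time. A minimal sketch using the distribution function just derived:

```python
import math

# Distribution function from the example: F(x) = 1 - exp(-x/40) for x > 0,
# and F(x) = 0 for x <= 0.
def F(x):
    return 1 - math.exp(-x / 40) if x > 0 else 0.0

# Interval probabilities come directly from differences of F:
# P(a < X <= b) = F(b) - F(a).
print(F(40))           # P(X <= 40) = 1 - e^(-1)
print(1 - F(40))       # P(X > 40)  = e^(-1), about 0.368
print(F(80) - F(40))   # P(40 < X <= 80) = e^(-1) - e^(-2)
```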
Note that $F\text{'}\left(0\right)$ does not exist. Since there are no steps or jumps in a distribution function $F\left(x\right)$ of the continuous type, it must be true that $$P\left(X=b\right)=0$$ for all real values of $b$ . This agrees with the fact that the integral
$$\underset{b}{\overset{b}{\int}}f\left(x\right)dx$$ is taken to be zero in calculus. Thus we see that $$P\left(a\le X\le b\right)=P\left(a<X<b\right)=P\left(a\le X<b\right)=P\left(a<X\le b\right)=F\left(b\right)-F\left(a\right),$$ provided that $X$ is a random variable of the continuous type. Moreover, we can change the definition of a p.d.f. of a random variable of the continuous type at a finite (actually countable) number of points without altering the distribution of probability.
For illustration, $$f\left(x\right)=\{\begin{array}{ll}0,& -\infty <x<0,\\ \frac{1}{40}{e}^{-x/40},& 0\le x<\infty ,\end{array}$$ and $$f\left(x\right)=\{\begin{array}{ll}0,& -\infty <x\le 0,\\ \frac{1}{40}{e}^{-x/40},& 0<x<\infty ,\end{array}$$
are equivalent in the computation of probabilities involving this random variable.
Let Y be a continuous random variable with the p.d.f. $g\left(y\right)=2y$ , $0<y<1$ . The distribution function of Y is defined by
$$G\left(y\right)=\{\begin{array}{ll}0,& y<0,\\ {\displaystyle \underset{0}{\overset{y}{\int}}2tdt={y}^{2},}& 0\le y<1,\\ 1,& y\ge 1.\end{array}$$
Figure 2 gives the graph of the p.d.f. $g\left(y\right)$ and the graph of the distribution function $G\left(y\right)$ .
For illustration of computations of probabilities, consider
$$P\left(\frac{1}{2}<Y\le \frac{3}{4}\right)=G\left(\frac{3}{4}\right)-G\left(\frac{1}{2}\right)={\left(\frac{3}{4}\right)}^{2}-{\left(\frac{1}{2}\right)}^{2}=\frac{5}{16}$$
and $$P\left(\frac{1}{4}\le Y<2\right)=G\left(2\right)-G\left(\frac{1}{4}\right)=1-{\left(\frac{1}{4}\right)}^{2}=\frac{15}{16}.$$
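These computations can be mirrored directly in code. A short sketch using the distribution function $G$ from this example:

```python
# Distribution function from the example: G(y) = y^2 on [0, 1),
# with G(y) = 0 below 0 and G(y) = 1 at and above 1.
def G(y):
    if y < 0:
        return 0.0
    if y >= 1:
        return 1.0
    return y * y

# P(1/2 < Y <= 3/4) = G(3/4) - G(1/2) = 9/16 - 4/16 = 5/16
print(G(0.75) - G(0.5))   # 0.3125

# P(1/4 <= Y < 2) = G(2) - G(1/4) = 1 - 1/16 = 15/16
# (endpoints may be included or excluded freely for continuous Y).
print(G(2) - G(0.25))     # 0.9375
```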
For random variables of the continuous type, the p.d.f. does not have to be bounded. The restriction is that the area between the p.d.f. and the x axis must equal one. Furthermore, it should be noted that the p.d.f. of a random variable X of the continuous type does not need to be a continuous function.
For example,
$$f\left(x\right)=\{\begin{array}{ll}\frac{1}{2},& 0<x<1\text{ or }2<x<3,\\ 0,& \text{elsewhere},\end{array}$$
enjoys the properties of a p.d.f. of a distribution of the continuous type, and yet $f\left(x\right)$ has discontinuities at $x=0,1,2,$ and 3. However, the distribution function associated with a distribution of the continuous type is always a continuous function. For continuous-type random variables, the definitions associated with mathematical expectation are the same as those in the discrete case, except that integrals replace summations.
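That the two-interval p.d.f. above really does bound unit area, despite its discontinuities, is easy to verify numerically (an illustrative check):

```python
# Piecewise-constant p.d.f.: f(x) = 1/2 on (0,1) and (2,3), 0 elsewhere.
def f(x):
    return 0.5 if (0 < x < 1) or (2 < x < 3) else 0.0

# Total area: two rectangles, each of width 1 and height 1/2,
# approximated here by a midpoint Riemann sum over [0, 4].
dx = 1e-4
area = sum(f((i + 0.5) * dx) * dx for i in range(int(4 / dx)))
print(round(area, 6))  # 1.0
```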
For illustration, let $X$ be a random variable with a p.d.f. $f\left(x\right)$ . The expected value of $X$ , or mean of $X$ , is
$$\mu =E\left(X\right)={\displaystyle \underset{-\infty}{\overset{\infty}{\int}}xf\left(x\right)dx.}$$ The variance of X is $${\sigma}^{2}=Var\left(X\right)={\displaystyle \underset{-\infty}{\overset{\infty}{\int}}{\left(x-\mu \right)}^{2}f\left(x\right)dx.}$$
The standard deviation of X is $$\sigma =\sqrt{Var\left(X\right)}.$$
For the random variable $Y$ in Example 3,
$$\mu =E\left(Y\right)={\displaystyle \underset{0}{\overset{1}{\int}}y\left(2y\right)dy={\left[\left(\frac{2}{3}{y}^{3}\right)\right]}_{0}^{1}=\frac{2}{3}}$$ and
$$\begin{array}{l}{\sigma}^{2}=Var\left(Y\right)=E\left({Y}^{2}\right)-{\mu}^{2}\\ ={\displaystyle \underset{0}{\overset{1}{\int}}{y}^{2}\left(2y\right)dy-{\left(\frac{2}{3}\right)}^{2}={\left[\left(\frac{1}{2}{y}^{4}\right)\right]}_{0}^{1}-\frac{4}{9}=\frac{1}{18}}.\end{array}$$