
For i.i.d. sources, $D(P_1(x^n) \| P_2(x^n)) = n\, D(P_1(x_i) \| P_2(x_i))$, which means that the divergence increases linearly with $n$. Not only does the divergence increase, it does so by a constant amount per symbol. Therefore, based on the typical sequence concepts that we have seen, the probability under $P_2$ of an $x^n$ generated by $P_1$ vanishes. However, we can construct a distribution $Q$ whose divergence from both $P_1$ and $P_2$ is small,

$$Q(x^n) = \frac{1}{2} P_1(x^n) + \frac{1}{2} P_2(x^n).$$

We now have for $P_1$,

$$\frac{1}{n} D(P_1^n \| Q) = \frac{1}{n} E\left[\log \frac{P_1(x^n)}{\frac{1}{2} P_1(x^n) + \frac{1}{2} P_2(x^n)}\right] \le \frac{1}{n} E\left[\log \frac{P_1(x^n)}{\frac{1}{2} P_1(x^n)}\right] = \frac{1}{n}\log(2) = \frac{1}{n},$$

where the inequality holds because the mixture $Q$ places weight at least $\frac{1}{2}$ on $P_1$, so the ratio inside the logarithm is at most 2.

On the other hand, $\frac{1}{n} D(P_1(x_1^n) \| Q(x_1^n)) \ge 0$ [link], and so

$$\frac{1}{n} \ge \frac{1}{n} D(P_1(x_1^n) \| Q(x_1^n)) \ge 0.$$

By symmetry, we see that $Q$ is also close to $P_2$ in the divergence sense.
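This behavior is easy to check numerically. The sketch below is a minimal illustration, assuming a hypothetical binary pair $P_1 = \mathrm{Bernoulli}(0.2)$ and $P_2 = \mathrm{Bernoulli}(0.8)$: it enumerates all length-$n$ sequences and shows that $\frac{1}{n} D(P_1^n \| P_2^n)$ stays constant per symbol, while $\frac{1}{n} D(P_1^n \| Q)$ is bounded by $\frac{1}{n}$.

```python
import itertools
import numpy as np

# Hypothetical example: binary alphabet, P1 = Bernoulli(0.2), P2 = Bernoulli(0.8).
p1, p2 = 0.2, 0.8

def seq_prob(x, p):
    """Probability of a binary sequence x under an i.i.d. Bernoulli(p) source."""
    ones = sum(x)
    return p ** ones * (1 - p) ** (len(x) - ones)

for n in (2, 4, 8):
    div_p1_p2 = 0.0  # (1/n) D(P1^n || P2^n), per-symbol divergence between the sources
    div_p1_q = 0.0   # (1/n) D(P1^n || Q),    per-symbol divergence to the mixture
    for x in itertools.product((0, 1), repeat=n):
        a, b = seq_prob(x, p1), seq_prob(x, p2)
        q = 0.5 * a + 0.5 * b
        div_p1_p2 += a * np.log2(a / b) / n
        div_p1_q += a * np.log2(a / q) / n
    print(f"n={n}: (1/n)D(P1||P2) = {div_p1_p2:.3f} bits/symbol, "
          f"(1/n)D(P1||Q) = {div_p1_q:.4f} <= 1/n = {1/n:.4f}")
```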

Intuitively, it might seem peculiar that $Q$ is close to both $P_1$ and $P_2$ while they are far away from each other (in divergence terms). This intuition stems from the triangle inequality, which holds for all metrics. The apparent contradiction is resolved by realizing that the divergence is not a metric and does not satisfy the triangle inequality.

Note also that for two i.i.d. distributions $P_1$ and $P_2$, the divergence

$$D(P_1(x^n) \| P_2(x^n)) = n\, D(P_1 \| P_2)$$

is linear in $n$. If $Q$ were i.i.d., then $D(P_1(x^n) \| Q(x_1^n))$ would also have to be linear in $n$. But this divergence does not grow linearly in $n$; it is upper bounded by 1 bit. Therefore, we conclude that $Q(\cdot)$ is not an i.i.d. distribution. Instead, $Q$ is a distribution that contains memory, and there is dependence in $Q$ between collections of different symbols of $x$ in the sense that they are either all drawn from $P_1$ or all drawn from $P_2$. To take this one step further, consider $K$ sources with

$$Q(x^n) = \sum_{i=1}^{K} \frac{1}{K} P_i(x^n),$$

then in an analogous manner to before it can be shown that

$$\frac{1}{n} D(P_i(x_1^n) \| Q(x_1^n)) \le \frac{1}{n}\log(K).$$
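This follows from the same one-line argument as before: the mixture places weight at least $\frac{1}{K}$ on each component $P_i$, so

$$\frac{1}{n} D(P_i(x_1^n) \| Q(x_1^n)) = \frac{1}{n} E_{P_i}\left[\log \frac{P_i(x^n)}{Q(x^n)}\right] \le \frac{1}{n} E_{P_i}\left[\log \frac{P_i(x^n)}{\frac{1}{K} P_i(x^n)}\right] = \frac{1}{n}\log(K).$$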

Sources with memory: Instead of the memoryless (i.i.d.) source,

$$P(x^n) = \prod_{i=1}^{n} P(x_i),$$

let us now put forward a statistical model with memory,

$$P(x^n) = \prod_{i=1}^{n} P(x_i \mid x_1^{i-1}).$$
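As a small illustration of how such a model is used, the sketch below evaluates the probability of a sequence through the chain-rule factorization above. It assumes a hypothetical two-state first-order Markov source, so that $P(x_i \mid x_1^{i-1}) = P(x_i \mid x_{i-1})$; the transition matrix and initial distribution are made-up values.

```python
import numpy as np

# Hypothetical first-order Markov source over {0, 1}: the conditional law
# P(x_i | x_1^{i-1}) depends only on the previous symbol x_{i-1}.
T = np.array([[0.9, 0.1],    # row 0: P(x_i = . | x_{i-1} = 0)
              [0.3, 0.7]])   # row 1: P(x_i = . | x_{i-1} = 1)
p_first = np.array([0.75, 0.25])  # distribution of the first symbol x_1

def seq_prob(x):
    """P(x^n) = P(x_1) * prod_{i>1} P(x_i | x_{i-1}), the chain rule with memory."""
    prob = p_first[x[0]]
    for prev, cur in zip(x, x[1:]):
        prob *= T[prev, cur]
    return prob

print(seq_prob((0, 0, 1, 1, 0)))  # probability of one particular length-5 sequence
```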

Stationary source: To understand the notion of a stationary source, consider an infinite stream of symbols, $\ldots, x_{-1}, x_0, x_1, \ldots$. A complete probabilistic description of a stationary distribution is given by the collection of all marginal distributions of the following form, for all $t$ and $n$,

$$P_{X_t, X_{t+1}, \ldots, X_{t+n-1}}(x_t, x_{t+1}, \ldots, x_{t+n-1}).$$

For a stationary source, this distribution is independent of $t$.
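As an example, a Markov source whose first symbol is drawn from the stationary distribution of its transition matrix is a stationary source. The sketch below, reusing the hypothetical two-state chain from above, checks the simplest instance of the defining property: the single-symbol marginal $P_{X_t}$ does not change with $t$.

```python
import numpy as np

# The hypothetical two-state chain from above; pi solves pi @ T = pi,
# so starting the chain from pi makes every marginal P(X_t) equal to pi.
T = np.array([[0.9, 0.1],
              [0.3, 0.7]])
pi = np.array([0.75, 0.25])

marginal = pi.copy()
for t in range(5):
    print(f"t={t}: P(X_t) = {marginal}")  # identical for every t
    marginal = marginal @ T
```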

Entropy rate: We have defined the first order entropy of an i.i.d. random variable [link]; let us now discuss more advanced concepts for sources with memory. Such definitions appear in many standard textbooks, for example that by Gallager [link].

  1. The order-$n$ entropy is defined as
    $$H_n = \frac{1}{n} H(x_1, \ldots, x_n) = -\frac{1}{n} E[\log(P(x_1, \ldots, x_n))].$$
  2. The entropy rate is the limit of the order-$n$ entropy, $\bar{H} = \lim_{n \to \infty} H_n$. The existence of this limit will be shown soon.
  3. Conditional entropy is defined similarly to entropy, as the expectation of the log of the conditional probability,
    $$H(x_n \mid x_1, \ldots, x_{n-1}) = -E[\log(P(x_n \mid x_1, \ldots, x_{n-1}))],$$
    where the expectation is taken over the joint probability space $P(x_1, \ldots, x_n)$.

The entropy rate also satisfies $\bar{H} = \lim_{n \to \infty} H(x_n \mid x_1, \ldots, x_{n-1})$.
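The two limits coincide because of the chain rule for entropy, a standard identity worth recalling here: the order-$n$ entropy is the running average of the conditional entropies,

$$H(x_1, \ldots, x_n) = \sum_{i=1}^{n} H(x_i \mid x_1, \ldots, x_{i-1}), \qquad \text{so} \qquad H_n = \frac{1}{n} \sum_{i=1}^{n} H(x_i \mid x_1, \ldots, x_{i-1}),$$

and the average of a convergent sequence converges to the same limit.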

Theorem 3 For a stationary source with bounded first order entropy, $H_1(x) < \infty$, the following hold.

  1. The conditional entropy $H(x_n \mid x_1, \ldots, x_{n-1})$ is monotone non-increasing in $n$.
  2. The order-$n$ entropy is not smaller than the conditional entropy,
    $$H_n(x) \ge H(x_n \mid x_1, \ldots, x_{n-1}).$$
  3. The order-$n$ entropy $H_n(x)$ is monotone non-increasing in $n$.
  4. $\bar{H}(x) = \lim_{n \to \infty} H_n(x) = \lim_{n \to \infty} H(x_n \mid x_1, \ldots, x_{n-1})$.
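These properties can be observed numerically. The sketch below, again assuming the hypothetical two-state Markov source used earlier (started from its stationary distribution, so the source is stationary), computes the order-$n$ entropy $H_n$ and the conditional entropy $H(x_n \mid x_1, \ldots, x_{n-1})$ by exhaustive enumeration, and compares them with the closed-form entropy rate of a stationary Markov chain.

```python
import itertools
import numpy as np

# Hypothetical two-state Markov source from earlier, started from its
# stationary distribution so that the source is stationary.
T = np.array([[0.9, 0.1], [0.3, 0.7]])
pi = np.array([0.75, 0.25])

def seq_prob(x):
    prob = pi[x[0]]
    for prev, cur in zip(x, x[1:]):
        prob *= T[prev, cur]
    return prob

def block_entropy(n):
    """H(x_1, ..., x_n) in bits, by enumerating all 2^n sequences."""
    probs = [seq_prob(x) for x in itertools.product((0, 1), repeat=n)]
    return -sum(p * np.log2(p) for p in probs)

for n in (1, 2, 4, 8):
    H_n = block_entropy(n) / n                        # order-n entropy
    if n > 1:
        # Chain rule: H(x_n | x_1^{n-1}) = H(x_1^n) - H(x_1^{n-1}).
        H_cond = block_entropy(n) - block_entropy(n - 1)
        print(f"n={n}: H_n = {H_n:.4f}, H(x_n|x_1^(n-1)) = {H_cond:.4f} bits")
    else:
        print(f"n=1: H_1 = {H_n:.4f} bits")

# Entropy rate of a stationary Markov chain: sum_i pi_i * H(row i of T).
H_rate = -sum(pi[i] * T[i, j] * np.log2(T[i, j]) for i in range(2) for j in range(2))
print(f"entropy rate = {H_rate:.4f} bits/symbol")
```

For this first-order Markov source the conditional entropy already equals the entropy rate for every $n \ge 2$, while $H_n$ decreases toward it, consistent with the theorem.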

Source:  OpenStax, Universal algorithms in signal processing and communications. OpenStax CNX. May 16, 2013 Download for free at http://cnx.org/content/col11524/1.1