
Consider a stationary ergodic source, where we no longer assume that it is parametric. We need to define notions of probability that fit the universal framework. For this, we study Kac's lemma  [link] .

We have a stationary source, $\{\ldots, X_{-n}, X_{-n+1}, \ldots, X_0, X_1, \ldots\}$, and let $z = X_{-\ell+1}^0 = \{X_{-\ell+1}, X_{-\ell+2}, \ldots, X_0\}$. Define $N_r$ as the number of shifts forward of a window of length $\ell$ until we see $z$ again; this is called the recurrence time. Define

$$Y_k = \begin{cases} 1 & X_{k-\ell+1}^k = z, \\ 0 & \text{else}, \end{cases}$$

e.g., $Y_0 = 1$; then $N_r$ is the smallest positive $k$ for which $Y_k = 1$. Note that $\{Y_k\}_{k=-\infty}^{+\infty}$ is a binary stationary ergodic source. Define $Q_k = \Pr\{Y_k = 1;\ Y_j = 0,\ 1 \le j \le k-1 \mid Y_0 = 1\}$. Then the average recurrence time can be computed, $\mu = \sum_{i=1}^{\infty} i\, Q_i = E[N_r]$. We can now present Kac's lemma.

Lemma 3 [link] $\mu = \frac{1}{\Pr(Y_0 = 1)}$ and $E[N_r \mid X_{-\ell+1}^0 = z] = \frac{1}{\Pr(z)}$.

Let $A = \{Y_n = 1 \text{ for some } -\infty < n < +\infty\}$. Because $z$ just appeared, its probability is positive. We will prove that $\mu = \frac{\Pr(A)}{\Pr(Y_0 = 1)}$.

Define $B^+ = \{Y_n = 1 \text{ for some } 0 \le n < \infty\}$ and $B^- = \{Y_n = 1 \text{ for some } -\infty < n < 0\}$. Then $A = B^+ \cup B^- = (B^+ \cap B^-) \cup (B^+ \cap (B^-)^C) \cup ((B^+)^C \cap B^-)$. We claim that $\Pr(B^+ \cap (B^-)^C) = \Pr((B^+)^C \cap B^-) = 0$. This can be shown formally, but is easily seen by realizing that if $z$ appears at any time $n$ (say, positive) then it must also appear at some negative time with probability 1, because its probability is positive.

Therefore, we have

$$\begin{aligned}
\Pr(A) &= \Pr(B^+ \cap B^-) \\
&= \sum_{j=0}^{\infty} \sum_{k=1}^{\infty} \Pr(Y_j = 1,\ Y_{-k} = 1,\ Y_n = 0 \text{ for } -k < n < j) \\
&= \sum_{j=0}^{\infty} \sum_{k=1}^{\infty} \Pr(Y_{-k} = 1)\, \Pr(Y_j = 1,\ Y_n = 0,\ -k < n < j \mid Y_{-k} = 1) \\
&= \sum_{j=0}^{\infty} \sum_{k=1}^{\infty} \Pr(Y_{-k} = 1)\, Q_{j+k} \\
&= \sum_{j=0}^{\infty} \sum_{k=1}^{\infty} \Pr(Y_0 = 1)\, Q_{j+k} \\
&= \Pr(Y_0 = 1) \sum_{i=1}^{\infty} i\, Q_i = \Pr(Y_0 = 1)\, \mu .
\end{aligned}$$

Therefore, $\mu = \frac{\Pr(A)}{\Pr(Y_0 = 1)}$. We conclude the proof by noting that $\Pr(A) = 1$, because $z$ has just appeared.
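As a quick sanity check (not part of the original text), Kac's lemma can be verified empirically: for a memoryless Bernoulli(1/2) source, the average recurrence time of a length-$\ell$ pattern $z$ should be $1/\Pr(z) = 2^\ell$. The pattern and sequence length below are arbitrary illustrative choices.

```python
import random

random.seed(0)
ell = 4
z = '1011'                             # an arbitrary length-4 pattern
n = 500_000
x = ''.join(random.choice('01') for _ in range(n))

# Positions where a sliding window of length ell equals z.
hits = []
i = x.find(z)
while i != -1:
    hits.append(i)
    i = x.find(z, i + 1)

# Recurrence times: number of shifts between consecutive occurrences of z.
gaps = [b - a for a, b in zip(hits, hits[1:])]
mean_recurrence = sum(gaps) / len(gaps)

print(mean_recurrence)                 # should be close to 1/Pr(z) = 2**4 = 16
```

The empirical average settles near 16, matching $1/\Pr(z)$ as the lemma predicts.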

Let us now develop a universal coding technique for stationary sources. Recall $H = \frac{1}{\ell} E[-\log(\Pr(X_1^\ell))]$. The asymptotic equipartition property (AEP) of information theory  [link] gives

$$\Pr\left( X_1^\ell : \left| -\tfrac{1}{\ell} \log(\Pr(X_1^\ell)) - H \right| > \delta \right) < \epsilon(\delta, \ell),$$

where $\lim_{\ell \to \infty} \epsilon(\delta, \ell) = 0$. Define in this context a typical set $T(\delta, \ell)$ whose sequences satisfy

$$2^{-\ell(H+\delta)} \le \Pr(X_1^\ell) \le 2^{-\ell(H-\delta)}.$$
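The concentration behind this definition can be checked numerically. The sketch below uses a hypothetical memoryless Bernoulli($p$) source (all parameter values are illustrative) and counts how often $-\frac{1}{\ell}\log_2 \Pr(X_1^\ell)$ falls within $\delta$ of $H$:

```python
import math
import random

random.seed(1)
p = 0.3                                 # hypothetical Bernoulli(p) source
H = -p * math.log2(p) - (1 - p) * math.log2(1 - p)   # entropy, ~0.881 bits

ell, delta, trials = 2000, 0.05, 500
inside = 0
for _ in range(trials):
    block = [1 if random.random() < p else 0 for _ in range(ell)]
    ones = sum(block)
    # -log2 Pr(block) for this memoryless source
    neg_log_prob = -(ones * math.log2(p) + (ell - ones) * math.log2(1 - p))
    if abs(neg_log_prob / ell - H) <= delta:
        inside += 1

print(inside / trials)                  # fraction of typical blocks, near 1
```

For this block length almost every sampled block is typical, in line with $\epsilon(\delta,\ell) \to 0$.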

For a typical sequence $z$, $E[N_r \mid X_1^\ell = z] \le 2^{\ell(H+\delta)}$. Then

$$\begin{aligned}
\Pr\left( \tfrac{\log(N_r)}{\ell} \ge H + 2\delta \right) &= \Pr(z \in T(\delta,\ell))\, \Pr\left( \tfrac{\log(N_r)}{\ell} \ge H + 2\delta \,\middle|\, z \in T(\delta,\ell) \right) \\
&\quad + \Pr(z \notin T(\delta,\ell))\, \Pr\left( \tfrac{\log(N_r)}{\ell} \ge H + 2\delta \,\middle|\, z \notin T(\delta,\ell) \right) \\
&\le 2^{-\ell\delta} + \epsilon(\delta,\ell) \xrightarrow{\text{AEP}} 0,
\end{aligned}$$

where the $2^{-\ell\delta}$ term follows from Markov's inequality applied to $E[N_r \mid z \in T(\delta,\ell)] \le 2^{\ell(H+\delta)}$.

Consider our situation: we have a source with a memory of the previous $n$ symbols and want to transmit $X_1^\ell$.

  1. Choose $n = 2^{\ell(H+2\delta)}$.
  2. For $z = X_1^\ell$, find the value of $N_r$ if $z$ appears in the memory.
  3. If it appears, then transmit a flag bit 0 followed by the value of $N_r$.
  4. Else transmit a flag bit 1 followed by the uncompressed $z$.

Transmitting $z$ via $N_r$ requires $\log(N_r)$ bits, and so the expected coding length is

$$\begin{aligned}
&\Pr\left( \tfrac{\log(N_r)}{\ell} < H + 2\delta \right) \left( 1 + 1 + \log(N_r) \right) + 2^{-\ell\delta}(1 + \ell \log \alpha) + \epsilon(\ell,\delta)(1 + \ell \log \alpha) \\
&\quad \le \left( 2 + \ell(H + 2\delta) \right) + 2^{-\ell\delta}(1 + \ell \log \alpha) + \epsilon(\ell,\delta)(1 + \ell \log \alpha),
\end{aligned}$$

where $\alpha$ is the size of the source alphabet, so that an uncompressed $z$ costs $\ell \log \alpha$ bits plus the flag bit.

After we normalize by $\ell$, the per-symbol length converges to $\frac{2}{\ell} + H + 2\delta$, which approaches $H + 2\delta$ for large $\ell$.
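A minimal sketch of steps 1-4 above for a binary source (alphabet size $\alpha = 2$). The fixed-width encoding of $N_r$ and all function names are illustrative choices, not part of the original scheme:

```python
def encode_block(memory, z):
    """Encode block z, assuming the decoder shares the same memory list."""
    n, ell = len(memory), len(z)
    width = n.bit_length()                 # enough bits for any index <= n
    # Search for the most recent occurrence of z inside the memory window.
    for start in range(n - ell, -1, -1):
        if memory[start:start + ell] == z:
            nr = n - start                 # recurrence-time index
            return [0] + [int(b) for b in format(nr, f'0{width}b')]
    return [1] + list(z)                   # flag 1: send z uncompressed

def decode_block(memory, bits, ell):
    n = len(memory)
    if bits[0] == 1:                       # uncompressed branch
        return bits[1:1 + ell]
    nr = int(''.join(map(str, bits[1:1 + n.bit_length()])), 2)
    start = n - nr
    return memory[start:start + ell]

memory = [0, 1, 1, 0, 1, 0, 1, 1]
print(encode_block(memory, [1, 0, 1]))     # found in memory: flag 0 + index
print(encode_block(memory, [0, 0, 0]))     # not in memory: flag 1 + raw bits
```

With $n = 2^{\ell(H+2\delta)}$, the fixed index width is about $\ell(H+2\delta)$ bits, matching the analysis above.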

Note that this analysis assumes that the entropy $H$ for a block of $\ell$ symbols is known. If not, then we can have several typical sets that are each adjusted for a different entropy level.

Universal coding of the integers

Coding of an index also appears in other information theoretic problems, such as indexing a coset in distributed source coding and a codeword in lossy coding. What are good ways to encode the index?

If the index $n$ lies in the range $n \in \{1, \ldots, N\}$ and $N$ is known, then we can encode the index using $\log(N)$ bits. However, sometimes the index can be unbounded, or there could be an (unknown) distribution that biases us strongly toward smaller indices. Therefore, we are interested in a universal encoding of the integers such that each $n$ is encoded using roughly $\log(n)$ bits.
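One classical construction in this spirit (offered here as an illustration, not as the scheme this text goes on to develop) is the Elias gamma code: a unary prefix announces the length of $n$'s binary expansion, so no upper bound $N$ is needed.

```python
def elias_gamma_encode(n):
    """Encode a positive integer in 2*floor(log2(n)) + 1 bits."""
    assert n >= 1
    b = format(n, 'b')                   # binary expansion, leading 1 first
    return '0' * (len(b) - 1) + b        # unary length prefix, then the bits

def elias_gamma_decode(bits):
    zeros = 0
    while bits[zeros] == '0':            # count the unary length prefix
        zeros += 1
    return int(bits[zeros:2 * zeros + 1], 2)

print(elias_gamma_encode(9))             # '0001001': 3 zeros, then '1001'
print(elias_gamma_decode('0001001'))     # 9
```

The gamma code spends $2\lfloor \log_2 n \rfloor + 1$ bits; the Elias delta code, which gamma-encodes the length itself, reduces this to $\log_2 n + O(\log \log n)$ bits, closer to the "roughly $\log(n)$" target.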





Source:  OpenStax, Universal algorithms in signal processing and communications. OpenStax CNX. May 16, 2013 Download for free at http://cnx.org/content/col11524/1.1
