0.3 Universal coding for classes of sources


We have discussed several parametric sources, and will now start developing mathematical tools for investigating the properties of universal codes, that is, codes that compress well w.r.t. an entire class of parametric sources.

Preliminaries

Consider a class $Λ$ of parametric models, where the parameter $θ \in Λ$ characterizes the distribution of a specific source within this class, $\{p_{θ} (\cdot), θ \in Λ\}$ .

Consider the class of memoryless sources over an alphabet $α = \{1, 2, \ldots, r\}$ . Here we have

θ = \{p (1), p (2), \ldots, p (r - 1)\} .

The goal is to find a fixed-to-variable-length lossless code that is independent of $θ$ , which is unknown, yet achieves

E_{θ} \frac{l (X_{1}^{n})}{n} \overset{n \to \infty}{\to} \bar{H}_{θ} (X),

where expectation is taken w.r.t. the distribution implied by $θ$ . We have seen for

p (x) = \frac{1}{2} p_{1} (x) + \frac{1}{2} p_{2} (x)

that a code that is good for two sources (distributions) $p_{1}$ and $p_{2}$ exists, modulo the one bit loss [link] . As an expansion beyond this idea, consider

p (x) = \int_{Λ} d w (θ) p_{θ} (x),

where $w (θ)$ is a prior.
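The one-bit loss of the two-source mixture can be checked numerically. The sketch below is only an illustration under assumed values: the two sources are taken to be binary memoryless with hypothetical parameters 0.1 and 0.8, and the function names are ours. Because $p (x) \geq \frac{1}{2} p_{i} (x)$ , the ideal code length $- \log_{2} p (x)$ exceeds $- \log_{2} p_{i} (x)$ by at most one bit for every $x$ .

```python
import itertools
import math


def bernoulli_seq_prob(x, theta):
    """Probability of the binary tuple x under an i.i.d. Bernoulli(theta) source."""
    ones = sum(x)
    return theta ** ones * (1.0 - theta) ** (len(x) - ones)


def max_mixture_penalty_bits(theta1, theta2, n):
    """Largest log2(p_i(x) / p(x)) over all length-n x and i in {1, 2},
    where p = (p_1 + p_2) / 2 is the two-source mixture."""
    worst = 0.0
    for x in itertools.product((0, 1), repeat=n):
        p1 = bernoulli_seq_prob(x, theta1)
        p2 = bernoulli_seq_prob(x, theta2)
        mix = 0.5 * p1 + 0.5 * p2
        for p_i in (p1, p2):
            worst = max(worst, math.log2(p_i / mix))
    return worst


# Hypothetical parameters chosen only for illustration.
penalty = max_mixture_penalty_bits(0.1, 0.8, n=8)
print(f"worst-case extra code length: {penalty:.4f} bits")
```

The penalty approaches one bit on sequences that one source finds far more likely than the other, and it never exceeds one bit.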

Let us revisit the memoryless source, choose $r = 2$ , and define the scalar parameter

θ = Pr (X_{i} = 1) = 1 - Pr (X_{i} = 0) .

Then

p_{θ} (x) = θ^{n_{X} (1)} \cdot {(1 - θ)}^{n_{X} (0)}

and

p (x) = \int_{0}^{1} d θ \cdot θ^{n_{X} (1)} \cdot {(1 - θ)}^{n_{X} (0)} .

Moreover, it can be shown that

p (x) = \frac{n_{X} (0)! n_{X} (1)!}{(n + 1)!} .

This result appears in Krichevsky and Trofimov [link] .
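The closed form follows from the Beta integral $\int_{0}^{1} θ^{a} (1 - θ)^{b} d θ = \frac{a! b!}{(a + b + 1)!}$ with $a = n_{X} (1)$ and $b = n_{X} (0)$ . As a sanity check, the following sketch compares the closed form with a midpoint-rule approximation of the integral and verifies that the mixture sums to one over all sequences. The function names and the counts 3 and 5 are assumptions made for illustration.

```python
import math
from fractions import Fraction


def mixture_prob(n0, n1):
    """Closed form n0! n1! / (n0 + n1 + 1)! for a sequence with n0 zeros and n1 ones."""
    return Fraction(math.factorial(n0) * math.factorial(n1),
                    math.factorial(n0 + n1 + 1))


def mixture_prob_numeric(n0, n1, steps=20_000):
    """Midpoint-rule approximation of the integral of theta^n1 (1 - theta)^n0 on [0, 1]."""
    h = 1.0 / steps
    return h * sum(((i + 0.5) * h) ** n1 * (1.0 - (i + 0.5) * h) ** n0
                   for i in range(steps))


n0, n1 = 3, 5                      # hypothetical counts of zeros and ones
exact = mixture_prob(n0, n1)
approx = mixture_prob_numeric(n0, n1)

# A valid distribution sums to one: there are C(n, k) sequences with k ones.
n = n0 + n1
total = sum(math.comb(n, k) * mixture_prob(n - k, k) for k in range(n + 1))
print(exact, approx, total)
```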

Is the source $X$ implied by the distribution $p (x)$ an ergodic source? Consider the event $\lim_{n \to \infty} \frac{1}{n} \sum_{i = 1}^{n} X_{i} \leq \frac{1}{2}$ . Owing to symmetry, in the limit of large $n$ the probability of this event under $p (x)$ must be $\frac{1}{2}$ ,

Pr \{\lim_{n \to \infty} \frac{1}{n} \sum_{i = 1}^{n} X_{i} \leq \frac{1}{2}\} = \frac{1}{2} .

On the other hand, recall that an ergodic source must assign probability 0 or 1 to events of this kind. Therefore, the source implied by $p (x)$ is not ergodic.
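A finite- $n$ version of this argument can be computed exactly. Under the uniform mixture, each value of the number of ones has probability $\frac{1}{n + 1}$ , so the probability that the empirical mean is at most $\frac{1}{2}$ stays near $\frac{1}{2}$ , whereas for a fixed $θ \neq \frac{1}{2}$ it moves toward 0 or 1. The sketch below illustrates this; the function names and the values of $n$ and $θ$ are assumptions chosen for illustration.

```python
import math
from fractions import Fraction


def prob_count_of_ones(n, k):
    """Pr(exactly k ones in X_1^n) under the uniform mixture p(x)."""
    per_sequence = Fraction(math.factorial(k) * math.factorial(n - k),
                            math.factorial(n + 1))
    return math.comb(n, k) * per_sequence


def prob_mean_at_most_half(n):
    """Pr((1/n) * sum X_i <= 1/2) under the mixture, computed exactly."""
    return sum(prob_count_of_ones(n, k) for k in range(n + 1) if 2 * k <= n)


def prob_mean_at_most_half_fixed(n, theta):
    """The same probability for a memoryless source with a fixed, known theta."""
    return sum(math.comb(n, k) * theta ** k * (1 - theta) ** (n - k)
               for k in range(n + 1) if 2 * k <= n)


for n in (10, 101, 1000):
    print(n, float(prob_mean_at_most_half(n)))

# Hypothetical fixed parameters on either side of 1/2.
print(prob_mean_at_most_half_fixed(200, 0.3), prob_mean_at_most_half_fixed(200, 0.7))
```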

Recall the definitions of $p_{θ} (x)$ and $p (x)$ in [link] and [link] , respectively. Based on these definitions, consider the following,

\begin{matrix} H_{θ} (X_{1}^{n}) & = & - \sum_{x_{1}^{n} \in α^{n}} p_{θ} (x_{1}^{n}) \log p_{θ} (x_{1}^{n}) = H (X_{1}^{n} | Θ = θ), \\ H (X_{1}^{n}) & = & - \sum_{x_{1}^{n} \in α^{n}} p (x_{1}^{n}) \log p (x_{1}^{n}), \\ H (X_{1}^{n} | Θ) & = & \int_{Λ} d w (θ) \cdot H (X_{1}^{n} | Θ = θ) . \end{matrix}

We get the following quantity for the mutual information between the random variable $Θ$ and the random sequence $X_{1}^{n}$ ,

I (Θ; X_{1}^{n}) = H (X_{1}^{n}) - H (X_{1}^{n} | Θ) .

Note that this quantity represents the gain in bits that knowledge of the parameter $θ$ provides; more about this quantity will be mentioned later.
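For the binary memoryless class with a uniform prior, both entropies can be evaluated directly, which gives a feel for how $I (Θ; X_{1}^{n})$ behaves. The sketch below is an illustration under that assumption; the function names are ours, and the integral over $θ$ is approximated with a midpoint rule. Every sequence with $k$ ones has mixture probability $\frac{1}{(n + 1) \binom{n}{k}}$ , which yields the closed form used for $H (X_{1}^{n})$ .

```python
import math


def entropy_of_mixture_bits(n):
    """H(X_1^n) in bits under the uniform mixture p(x) = n0! n1! / (n + 1)!."""
    # H = log2(n + 1) + (1 / (n + 1)) * sum_k log2 C(n, k).
    return math.log2(n + 1) + sum(math.log2(math.comb(n, k))
                                  for k in range(n + 1)) / (n + 1)


def binary_entropy_bits(theta):
    """Entropy in bits of a single Bernoulli(theta) symbol."""
    if theta in (0.0, 1.0):
        return 0.0
    return -theta * math.log2(theta) - (1 - theta) * math.log2(1 - theta)


def conditional_entropy_bits(n, steps=20_000):
    """H(X_1^n | Theta) = n * integral of h(theta) d theta, by the midpoint rule."""
    h = 1.0 / steps
    return n * h * sum(binary_entropy_bits((i + 0.5) * h) for i in range(steps))


def mutual_information_bits(n):
    """I(Theta; X_1^n) = H(X_1^n) - H(X_1^n | Theta) for the uniform prior."""
    return entropy_of_mixture_bits(n) - conditional_entropy_bits(n)


for n in (1, 10, 100):
    info = mutual_information_bits(n)
    print(n, round(info, 4), round(info / n, 4))
```

In this example the mutual information grows with $n$ while the per-symbol value shrinks.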

Redundancy

We now define the conditional redundancy,

r_{n} (l, θ) = \frac{1}{n} [E_{θ} (l (X_{1}^{n})) - H_{θ} (X_{1}^{n})],

which quantifies how far a coding length function $l$ is from the entropy when the parameter $θ$ is known. Note that

E_{p} (l (X_{1}^{n})) = \int_{Λ} d w (θ) E_{θ} (l (X_{1}^{n})) \geq H (X_{1}^{n} | Θ) .
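To make the conditional redundancy concrete, the sketch below evaluates it for one particular code on the binary memoryless class. The code is an assumption chosen for illustration, not one prescribed by the text: it assigns each sequence the length $\lceil - \log_{2} p (x) \rceil$ , where $p (x)$ is the uniform mixture, and these lengths satisfy the Kraft inequality. The function names and parameter values are also ours.

```python
import math


def code_length_bits(n, k):
    """Hypothetical length function ceil(-log2 p(x)) for a sequence with k ones,
    where p(x) = 1 / ((n + 1) * C(n, k)) is the uniform mixture."""
    m = (n + 1) * math.comb(n, k)
    return (m - 1).bit_length()    # exact integer ceil(log2(m)) for m >= 1


def binary_entropy_bits(theta):
    """Entropy in bits of a single Bernoulli(theta) symbol."""
    if theta in (0.0, 1.0):
        return 0.0
    return -theta * math.log2(theta) - (1 - theta) * math.log2(1 - theta)


def conditional_redundancy(n, theta):
    """r_n(l, theta) = (E_theta[l(X_1^n)] - H_theta(X_1^n)) / n, in bits per symbol."""
    expected_length = sum(
        math.comb(n, k) * theta ** k * (1 - theta) ** (n - k) * code_length_bits(n, k)
        for k in range(n + 1))
    # For a memoryless source, H_theta(X_1^n) = n * h(theta).
    return (expected_length - n * binary_entropy_bits(theta)) / n


for n in (10, 100):
    print(n, [round(conditional_redundancy(n, t), 4) for t in (0.1, 0.5, 0.9)])
```

For this code the redundancy is nonnegative and at most $\frac{1}{n} (\log_{2} (n + 1) + 1)$ for every $θ$ , so it vanishes as $n$ grows even though the code never uses $θ$ .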

Denote by $\mathcal{C}_{n}$ the collection of lossless codes for length- $n$ inputs, and define the expected redundancy of a code $l \in \mathcal{C}_{n}$ by

\begin{matrix} R_{n}^{-} (w, l) & = & \int_{Λ} d w (θ) r_{n} (l, θ), \\ R_{n}^{-} (w) & = & \inf_{l \in \mathcal{C}_{n}} R_{n}^{-} (w, l) . \end{matrix}

The asymptotic expected redundancy follows,

R^{-} (w) = \lim_{n \to \infty} R_{n}^{-} (w),

assuming that the limit exists.

We can also define the minimum redundancy that incorporates the worst prior for the parameter,

R_{n}^{-} = \sup_{w \in W} R_{n}^{-} (w),

while keeping the best code. Similarly,

R^{-} = \lim_{n \to \infty} R_{n}^{-} .

Let us derive $R_{n}^{-}$ ,

\begin{matrix} R_{n}^{-} & = & \sup_{w} \inf_{l} \int_{Λ} d w (θ) \frac{1}{n} [E_{θ} (l (X_{1}^{n})) - H (X_{1}^{n} | Θ = θ)] \\ & = & \sup_{w} \inf_{l} \frac{1}{n} [E_{p} (l (X_{1}^{n})) - H (X_{1}^{n} | Θ)] \\ & = & \sup_{w} \frac{1}{n} [H (X_{1}^{n}) - H (X_{1}^{n} | Θ)] \\ & = & \sup_{w} \frac{1}{n} I (Θ; X_{1}^{n}) = \frac{C_{n}}{n}, \end{matrix}

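where $C_{n}$ is the capacity of a channel whose input is the parameter $Θ$ and whose output is the sequence $X_{1}^{n}$ [link] . The third equality uses the fact that the smallest expected code length under $p$ is $H (X_{1}^{n})$ when the integer constraint on code lengths is neglected. That is, we try to estimate the parameter from the noisy channel output.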
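The capacity view suggests a way to approximate $C_{n}$ numerically. The sketch below applies Blahut-Arimoto iterations to the binary memoryless class; this is our choice of technique for illustration, not a method taken from the text. Two assumptions are made: the parameter is restricted to a hypothetical grid of 51 values, so the result is only a lower bound on $C_{n}$ , and the channel output is reduced to the number of ones $K$ , which is a sufficient statistic, so that $I (Θ; X_{1}^{n}) = I (Θ; K)$ .

```python
import math


def channel_matrix(n, thetas):
    """Row i holds Pr(K = k | theta_i) for k = 0..n, where K counts the ones."""
    return [[math.comb(n, k) * t ** k * (1 - t) ** (n - k) for k in range(n + 1)]
            for t in thetas]


def divergences_bits(weights, rows):
    """D(P_theta_i || Q) for each grid point, with Q the weighted mixture of the rows."""
    q = [sum(w * row[k] for w, row in zip(weights, rows))
         for k in range(len(rows[0]))]
    return [sum(p * math.log2(p / q[k]) for k, p in enumerate(row) if p > 0)
            for row in rows]


def mutual_information_bits(weights, rows):
    """I(Theta; K) in bits for the prior given by weights."""
    return sum(w * d for w, d in zip(weights, divergences_bits(weights, rows)))


def blahut_arimoto(rows, iterations=200):
    """Blahut-Arimoto updates of the prior, starting from the uniform prior."""
    weights = [1.0 / len(rows)] * len(rows)
    for _ in range(iterations):
        d = divergences_bits(weights, rows)
        unnormalized = [w * 2.0 ** d_i for w, d_i in zip(weights, d)]
        total = sum(unnormalized)
        weights = [u / total for u in unnormalized]
    return weights, mutual_information_bits(weights, rows)


n = 10
thetas = [i / 50 for i in range(51)]     # hypothetical grid on [0, 1]
rows = channel_matrix(n, thetas)
uniform = [1.0 / len(rows)] * len(rows)
info_uniform = mutual_information_bits(uniform, rows)
_, capacity_estimate = blahut_arimoto(rows)
print(info_uniform, capacity_estimate, capacity_estimate / n)
```

Dividing the estimate by $n$ gives an approximation of $R_{n}^{-}$ for this class; a finer grid or more iterations can only tighten the lower bound.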

In an analogous manner, we define

\begin{matrix} R_{n}^{+} & = & \inf_{l} \sup_{θ \in Λ} r_{n} (l, θ) \\ & = & \inf_{l} \sup_{θ} \frac{1}{n} E_{θ} [\log \frac{p_{θ} (X_{1}^{n})}{2^{- l (X_{1}^{n})}}] \\ & = & \inf_{Q} \sup_{θ} \frac{1}{n} D (P_{θ} \| Q), \end{matrix}

Source: OpenStax, Universal algorithms in signal processing and communications. OpenStax CNX. May 16, 2013 Download for free at http://cnx.org/content/col11524/1.1
