We note in passing that there is no need to spend a bit to encode leaves of depth $D$. To see this, consider a procedure for encoding the structure of a tree:
Consider the tree source depicted in [link]. In order to encode the structure of this tree, we will use the following procedure. (Such a procedure has appeared, for example, in [link].)
Start from root. [procedure(root)]
1. If node $S$ is of depth $D$ (maximum), then return.
2. If node $S$ is an internal node, then {
3. return.
Let us now simulate the procedure. It traverses the following states of the tree in [link] while outputting the corresponding bits.
Source | root | 0 | 1 | 01 | 001 | 101 | 11 |
Encoded symbol | 0 | 1 | 0 | 0 | none | none | 1 |
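The traversal above can be sketched in code. The following is a minimal sketch, not the author's exact procedure: the body of step 2 is not shown in the text, so the convention used here (an internal node emits 0 and recurses on its two children, a leaf shallower than depth $D$ emits 1, and a node of depth $D$ emits nothing) is an assumption inferred from the bits in the table. The function name `encode_tree_structure`, the leaf set, and $D=3$ are likewise assumptions read off the list of visited nodes.

```python
def encode_tree_structure(leaves, max_depth):
    """Depth-first structure encoding of a binary context tree.

    Assumed convention (inferred from the table, not stated in the text):
    internal node -> 0, leaf shallower than max_depth -> 1,
    node of depth max_depth -> no bit at all.
    """
    # A node is a context string; its parent is found by dropping the first
    # symbol, so every proper suffix of a leaf is an internal node.
    internal = {leaf[k:] for leaf in leaves for k in range(1, len(leaf) + 1)}
    visited, bits = [], []

    def procedure(s):
        visited.append(s if s else "root")
        if len(s) == max_depth:   # step 1: full depth, nothing to encode
            return
        if s in internal:         # step 2: internal node, then visit children
            bits.append(0)
            procedure("0" + s)
            procedure("1" + s)
        else:                     # leaf above full depth
            bits.append(1)
        # step 3: return

    procedure("")
    return visited, bits


# Hypothetical tree matching the states listed in the table, with D = 3.
visited, bits = encode_tree_structure(["0", "11", "001", "101"], max_depth=3)
print(visited)  # ['root', '0', '1', '01', '001', '101', '11']
print(bits)     # [0, 1, 0, 0, 1]
```

Seven nodes are visited but only five bits are produced, because the two nodes of depth $D$ cannot have children and so need no bit.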
Returning to tree pruning, following [link] we see that we must initialize $\text{MDL}\left(s\right)=\text{KT}({n}_{x}(s,0),{n}_{x}(s,1))$ for $s$ of full depth $\left|s\right|=D$ without the extra bit.
At the end of the pruning procedure, ${T}_{\left\{\right\}}^{*}$, the maximizing tree for the root, will be the optimal tree for universal coding.
The Burrows Wheeler transform (BWT) was proposed by Burrows and Wheeler in 1994 [link] (see also the analysis by Effros et al. [link] and references therein). It is an invertible permutation sort that sorts symbols according to their contexts. That way, the symbols that were generated by the same state of the context tree are grouped together, which, as we will see, is advantageous.
To compute the BWT, we first compute all cyclical shifts of the input $x$. Next, we sort the cyclical shifts. The output of the BWT consists of $y$, the last column of the matrix of sorted shifts, and $i$, the index of the row containing the original input. We illustrate with an example.
Consider the input $x=banana$ . First, we compute the cyclic shifts and their sorts.
All Shifts | Sorted |
banana | abanan |
abanan | anaban |
nabana | ananab |
anaban | banana |
nanaba | nabana |
ananab | nanaba |
The output of the BWT consists of $y=nnbaaa$ , the last column of the matrix of sorted shifts (to the right), and the index $i=4$ containing the original input.
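The example can be reproduced with a few lines of code. This is a minimal sketch of the naive construction described above (build all cyclic shifts, sort them, read off the last column); it uses quadratic memory and is meant only as an illustration. The function name `bwt` and the 1-based index convention are choices made here to match the example.

```python
def bwt(x):
    """Naive Burrows Wheeler transform: returns (y, i) with i 1-based."""
    n = len(x)
    # All cyclic shifts, generated in the same order as the table above.
    shifts = [x[n - j:] + x[:n - j] for j in range(n)]
    sorted_shifts = sorted(shifts)
    # y is the last column of the sorted matrix.
    y = "".join(row[-1] for row in sorted_shifts)
    # i is the (1-based) row that holds the original input.
    i = sorted_shifts.index(x) + 1
    return y, i


print(bwt("banana"))  # ('nnbaaa', 4)
```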
Interestingly, we can recover $x$ from $y$ and $i$. Seeing that $y$ is structured and thus quite compressible, the BWT can be used as a compression system; a building block that illustrates such a system appears in [link].
To see that the BWT is invertible, let us work out how to do this by continuing our example.
In the matrix of sorted shifts, column 1 is a sorted version of column $n$ , which we know.
Column 1 | Column n |
a | n |
a | n |
a | b |
b | a |
n | a |
n | a |
Now take column $n$ and put it before column 1:
Column n | Column 1 |
n | a |
n | a |
b | a |
a | b |
a | n |
a | n |
We now sort these rows, which each consist of 2 symbols: $ab$ , $an$ , $an$ , $ba$ , $na$ , and $na$ . Now fill column 2 of the sorted shifts matrix accordingly.
Columns 1–2 | Column $n$ |
ab | n |
an | n |
an | b |
ba | a |
na | a |
na | a |
Repeating this prepend-and-sort step fills in the remaining columns one at a time, so the entire matrix can be unraveled, and the row containing the original $x$ is indexed by $i$.
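The column-by-column reconstruction worked through above can be sketched as follows. This is a minimal illustration of the naive inversion (prepend column $n$ to the partial rows, then sort, $n$ times in total), not an efficient decoder; the function name `inverse_bwt` is an assumption, and $i$ is taken to be 1-based as in the example.

```python
def inverse_bwt(y, i):
    """Naive inverse BWT by repeated prepend-and-sort; i is 1-based."""
    n = len(y)
    rows = [""] * n  # partial rows of the sorted-shifts matrix
    for _ in range(n):
        # Put column n in front of the columns recovered so far, then sort.
        # After the first pass rows holds column 1, after the second
        # columns 1-2, and so on.
        rows = sorted(y[k] + rows[k] for k in range(n))
    return rows[i - 1]


print(inverse_bwt("nnbaaa", 4))  # banana
```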
What is the BWT good for? The key property of the BWT is that symbols generated by the same state are grouped together in $y$. To see this, note how the last column $n$ can be rotated to a position to the left of column 1, so that symbols that came before the same prefix appear together. (To bunch together symbols generated by the same suffix, we can reverse the order of symbols in $x$ before running the BWT.) Therefore, $y$ has the form of a piecewise i.i.d. sequence [link], where segments generated by the same state of the context tree are bunched together.
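The grouping property can be checked on the running example. The sketch below is only an illustration: it labels each symbol of $y$ with the first $k$ symbols of its sorted row, that is, the prefix it cyclically precedes. The function name `group_by_prefix` and the choice $k=2$ are assumptions made for this example.

```python
def group_by_prefix(x, k):
    """Map each length-k row prefix to the symbols of y that precede it."""
    n = len(x)
    sorted_shifts = sorted(x[j:] + x[:j] for j in range(n))
    groups = {}
    for row in sorted_shifts:
        # row[-1] is the symbol that (cyclically) comes before row[:k].
        groups.setdefault(row[:k], []).append(row[-1])
    return {prefix: "".join(symbols) for prefix, symbols in groups.items()}


print(group_by_prefix("banana", 2))
# {'ab': 'n', 'an': 'nb', 'ba': 'a', 'na': 'aa'}
```

Concatenating the groups in sorted order gives back $y=nnbaaa$, so each prefix contributes one contiguous segment of $y$.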