
The Source Coding Theorem states that the entropy of an alphabet of symbols specifies to within one bit how many bits on the average need to be used to send the alphabet.

The significance of an alphabet's entropy rests in how we can represent it with a sequence of bits. Bit sequences form the "coin of the realm" in digital communications: they are the universal way of representing symbolic-valued signals. We convert back and forth between symbols and bit sequences with what is known as a codebook: a table that associates symbols with bit sequences. In creating this table, we must be able to assign a *unique* bit sequence to each symbol so that we can go between symbol and bit sequences without error.
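As a minimal sketch of why uniqueness matters, a codebook can be modeled as a Python dictionary and inverted for decoding. The symbols `a` to `d`, the function names, and both example tables are hypothetical, chosen only for illustration; the decoder assumes every bit sequence has the same length.

```python
def invert_codebook(codebook):
    """Build the bit-sequence -> symbol table; fail if two symbols share a sequence."""
    inverse = {}
    for symbol, bits in codebook.items():
        if bits in inverse:
            raise ValueError(
                f"{symbol!r} and {inverse[bits]!r} share the bit sequence {bits!r}"
            )
        inverse[bits] = symbol
    return inverse


def encode(symbols, codebook):
    """Concatenate the bit sequence of each symbol."""
    return "".join(codebook[s] for s in symbols)


def decode_fixed(bits, codebook, width):
    """Decode a bit string in which every symbol uses exactly `width` bits."""
    inverse = invert_codebook(codebook)
    return [inverse[bits[i:i + width]] for i in range(0, len(bits), width)]


# Hypothetical codebook with a unique two-bit sequence per symbol.
good = {"a": "00", "b": "01", "c": "10", "d": "11"}
message = list("badcab")
assert decode_fixed(encode(message, good), good, 2) == message

# Hypothetical one-bit codebook: four symbols cannot all be distinct.
bad = {"a": "0", "b": "1", "c": "0", "d": "1"}
try:
    invert_codebook(bad)
except ValueError as err:
    print("not invertible:", err)
```

The first table survives a round trip; the second cannot be inverted, because the receiver has no way to tell `a` from `c` once both are sent as `0`.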

You may be conjuring the notion of hiding information from others when we use the name codebook for the symbol-to-bit-sequence table. There is no relation to cryptology, which comprises mathematically provable methods of securing information. The codebook terminology was developed during the beginnings of information theory just after World War II.

As we shall explore in some detail elsewhere, digital communication is the transmission of symbolic-valued signals from one place to another. When faced with the problem, for example, of sending a file across the Internet, we must first represent each character by a bit sequence. Because we want to send the file quickly, we want to use as few bits as possible. However, we don't want to use so few bits that the receiver cannot determine what each character was from the bit sequence. For example, we could use one bit for every character: file transmission would be fast but useless, because many characters would share the same bit sequence and the codebook would create errors. Shannon proved in his monumental work what we call today the Source Coding Theorem. Let $B({a}_{k})$ denote the number of bits used to represent the symbol ${a}_{k}$. The average number of bits $\langle B(A)\rangle $ required to represent the entire alphabet equals $\sum_{k=1}^{K} B({a}_{k})\Pr[{a}_{k}]$. *The Source Coding Theorem states that the average number of bits needed to accurately represent the alphabet need only satisfy*

$$H(A)\le \langle B(A)\rangle < H(A)+1$$

Thus, the alphabet's entropy specifies to within one bit how many bits on the average need to be used to send the alphabet. The smaller an alphabet's entropy, the fewer bits required for digital transmission of files expressed in that alphabet.

A four-symbol alphabet has the following probabilities

$$\Pr[{a}_{0}]=\frac{1}{2},\quad \Pr[{a}_{1}]=\frac{1}{4},\quad \Pr[{a}_{2}]=\frac{1}{8},\quad \Pr[{a}_{3}]=\frac{1}{8}$$

and an entropy of 1.75 bits. Let's see if we can find a codebook for this four-letter alphabet that satisfies the Source Coding Theorem. The simplest code to try is known as the simple binary code: convert the symbol's index into a binary number and use the same number of bits for each symbol by including leading zeros where necessary.

${a}_{0}\leftrightarrow \mathrm{00}\quad {a}_{1}\leftrightarrow \mathrm{01}\quad {a}_{2}\leftrightarrow \mathrm{10}\quad {a}_{3}\leftrightarrow \mathrm{11}$
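The entropy of this alphabet and the average length of the simple binary code can be checked numerically. The following is a small sketch, not a reference implementation; the function names and the string labels `a0` to `a3` are assumptions introduced here for illustration.

```python
from math import ceil, log2

# Probabilities of the four-symbol alphabet from the example.
probabilities = {"a0": 1 / 2, "a1": 1 / 4, "a2": 1 / 8, "a3": 1 / 8}


def entropy(probs):
    """Entropy in bits: -sum of p * log2(p) over symbols with p > 0."""
    return -sum(p * log2(p) for p in probs.values() if p > 0)


def simple_binary_code(symbols):
    """Map each symbol's index to a fixed-width binary string with leading zeros."""
    symbols = list(symbols)
    width = max(1, ceil(log2(len(symbols))))
    return {s: format(i, f"0{width}b") for i, s in enumerate(symbols)}


def average_bits(probs, codebook):
    """Average codeword length, weighted by symbol probability."""
    return sum(probs[s] * len(codebook[s]) for s in probs)


codebook = simple_binary_code(probabilities)
H = entropy(probabilities)
B = average_bits(probabilities, codebook)

print(codebook)  # {'a0': '00', 'a1': '01', 'a2': '10', 'a3': '11'}
print(H, B)      # 1.75 2.0
assert H <= B < H + 1  # the bound from the Source Coding Theorem
```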

Whenever the number of symbols in the alphabet is a power of two (as in this case), the average number of bits $\langle B(A)\rangle $ equals $\log_{2}K$, which equals $2$ in this case. Because the entropy equals $1.75$ bits, the simple binary code indeed satisfies the Source Coding Theorem (we are within one bit of the entropy limit), but you might wonder if you can do better. If we choose a codebook with differing numbers of bits for the symbols, a smaller average number of bits can indeed be obtained. The idea is to use shorter bit sequences for the symbols that occur more often. One codebook like this is

${a}_{0}\leftrightarrow \mathrm{0}\quad {a}_{1}\leftrightarrow \mathrm{10}\quad {a}_{2}\leftrightarrow \mathrm{110}\quad {a}_{3}\leftrightarrow \mathrm{111}$

Now $\langle B(A)\rangle =1\cdot \frac{1}{2}+2\cdot \frac{1}{4}+3\cdot \frac{1}{8}+3\cdot \frac{1}{8}=1.75$. We can reach the entropy limit! The simple binary code is, in this case, less efficient than the unequal-length code. Using the efficient code, we can transmit the symbolic-valued signal having this alphabet with 12.5% fewer bits on average (1.75 instead of 2 bits per symbol). Furthermore, we know that no more efficient codebook can be found because of Shannon's Theorem.
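The unequal-length codebook can be exercised in the same spirit. The sketch below recomputes its average length and decodes a bit stream one bit at a time. It relies on a property the text does not name but which this particular codebook has: no bit sequence is the beginning of another (often called prefix-free), so the receiver can tell where each symbol ends. The function names and the sample message are assumptions made for illustration.

```python
probabilities = {"a0": 1 / 2, "a1": 1 / 4, "a2": 1 / 8, "a3": 1 / 8}
codebook = {"a0": "0", "a1": "10", "a2": "110", "a3": "111"}


def average_bits(probs, codebook):
    """Average codeword length, weighted by symbol probability."""
    return sum(probs[s] * len(codebook[s]) for s in probs)


def is_prefix_free(codebook):
    """True if no codeword is the beginning of a different symbol's codeword."""
    words = list(codebook.values())
    return not any(
        i != j and words[j].startswith(words[i])
        for i in range(len(words))
        for j in range(len(words))
    )


def encode(symbols, codebook):
    """Concatenate the bit sequence of each symbol."""
    return "".join(codebook[s] for s in symbols)


def decode(bits, codebook):
    """Read bits one at a time, emitting a symbol whenever a codeword completes."""
    inverse = {b: s for s, b in codebook.items()}
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            symbols.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete codeword")
    return symbols


B = average_bits(probabilities, codebook)
print(B)          # 1.75
print(1 - B / 2)  # 0.125, the fraction of bits saved relative to 2 bits per symbol

message = ["a0", "a2", "a1", "a0", "a3"]
bits = encode(message, codebook)
print(bits)       # 0110100111
assert is_prefix_free(codebook)
assert decode(bits, codebook) == message
```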

Source:
OpenStax, Fundamentals of electrical engineering i. OpenStax CNX. Aug 06, 2008 Download for free at http://legacy.cnx.org/content/col10040/1.9
