2023-06-17

Information Theory

Information theory was originally developed to measure the expected length of messages under an optimal code in communication. It deals with discrete distributions. Shannon entropy assigns an amount of uncertainty to a probability distribution.

We can, and do, apply similar formulas to continuous distributions, but the interpretations don't carry over and some properties are lost. This is called Differential Entropy. E.g. a discrete event with probability = 1 has zero information because it is guaranteed to occur, whereas an event with density = 1 also contributes zero differential entropy although it is not guaranteed to occur.
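The contrast above can be sketched numerically. A minimal example, assuming entropy in bits and using the closed form \(h(\text{Uniform}(a,b)) = \log_2(b-a)\) for the differential entropy of a uniform density:

```python
import math

def shannon_entropy(probs):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def uniform_diff_entropy(a, b):
    """Differential entropy (in bits) of Uniform(a, b): log2(b - a)."""
    return math.log2(b - a)

# A certain event carries no information.
print(shannon_entropy([1.0]))          # 0.0

# A fair coin carries one bit of uncertainty.
print(shannon_entropy([0.5, 0.5]))     # 1.0

# Uniform(0, 1) has density 1 on its whole support, so its
# differential entropy is 0 -- yet no single outcome is guaranteed.
print(uniform_diff_entropy(0.0, 1.0))  # 0.0

# Unlike Shannon entropy, differential entropy can go negative.
print(uniform_diff_entropy(0.0, 0.5))  # -1.0
```

The negative value for `Uniform(0, 0.5)` is one of the "lost properties": Shannon entropy is always non-negative, differential entropy is not.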

Misc:

1. Mutual Information

The difference between the entropy of a variable and its conditional entropy:

\begin{align*} I[X;Y] = H[X] - H[X|Y] \end{align*}

Conceptually, it gives the average reduction in uncertainty about one variable when we know the value of another variable.

This quantity is symmetric, i.e. \(I[X;Y] = I[Y;X]\)

