Conditional Entropy - Chain Rule

Chain Rule

Assume that the combined system determined by two random variables X and Y has entropy H(X,Y), that is, we need H(X,Y) bits of information to describe its exact state. Now if we first learn the value of X, we have gained H(X) bits of information. Once X is known, we only need H(X,Y) - H(X) bits to describe the state of the whole system. This quantity is exactly H(Y|X), which gives the chain rule of conditional entropy: H(Y|X) = H(X,Y) - H(X).
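As a concrete illustration (this example is not part of the original text), suppose X and Y are two independent fair coin flips. Describing the pair takes two bits, and learning X supplies one of them, so one bit is still needed for Y:

\begin{align}
H(X,Y) = 2 \text{ bits}, \qquad H(X) = 1 \text{ bit}, \qquad H(Y|X) = H(X,Y) - H(X) = 1 \text{ bit},
\end{align}

which agrees with the fact that knowing X reveals nothing about an independent Y, so H(Y|X) = H(Y) = 1 bit.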

Formally, the chain rule follows from the above definition of conditional entropy:

\begin{align}
H(Y|X)=&\sum_{x\in\mathcal X, y\in\mathcal Y}p(x,y)\log \frac {p(x)} {p(x,y)}\\ =&-\sum_{x\in\mathcal X, y\in\mathcal Y}p(x,y)\log\,p(x,y) + \sum_{x\in\mathcal X, y\in\mathcal Y}p(x,y)\log\,p(x) \\
=& H(X,Y) + \sum_{x \in \mathcal X} p(x)\log\,p(x) \\
=& H(X,Y) - H(X).
\end{align}
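As a sanity check, the identity can be verified numerically for any concrete joint distribution. The following Python sketch (the joint table p_xy is an arbitrary example, not taken from the original text) computes H(X,Y), H(X), and H(Y|X) directly from the definitions and confirms that H(Y|X) = H(X,Y) - H(X):

```python
import numpy as np

# Example joint distribution p(x, y); rows index x, columns index y.
# Any non-negative table summing to 1 works here.
p_xy = np.array([[0.30, 0.10],
                 [0.05, 0.25],
                 [0.10, 0.20]])
assert np.isclose(p_xy.sum(), 1.0)

def entropy(p):
    """Shannon entropy in bits, ignoring zero-probability outcomes."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

p_x = p_xy.sum(axis=1)            # marginal p(x)

H_XY = entropy(p_xy.ravel())      # joint entropy H(X, Y)
H_X = entropy(p_x)                # marginal entropy H(X)

# Conditional entropy from the definition:
# H(Y|X) = sum_{x, y} p(x, y) * log2( p(x) / p(x, y) )
mask = p_xy > 0
H_Y_given_X = np.sum(p_xy[mask] * np.log2((p_x[:, None] / p_xy)[mask]))

print(H_Y_given_X, H_XY - H_X)    # the two values agree
assert np.isclose(H_Y_given_X, H_XY - H_X)
```

The same check works for any joint table, since the derivation above uses only the definitions of the joint and marginal distributions.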
