Huffman Lemma


Theorem

Consider a source with alphabet $\mathcal{X}=\{a_{1},\cdots,a_{M}\}$ and symbol probabilities $p_{1},\cdots,p_{M}$ such that $p_{1}\ge\cdots\ge p_{M}>0$. Consider the reduced source with alphabet $\mathcal{Y}$ obtained from $\mathcal{X}$ by combining the two least likely source symbols $a_{M-1}$ and $a_{M}$ into an equivalent symbol $a_{M-1,M}$ with probability $p_{M-1,M}=p_{M-1}+p_{M}$:

$$\mathcal{Y}=\{a_{1},\cdots,a_{M-2},a_{M-1,M}\}$$

Let $\mathcal{C}_{2}$, given by $f_{2}:\mathcal{Y}\to\{0,1\}^{*}$, be an optimal prefix code for the reduced source $\mathcal{Y}$. Now construct a code $\mathcal{C}_{1}$, $f_{1}:\mathcal{X}\to\{0,1\}^{*}$, for the original source $\mathcal{X}$ as follows:

- The codewords for symbols $a_{1},\cdots,a_{M-2}$ are exactly the same as the corresponding codewords in $\mathcal{C}_{2}$: $f_{1}(a_{i})=f_{2}(a_{i}),\ i=1,\cdots,M-2$.
- The codewords for symbols $a_{M-1}$ and $a_{M}$ are formed by appending a "0" and a "1", respectively, to the codeword $f_{2}(a_{M-1,M})\in\mathcal{C}_{2}$ for symbol $a_{M-1,M}$:

$$\begin{align*} f_{1}(a_{M-1})&=f_{2}(a_{M-1,M})0\\ f_{1}(a_{M})&=f_{2}(a_{M-1,M})1 \end{align*}$$

Then $\mathcal{C}_{1}$ is an optimal prefix code for the original source $\mathcal{X}$.
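Applied recursively, the lemma yields the familiar Huffman procedure: keep reducing the source until only two symbols remain (where the one-bit code is trivially optimal), then extend back up. Below is a minimal Python sketch of that recursion, assuming the probabilities are given in non-increasing order; the function name `huffman_code` and the index-tagging bookkeeping are illustrative choices, not part of the original note.

```python
from typing import Dict, List

def huffman_code(probs: List[float]) -> Dict[int, str]:
    """Binary prefix code via the lemma's reduce-and-extend recursion.

    probs: symbol probabilities with probs[0] >= ... >= probs[-1] > 0.
    Returns a map from symbol index to codeword.
    """
    M = len(probs)
    if M == 1:
        return {0: ""}           # degenerate source: no bits needed
    if M == 2:
        return {0: "0", 1: "1"}  # two symbols: one bit each is optimal

    # Reduce: merge the two least likely symbols a_{M-1}, a_M into
    # a_{M-1,M} with probability p_{M-1,M} = p_{M-1} + p_M, tagging each
    # entry with its index in the current source (M-2 tags the merger).
    reduced = [(probs[i], i) for i in range(M - 2)]
    reduced.append((probs[M - 2] + probs[M - 1], M - 2))
    # Re-sort so the recursive call again sees non-increasing probabilities.
    reduced.sort(key=lambda t: -t[0])

    f2 = huffman_code([p for p, _ in reduced])  # optimal code for Y

    # Undo the sort: route each reduced codeword back to its tag.
    code2 = {tag: f2[k] for k, (_, tag) in enumerate(reduced)}

    # Extend: a_1, ..., a_{M-2} keep their codewords; a_{M-1} and a_M get
    # the merged symbol's codeword with "0" and "1" appended.
    f1 = {i: code2[i] for i in range(M - 2)}
    f1[M - 2] = code2[M - 2] + "0"
    f1[M - 1] = code2[M - 2] + "1"
    return f1

print(huffman_code([0.4, 0.3, 0.2, 0.1]))
# {0: '1', 1: '00', 2: '010', 3: '011'} -- lengths 1, 2, 3, 3
```

By the lemma, each extension step preserves optimality, so the code returned at the top level is an optimal prefix code for the original source.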