Construction
In SFE coding, we order our alphabet X={a1,…,am} such that a1<⋯<am.
We induce a lexicographical ordering (i.e. think alphabetical order like in a dictionary) on Xn for n≥2: yn=(y1,..,yn)<(x1,…,xn)=xn⟺y1<x1or ∃k∈{2,…,n} s.t. yi=xi, i=1,..,k−1 and yk<xk and so on.
Let p(xn) be a pmf on Xn such that F^(xn)=yn<xn∑p(yn), F(xn)=yn≤xn∑p(yn)The tag of xn is defined as Tˉ(xn)=F^(xn)+21p(xn)i.e. the midpoint of the interval [F^(xn),F(xn)).
The tag of a source word xn∈Xn is defined as Tˉ(xn)=F^(xn)+21p(xn) where F^(xn)=yn<xn∑p(yn)
The Shannon-Fano-Elias Code C on Xn is defined by C(xn)=bl(xn)(Tˉ(xn))where l(xn)=⌈−logp(xn)⌉+1That is, the codeword associated with xn is the binary representation of the tag of xn, Tˉ(xn) truncated to ⌈−logp(xn)⌉+1 bits.
Intuition
So by the way we define the tag we’re defining it in a way that for each xn in the set of messages, their tag will be inside of a disjoint interval, [F^(xn),F(xn)) from all others xns.
The is a prefix code with average code length lˉ(Cxn)=E[l(Cxn)]<H(Xn)+2or average code rate Rˉn(C)≤n1H(Xn)+n2 and if (Xn)n∈N is stationary then n→∞limRˉn(C)=Hˉ(X)