Asymptotic Equipartition Property

Definition (Typical Set)

For a DMS {Xi}i=1\{X_i\}_{i=1}^\infty with pmf pXp_X on X\mathcal{X} and entropy H(X)H(X), given integer n1n\ge 1 and ϵ>0\epsilon>0, the typical set Aϵ(n)A_\epsilon^{(n)} with respect to the source is Aϵ(n)={anXn:1nlog2(pXn(an))H(X)ϵ}={anXn:2n(H(X)+ϵ)pXn(an)2n(H(X)ϵ)}\begin{align*} A_\epsilon^{(n)}&=\left\{a^{n}\in X^{n}:\left|-\frac{1}{n}\log_2(p_{X^{n}}(a^{n}))-H(X)\right|\le\epsilon\right\}\\\\ &=\left\{a^{n}\in X^{n}:2^{-n(H(X)+\epsilon)}\le p_{X^{n}}(a^{n})\le2^{-n(H(X)-\epsilon)}\right\} \end{align*} or the typical set is the set of all n-tuples an=(a1,,an)a^n=(a_1,\cdots,a_n) drawn iid from X\mathcal{X} according to pXp_X whose probability is roughly 2nH(X)2^{-nH(X)}; or whose “apparent entropy” is within the “true entropy” H(X)H(X).

Theorem (Asymptotic Equipartition Property)

For a DMS {Xi}i=1\{X_i\}_{i=1}^\infty with pmf pXp_X and alphabet X\mathcal{X}, 1nlog2(pXn(Xn))nH(X)\mboxinprobability-\frac{1}{n}\log_2(p_{X^n}(X^n))\xrightarrow{n\to\infty}H(X)\mbox{ in probability}for the average information of Xi,n1X_i,n\ge1 converges in probability to the entropy of XX. For a stationary ergodic source {Xi}i=1\{X_i\}_{i=1}^\infty we have 1nlog2(pXn(Xn))nH(X)\mboxinprobability-\frac{1}{n}\log_2(p_{X^n}(X^n))\xrightarrow{n\to\infty}H(\mathcal{X})\mbox{ in probability}

Remark

AEP states that a subset Bϵ(n)B_{\epsilon}^{(n)} of Xn\mathcal{X}^{n} given by Bϵ(n)={anXn:1nlog2pXn(an)H(X)>ϵ}B_{\epsilon}^{(n)}=\{a^{n}\in\mathcal{X}^{n}:|- \frac{1}{n}\log_{2}p_{X^{n}}(a^{n})-H(X)|>\epsilon\} satisfies limnP(Bϵ(n))=0\lim_{n\to\infty}P(B_{\epsilon}^{(n)})=0

Theorem (Consequence of AEP)

For a DMS {Xi}i=1\{X_i\}_{i=1}^\infty with pmf pXp_X on X\mathcal{X} and entropy H(X)H(X), the typical set Aϵ(n)A_\epsilon^{(n)} satisfies:

  1. limnP(Aϵ(n))=1\lim_{n\to\infty}P(A_\epsilon^{(n)})=1
  2. Aϵ(n)2n(H(X)+ϵ)|A_\epsilon^{(n)}|\le2^{n(H(X)+\epsilon)}
  3. Aϵ(n)(1ϵ)2n(H(X)ϵ)|A_\epsilon^{(n)}|\ge(1-\epsilon)2^{n(H(X)-\epsilon)} for nn sufficiently large

Definition (Joint typical set)

Let {(Xi,Yi)}i=1\{(X_{i},Y_{i})\}^\infty_{i=1} be a |DMS with common pmf pXYp_{XY} on X×Y\mathcal{X}\times\mathcal{Y}. Given δ>0\delta>0 and integer n1n\ge1, the jointly typical set Aδ(n)A_\delta^{(n)} (or jointly δ\delta -typical set) with respect to the source is Aδ(n)={(xn,yn)Xn×Yn:1nlog2(pXn(xn))H(X)δ,1nlog2(pYn(yn))H(Y)δ,1nlog2(pXnYn(xn,yn))H(X,Y)δ}\begin{align*} A_{\delta}^{(n)}=\left\{(x^{n},y^{n})\in \mathcal{X}^{n}\times\mathcal{Y}^{n}:\left|-\frac{1}{n}\log_{2}(p_{X^{n}}(x^{n}))-H(X)\right|\le\delta,\right.\\ \left|-\frac{1}{n}\log_{2}(p_{Y^{n}}(y^{n}))-H(Y)\right|\le\delta,\\ \left.\left|-\frac{1}{n}\log_{2}(p_{X^{n}Y^{n}}(x^{n},y^{n}))-H(X,Y)\right|\le\delta \right\}\\ \end{align*}

Theorem (Joint Asymptotic Equipartition Property)

For a DMS {(Xi,Yi)}i=1\{(X_{i},Y_{i})\}^\infty_{i=1} with pmf pXYp_{XY} on X×Y\mathcal{X}\times\mathcal{Y}, then the joint typical set Aδ(n)A_{\delta}^{(n)} defined with respect to source pXYp_{XY} satisfies:

  1. PXnYn(Aδ(n))=P((Xn,Yn)Aδ(n))>1δP_{X^{n}Y^{n}}(A_\delta^{(n)})=P((X^{n},Y^{n})\in A_{\delta}^{(n)})>1-\delta for nn sufficiently large
  2. Aδ(n)2n(H(X,Y)+δ)|A_\delta^{(n)}|\le2^{n(H(X,Y)+\delta)}
  3. Aδ(n)(1δ)2n(H(X,Y)δ)|A_\delta^{(n)}|\ge(1-\delta)2^{n(H(X,Y)-\delta)} for nn sufficiently large

Definition (Typical set)

Fix ϵ>0\epsilon>0 and nn, the typical set Aϵ(n)A_{\epsilon}^{(n)} for a CMS with pdf fXf_{X} is given by Aϵ(n)={xnSXn:1nlog2fXn(Xn)h(X)ϵ}A_{\epsilon}^{(n)}=\{x^{n}\in S_{X^{n}}:|- \frac{1}{n}\log_{2}f_{X^{n}}(X^{n})-h(X)|\le\epsilon\}

Theorem (AEP For Continuous IID RVs)

Let {Xi}i=1\{X_{i}\}_{i=1}^{\infty} be a memoryless real-valued source, with common pdf fXf_{X} with support SXRS_{X}\subset \mathbb{R}. Then 1nlog2fXn(X1,,Xn)nE[log2fX(X)]=h(X)- \frac{1}{n}\log_{2}f_{X^{n}}(X_{1},\ldots,X_{n})\stackrel{ n\to\infty}\to E\left[-\log_{2}f_{X}(X)\right]=h(X)in probability.

Theorem (Consequence of the continuous AEP)

For the memoryless continuous source {Xi}i=1\{X_{i}\}^\infty_{i=1} with pdf fXf_{X} defined on SXRS_{X}\subset\mathbb{R} and differential entropy h(X)h(X), the following hold:

  1. limnPXn(Aϵ(n))=1, \mboxi.e.PXn(Aϵ(n))>1ϵ \mboxforn\mboxsufficientlylarge\lim_{n\to\infty}P_{X^{n}}\left(A_{\epsilon}^{(n)}\right)=1,\ \mbox{ i.e. }P_{X^{n}}\left(A_{\epsilon}^{(n)}\right)>1-\epsilon \ \mbox{ for }n\mbox{ sufficiently large}
  2. vol(Aϵ(n))2n(h(X))+ϵ nvol(A_{\epsilon}^{(n)})\le 2^{n(h(X))+\epsilon} \ \forall n
  3. vol(Aϵ(n)>(1ϵ)2n(h(X)ϵ)) \mboxforn\mboxsufficientlylargevol(A_{\epsilon}^{(n)}>(1-\epsilon)2^{n(h(X)-\epsilon)}) \ \mbox{for }n\mbox{ sufficiently large}

Linked from