Redundancy

A source can be compressed only when it has redundancy which is the amount of “useless information” that can be eliminated via fixed length compression codes. >[!def] Redundancy >For a stationary ergodic source, its total redundancy ρT\rho_{T} is defined as: ρT=log2XH(X)\rho_T=\log_2|\mathcal{X}|-H(\mathcal{X})where log2X\log_2|\mathcal{X}| is the compression rate of UD code and H(X)H(\mathcal{X}) is the rate for asymptotically lossless code.

Proposition (2 types of redundancies)

ρT=ρD+ρM\rho_T=\rho_D+\rho_M

where: >[!def] Intra-symbol source redundancy >ρD\rho_D is the redundancy due to source’s non-uniform marginal pmf ρD=log2XH(X1)\rho_D=\log_2|\mathcal{X}|-H(X_1)this looks at the redundancy for each RV isolated from others (i.e. some letters in the alphabet are just more common than others independent of their ordering in words or sentences)

Definition (Inter-symbol source redundancy)

ρM\rho_M is the redundancy due to source memory: ρM=H(X1)H(X)\rho_M=H(X_{1})-H(\mathcal{X})this looks at redundancy when comparing between RVs in sequence (i.e. the letter ‘q’ is often followed by ‘u’ in the English language so this introduces some redundancy).

Proposition ((Redundancies for different stationary ergodic sources))

\mboxSourceρDρMρT=ρD+ρM\mboxiiduniform000\mboxiidnonuniformlogXH(X1)0ρD\mboxMCsymmetric(i.e.uniformMC)0H(X1)H(X2X1)ρM\mboxMCnonsymmetric(i.e.nonuniformMC)logXH(X1)H(X1)H(X2X1)ρD+ρM \begin{array} {|r|r|}\hline \mbox{Source} & \rho_{D} & \rho_{M} & \rho_{T}=\rho_D+\rho_M\\ \hline \mbox{iid uniform} & 0 & 0 & 0 \\ \hline \mbox{iid non-uniform} & \log|\mathcal{X}|-H(X_{1}) & 0 & \rho_{D} \\ \hline \mbox{MC symmetric (i.e. uniform MC)} & 0 & H(X_{1})-H(X_{2}|X_{1}) & \rho_{M} \\ \hline \mbox{MC non-symmetric (i.e. non-uniform MC)} & \log|\mathcal{X}|-H(X_{1}) & H(X_{1})-H(X_{2}|X_{1}) &\rho_D+\rho_M \\ \hline \end{array}

Linked from