Let $(X,Y)$ be a pair of jointly distributed discrete RVs taking values in $\mathcal{X}\times\mathcal{Y}$ (where both alphabets are finite) with joint pmf $p_{XY}(a,b) := P(X=a, Y=b)$, $a\in\mathcal{X}$, $b\in\mathcal{Y}$. Then the joint entropy of $X$ and $Y$ (denoted $H(X,Y)$) is given by
$$H(X,Y) = -\sum_{a\in\mathcal{X}}\sum_{b\in\mathcal{Y}} p_{XY}(a,b)\,\log_2 p_{XY}(a,b) = \mathbb{E}_{p_{XY}}\!\left[-\log_2 p_{XY}(X,Y)\right].$$
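As a quick numerical illustration, here is a minimal Python/NumPy sketch that evaluates $H(X,Y)$ directly from the definition; the $2\times 2$ joint pmf is a made-up example, not anything from the text.

```python
import numpy as np

def joint_entropy(p_xy: np.ndarray) -> float:
    """Joint entropy H(X, Y) in bits, given a joint pmf as a 2-D array."""
    p = np.asarray(p_xy, dtype=float)
    p = p[p > 0]                      # convention: 0 * log2(0) = 0
    return float(-np.sum(p * np.log2(p)))

# Hypothetical joint pmf (rows: values of X, columns: values of Y)
p_xy = np.array([[0.25, 0.25],
                 [0.25, 0.25]])
print(joint_entropy(p_xy))            # 2.0 bits: uniform over 4 outcomes
```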
Let $X^n := (X_1,\dots,X_n)$ be a random vector with each element taking values in a common alphabet $\mathcal{X}$ and with joint pmf $p_{X^n}$. Then
$$H(X^n) := H(X_1,\dots,X_n) = H(X_1) + H(X_2\mid X_1) + H(X_3\mid X_2,X_1) + \cdots + H(X_n\mid X_{n-1},\dots,X_1) = \sum_{i=1}^{n} H(X_i\mid X^{i-1}),$$
where $H(X_i\mid X^{i-1}) := H(X_1)$ for $i=1$.
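As a sanity check on the chain rule, the following sketch (with a hypothetical, randomly generated $2\times 3\times 2$ joint pmf) compares $H(X_1,X_2,X_3)$ computed directly against $H(X_1)+H(X_2\mid X_1)+H(X_3\mid X_1,X_2)$, where each conditional entropy is evaluated from its own definition rather than as a difference of joint entropies.

```python
import numpy as np

rng = np.random.default_rng(0)
p = rng.random((2, 3, 2))          # hypothetical joint pmf p_{X1 X2 X3}
p /= p.sum()

def H(pmf):
    """Entropy in bits of a pmf given as an array of probabilities."""
    q = np.asarray(pmf).reshape(-1)
    q = q[q > 0]
    return float(-np.sum(q * np.log2(q)))

p1 = p.sum(axis=(1, 2))            # marginal of X1
p12 = p.sum(axis=2)                # marginal of (X1, X2)

# Conditional entropies computed from their definitions
H2_given_1 = -np.sum(p12 * np.log2(p12 / p1[:, None]))      # H(X2 | X1)
H3_given_12 = -np.sum(p * np.log2(p / p12[:, :, None]))     # H(X3 | X1, X2)

# Chain rule: H(X1, X2, X3) = H(X1) + H(X2 | X1) + H(X3 | X1, X2)
print(H(p), H(p1) + H2_given_1 + H3_given_12)               # the two values agree
```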
Let $X^n = (X_1,\dots,X_n)$ be a random vector of $n$ RVs with common alphabet $\mathcal{X}$ and joint pmf $p_{X^n}$. Then, for the joint entropy, we have
$$H(X^n) = H(X_1,\dots,X_n) \le \sum_{i=1}^{n} H(X_i),$$
with equality iff the $X_i$'s are independent.
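A short illustration of the bound, using two made-up pmfs: a perfectly correlated pair gives $H(X,Y) = 1 < 2 = H(X)+H(Y)$, while the product of the same marginals attains equality.

```python
import numpy as np

def H(pmf):
    """Entropy in bits of a pmf given as an array of probabilities."""
    q = np.asarray(pmf).reshape(-1)
    q = q[q > 0]
    return float(-np.sum(q * np.log2(q)))

# Dependent example: X = Y with probability 1 (perfectly correlated fair bits)
p_dep = np.array([[0.5, 0.0],
                  [0.0, 0.5]])
print(H(p_dep), H(p_dep.sum(axis=1)) + H(p_dep.sum(axis=0)))   # 1.0 < 2.0

# Independent example: product of the same marginals -> equality
px, py = p_dep.sum(axis=1), p_dep.sum(axis=0)
p_ind = np.outer(px, py)
print(H(p_ind), H(px) + H(py))                                 # 2.0 == 2.0
```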
For $(X,Y,Z)\sim p_{XYZ}$ on $\mathcal{X}\times\mathcal{Y}\times\mathcal{Z}$, the conditional joint entropy of $X$ and $Y$ given $Z$ is
$$H(X,Y\mid Z) = \mathbb{E}_{p_{XYZ}}\!\left[-\log_2 p_{XY\mid Z}(X,Y\mid Z)\right] = -\sum_{a\in\mathcal{X}}\sum_{b\in\mathcal{Y}}\sum_{c\in\mathcal{Z}} p_{XYZ}(a,b,c)\,\log_2 p_{XY\mid Z}(a,b\mid c).$$
If $(X,Y,Z)\sim p_{XYZ}$, then
$$H(X,Y\mid Z) = H(X\mid Z) + H(Y\mid X,Z) = H(Y\mid Z) + H(X\mid Y,Z).$$
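The sketch below (again with a hypothetical, randomly generated pmf $p_{XYZ}$) evaluates $H(X,Y\mid Z)$ from the definition above and confirms numerically that it matches $H(X\mid Z) + H(Y\mid X,Z)$.

```python
import numpy as np

rng = np.random.default_rng(1)
p = rng.random((2, 2, 3))            # hypothetical joint pmf p_{XYZ}; axes are (X, Y, Z)
p /= p.sum()

pz = p.sum(axis=(0, 1))              # marginal of Z
pxz = p.sum(axis=1)                  # marginal of (X, Z)

# H(X, Y | Z) from the definition: -sum_{a,b,c} p(a,b,c) log2 p(a,b | c)
H_xy_given_z = -np.sum(p * np.log2(p / pz[None, None, :]))

# Chain-rule pieces, each from its own definition
H_x_given_z = -np.sum(pxz * np.log2(pxz / pz[None, :]))      # H(X | Z)
H_y_given_xz = -np.sum(p * np.log2(p / pxz[:, None, :]))     # H(Y | X, Z)

print(H_xy_given_z, H_x_given_z + H_y_given_xz)              # the two values agree
```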