Recall
In our closed-loop linear predictor we defined the error as
$$e_n = X_n - \sum_{i=1}^{m} a_i X_{n-i}.$$
Taking the correlation of the error with a past sample $X_{n-j}$, we then have
$$E[e_n X_{n-j}] = E\Big[\Big(X_n - \sum_{i=1}^{m} a_i X_{n-i}\Big) X_{n-j}\Big] = E[X_n X_{n-j}] - \sum_{k=1}^{m} a_k\, E[X_{n-k} X_{n-j}].$$
Comparing with (*), we see that $a_1, \dots, a_m$ is optimal if and only if $E[e_n X_{n-j}] = 0$ for $j = 1, \dots, m$.
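As a numerical sanity check (not part of the original notes), the sketch below simulates a stationary AR(2) process with illustrative coefficients $0.5$ and $-0.3$, estimates the autocorrelations $r_k = E[X_n X_{n-k}]$, and solves the resulting normal equations $\sum_k a_k r_{|k-j|} = r_j$ for the predictor coefficients:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a stationary AR(2) process as a test signal
# (coefficients 0.5 and -0.3 are chosen only for illustration).
n = 200_000
x = np.zeros(n)
for t in range(2, n):
    x[t] = 0.5 * x[t - 1] - 0.3 * x[t - 2] + rng.standard_normal()

m = 2
# Sample autocorrelations r[k] ~ E[X_n X_{n-k}], k = 0..m
r = np.array([np.mean(x[k:] * x[: n - k]) for k in range(m + 1)])

# Normal equations: sum_k a_k r[|k-j|] = r[j] for j = 1..m
R = np.array([[r[abs(i - j)] for j in range(m)] for i in range(m)])
a = np.linalg.solve(R, r[1:])
print(a)  # approximately the true AR coefficients [0.5, -0.3]
```

Since the process really is AR(2), the optimal order-2 predictor should recover the generating coefficients, which the printed `a` confirms up to sampling error.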
Theorem (Orthogonality Principle)
The linear predictor $\hat{X}_n = \sum_{k=1}^{m} a_k X_{n-k}$ is optimal in the MSE sense if and only if the prediction error is orthogonal to each of $X_{n-1}, \dots, X_{n-m}$, i.e.
$$\hat{X}_n \text{ optimal} \iff (X_n - \hat{X}_n) \perp X_{n-j}, \quad j = 1, \dots, m.$$
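The theorem can be checked empirically (a sketch under assumed AR(2) data, not from the source): fit the order-$m$ predictor from sample autocorrelations, form the error $e_n = X_n - \hat{X}_n$, and verify that the sample correlations $E[e_n X_{n-j}]$ vanish for $j = 1, \dots, m$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical AR(2) test signal, coefficients chosen for illustration.
n = 200_000
x = np.zeros(n)
for t in range(2, n):
    x[t] = 0.5 * x[t - 1] - 0.3 * x[t - 2] + rng.standard_normal()

m = 2
# Solve the normal equations for the optimal coefficients a_1..a_m.
r = np.array([np.mean(x[k:] * x[: n - k]) for k in range(m + 1)])
R = np.array([[r[abs(i - j)] for j in range(m)] for i in range(m)])
a = np.linalg.solve(R, r[1:])

# Prediction error e_n = X_n - sum_k a_k X_{n-k}, for n = m..N-1.
pred = sum(a[k] * x[m - 1 - k : n - 1 - k] for k in range(m))
e = x[m:] - pred

# Orthogonality check: sample E[e_n X_{n-j}] should be ~0 for j = 1..m.
for j in range(1, m + 1):
    print(j, np.mean(e * x[m - j : n - j]))
```

Each printed correlation is near zero, while replacing `a` with any perturbed coefficients makes the corresponding correlation visibly nonzero, matching the "if and only if" direction of the theorem.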