Lemma (6.2.1)
Let $X$ be a random vector with $X \in L_2$ and let $R$ be a positive definite matrix. The following holds:
\[
\inf_{g \in M(Y)} E\big[(X - g(Y))^\top R\,(X - g(Y))\big] = E\big[(X - E[X \mid Y])^\top R\,(X - E[X \mid Y])\big],
\]
where $M(Y)$ denotes the set of measurable functions of $Y$, and the infimum is attained by $G(y) = E[X \mid Y = y]$ a.s.
\begin{proof} It suffices to write $g(Y) = E[X \mid Y] + h(Y)$ and show that minimizing the expression forces $h(Y) = 0$ a.s. So:
\begin{align*}
&E\big[(X - E[X \mid Y] - h(Y))^\top R\,(X - E[X \mid Y] - h(Y))\big] \\
&\quad= E\big[(X - E[X \mid Y])^\top R\,(X - E[X \mid Y])\big] + E\big[h(Y)^\top R\,h(Y)\big] - 2\,E\big[(X - E[X \mid Y])^\top R\,h(Y)\big] \\
&\quad= E\big[(X - E[X \mid Y])^\top R\,(X - E[X \mid Y])\big] + E\big[h(Y)^\top R\,h(Y)\big] - 2\,E\Big[E\big[(X - E[X \mid Y])^\top R\,h(Y) \,\big|\, Y\big]\Big] \\
&\quad= E\big[(X - E[X \mid Y])^\top R\,(X - E[X \mid Y])\big] + E\big[h(Y)^\top R\,h(Y)\big] - 2\,E\Big[E\big[(X - E[X \mid Y])^\top \,\big|\, Y\big] R\,h(Y)\Big] \\
&\quad= E\big[(X - E[X \mid Y])^\top R\,(X - E[X \mid Y])\big] + E\big[h(Y)^\top R\,h(Y)\big] \\
&\quad\ge E\big[(X - E[X \mid Y])^\top R\,(X - E[X \mid Y])\big].
\end{align*}
So in order we:
- Multiply out the quadratic.
- Apply iterated expectation.
- Pull the $Y$-measurable factor $R\,h(Y)$ out of the conditional expectation.
- Note that $E[X - E[X \mid Y] \mid Y] = 0$, so the cross term vanishes.
- Achieve the inequality; since $R$ is positive definite, equality holds iff $h(Y) = 0$ a.s.
\end{proof}
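As a numeric sanity check of this lemma, the following sketch restricts to linear estimators $g(y) = Gy$ of a jointly Gaussian pair (where, anticipating Lemma 6.2.2 below, the optimal estimator is itself linear) and evaluates the cost exactly through covariance algebra, using $E[(X - GY)^\top R (X - GY)] = \operatorname{tr}\!\big(R(\Sigma_{XX} - G\Sigma_{YX} - \Sigma_{XY}G^\top + G\Sigma_{YY}G^\top)\big)$. The dimensions, seed, and weight $R$ are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2

# Build a positive definite joint covariance for (X, Y), both in R^n.
L = rng.standard_normal((2 * n, 2 * n))
K = L @ L.T + 0.5 * np.eye(2 * n)
Sxx, Sxy = K[:n, :n], K[:n, n:]
Syx, Syy = K[n:, :n], K[n:, n:]

R = np.diag([2.0, 1.0])  # positive definite weight

def loss(G):
    """Exact E[(X - G Y)^T R (X - G Y)] via the joint covariance."""
    return np.trace(R @ (Sxx - G @ Syx - Sxy @ G.T + G @ Syy @ G.T))

# Optimal gain: E[X | Y = y] = Sxy Syy^{-1} y.
G_opt = Sxy @ np.linalg.inv(Syy)
J_opt = loss(G_opt)

# Any perturbation h(y) = (G - G_opt) y can only increase the cost.
for _ in range(5):
    G = G_opt + rng.standard_normal((n, n))
    assert loss(G) >= J_opt - 1e-9
```

The gap $\operatorname{loss}(G) - \operatorname{loss}(G_{\mathrm{opt}}) = \operatorname{tr}(R\,\Delta\,\Sigma_{YY}\,\Delta^\top) \ge 0$ with $\Delta = G - G_{\mathrm{opt}}$, mirroring the $E[h^\top(Y)Rh(Y)]$ term in the proof.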
Now we consider the following system:
\[
x_{t+1} = A x_t + B u_t + w_t, \qquad y_t = C x_t + v_t, \qquad w_t \sim N(0, W), \quad v_t \sim N(0, V),
\]
where $x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$, $w \in \mathbb{R}^n$, $y \in \mathbb{R}^p$, $v \in \mathbb{R}^p$. The goal is to find the optimal cost
\[
\inf_{\gamma \in \Gamma} J(\gamma, \mu_0),
\]
where the cost is the quadratic cost of the state and the action,
\[
J(\mu_0, \gamma) = E^{\gamma}_{\mu_0}\left[\sum_{t=0}^{N-1} \big(x_t^\top Q x_t + u_t^\top R u_t\big) + x_N^\top Q_N x_N\right],
\]
with $R$ positive definite, $Q, Q_N$ positive semidefinite, and $\mu_0$ the prior on the initial state, assumed to be zero-mean Gaussian.
Control-Free Setup
We consider the control-free setup here.
Lemma (6.2.2)
Let $X, Y$ be jointly Gaussian zero-mean vectors. Then,
- $E[X \mid Y = y]$ is linear in $y$:
\[
E[X \mid Y = y] = \Sigma_{XY} \Sigma_{YY}^{-1} y; \quad\text{and}
\]
- We have that
\[
E\big[(X - E[X \mid Y])(X - E[X \mid Y])^\top\big] = \Sigma_{XX} - \Sigma_{XY} \Sigma_{YY}^{-1} \Sigma_{XY}^\top =: D.
\]
In particular, $E[(X - E[X \mid Y])(X - E[X \mid Y])^\top \mid Y = y]$ does not depend on the realization $y$ of $Y$ and is equal to $D$.
\begin{proof} $(X, Y)$ is jointly Gaussian (write $X \in \mathbb{R}^n$, $Y \in \mathbb{R}^m$) and admits densities, with $p(x \mid y) = \frac{p(x, y)}{p(y)}$. With
\[
K_{XY} := E\left[\begin{bmatrix} X \\ Y \end{bmatrix}\begin{bmatrix} X^\top & Y^\top \end{bmatrix}\right] = \begin{bmatrix} \Sigma_{XX} & \Sigma_{XY} \\ \Sigma_{YX} & \Sigma_{YY} \end{bmatrix}, \qquad K_{XY}^{-1} = \begin{bmatrix} \Psi_{XX} & \Psi_{XY} \\ \Psi_{YX} & \Psi_{YY} \end{bmatrix},
\]
we have
\[
p(x, y) = \frac{1}{(2\pi)^{\frac{n+m}{2}} |K_{XY}|^{1/2}}\, e^{-\frac{1}{2} \begin{bmatrix} x \\ y \end{bmatrix}^\top K_{XY}^{-1} \begin{bmatrix} x \\ y \end{bmatrix}}.
\]
Then
\begin{align*}
p(x \mid y) &= \frac{1}{(2\pi)^{\frac{n+m}{2}} |K_{XY}|^{1/2}}\, e^{-\frac{1}{2} \begin{bmatrix} x \\ y \end{bmatrix}^\top K_{XY}^{-1} \begin{bmatrix} x \\ y \end{bmatrix}} \cdot \left(\frac{1}{(2\pi)^{\frac{m}{2}} |\Sigma_{YY}|^{1/2}}\, e^{-\frac{1}{2} y^\top \Sigma_{YY}^{-1} y}\right)^{-1} \\
&= C\, e^{\frac{1}{2} y^\top \Sigma_{YY}^{-1} y}\, e^{-\frac{1}{2} \begin{bmatrix} x \\ y \end{bmatrix}^\top \begin{bmatrix} \Psi_{XX} & \Psi_{XY} \\ \Psi_{YX} & \Psi_{YY} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}} \\
&= C\, e^{\frac{1}{2} y^\top \Sigma_{YY}^{-1} y}\, e^{-\frac{1}{2}\left(x^\top \Psi_{XX} x + 2 x^\top \Psi_{XY} y + y^\top \Psi_{YY} y\right)} \\
&= C\, e^{-\frac{1}{2}\left(x^\top \Psi_{XX} x + 2 x^\top \Psi_{XY} y + y^\top \Psi_{YY} y - y^\top \Sigma_{YY}^{-1} y\right)}.
\end{align*}
Now, looking at the expression in the exponent, we can apply a completion-of-squares argument:
\begin{align*}
x^\top \Psi_{XX} x + 2 x^\top \Psi_{XY} y + y^\top \Psi_{YY} y - y^\top \Sigma_{YY}^{-1} y &= \big(x + \Psi_{XX}^{-1} \Psi_{XY} y\big)^\top \Psi_{XX} \big(x + \Psi_{XX}^{-1} \Psi_{XY} y\big) + Q(y) \\
&=: (x - H y)^\top D^{-1} (x - H y) + Q(y),
\end{align*}
where $H = -\Psi_{XX}^{-1} \Psi_{XY}$, $D := \Psi_{XX}^{-1}$, and $Q(y)$ collects the terms depending only on $y$. Then, we observe the following:
\[
\begin{bmatrix} \Psi_{XX} & \Psi_{XY} \\ \Psi_{YX} & \Psi_{YY} \end{bmatrix} \cdot \begin{bmatrix} \Sigma_{XX} & \Sigma_{XY} \\ \Sigma_{YX} & \Sigma_{YY} \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix},
\]
which gives us that $\Psi_{XX} \Sigma_{XY} + \Psi_{XY} \Sigma_{YY} = 0$, therefore
\[
\Sigma_{XY} = -\Psi_{XX}^{-1} \Psi_{XY} \Sigma_{YY} \implies \Sigma_{XY} \Sigma_{YY}^{-1} = -\Psi_{XX}^{-1} \Psi_{XY} = H,
\]
allowing us to re-express $H$ and leaving us with the resultant conditional density
\[
p(x \mid y) = C\, e^{-\frac{1}{2} Q(y)}\, e^{-\frac{1}{2}\big(x - \Sigma_{XY} \Sigma_{YY}^{-1} y\big)^\top \Psi_{XX} \big(x - \Sigma_{XY} \Sigma_{YY}^{-1} y\big)},
\]
a Gaussian density in $x$ with mean $\Sigma_{XY} \Sigma_{YY}^{-1} y$, giving us the first condition.
Finally, since $\int p(x \mid y)\, dx = 1$ we necessarily have that
\[
C\, e^{-\frac{1}{2} Q(y)} = \frac{1}{(2\pi)^{\frac{n}{2}} |D|^{1/2}},
\]
which is in fact independent of $y$. The conditional covariance is therefore $D = \Psi_{XX}^{-1}$, which by the block-inverse (Schur complement) identity equals $\Sigma_{XX} - \Sigma_{XY} \Sigma_{YY}^{-1} \Sigma_{XY}^\top$. Since $D$ does not depend on $y$, we finally have
\[
E\big[(X - E[X \mid Y])(X - E[X \mid Y])^\top \mid Y = y\big] = E\big[(X - E[X \mid Y])(X - E[X \mid Y])^\top\big] = D.
\]
\end{proof}
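The two matrix facts the proof leans on, the gain identity $\Sigma_{XY}\Sigma_{YY}^{-1} = -\Psi_{XX}^{-1}\Psi_{XY}$ and the Schur-complement identity $D = \Psi_{XX}^{-1}$, can be checked numerically on an arbitrary positive definite joint covariance (dimensions and seed below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 3, 2

# A positive definite joint covariance K_{XY} with blocks S__.
L = rng.standard_normal((n + m, n + m))
K = L @ L.T + 0.5 * np.eye(n + m)
Sxx, Sxy = K[:n, :n], K[:n, n:]
Syx, Syy = K[n:, :n], K[n:, n:]

# Blocks Psi__ of the inverse K_{XY}^{-1}.
Psi = np.linalg.inv(K)
Pxx, Pxy = Psi[:n, :n], Psi[:n, n:]

# Gain identity used to re-express H:  Sxy Syy^{-1} = -Pxx^{-1} Pxy.
H = Sxy @ np.linalg.inv(Syy)
assert np.allclose(H, -np.linalg.inv(Pxx) @ Pxy)

# Schur-complement identity behind the normalization step:  D = Pxx^{-1}.
D = Sxx - Sxy @ np.linalg.inv(Syy) @ Syx
assert np.allclose(D, np.linalg.inv(Pxx))
```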
To derive the Kalman filter the following two lemmas are required:

Lemma (6.2.3)
If $E[X] = 0$ and $Z_1, Z_2$ are orthogonal zero-mean Gaussian vectors (with $E[Z_1 Z_2^\top] = 0$), then
\[
E[X \mid Z_1 = z_1, Z_2 = z_2] = E[X \mid Z_1 = z_1] + E[X \mid Z_2 = z_2].
\]
\begin{proof}
\begin{align*}
E[X \mid (Z_1, Z_2) = (z_1, z_2)] &= E\left[X \begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix}^\top\right] \left(E\left[\begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix} \begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix}^\top\right]\right)^{-1} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} \\
&= \begin{bmatrix} E[X Z_1^\top] & E[X Z_2^\top] \end{bmatrix} \begin{bmatrix} E[Z_1 Z_1^\top] & E[Z_1 Z_2^\top] \\ E[Z_2 Z_1^\top] & E[Z_2 Z_2^\top] \end{bmatrix}^{-1} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} \\
&= E[X Z_1^\top]\big(E[Z_1 Z_1^\top]\big)^{-1} z_1 + E[X Z_2^\top]\big(E[Z_2 Z_2^\top]\big)^{-1} z_2 \\
&= E[X \mid Z_1 = z_1] + E[X \mid Z_2 = z_2],
\end{align*}
where the first equality is Lemma 6.2.2 applied to $(X, (Z_1, Z_2))$, the third is due to orthogonality (the off-diagonal blocks vanish, so the inverse is block diagonal), and the final one is due to Lemma 6.2.2 applied to $(X, Z_1)$ and $(X, Z_2)$. \end{proof}
and
Lemma (6.2.4)
$E\big[(X - E[X \mid Y])(X - E[X \mid Y])^\top\big]$ is given by $D$ from above.
\begin{proof} First note that
\[
E\big[X (E[X \mid Y])^\top\big] = E\big[(X - E[X \mid Y] + E[X \mid Y])(E[X \mid Y])^\top\big] = E\big[E[X \mid Y](E[X \mid Y])^\top\big],
\]
since $X - E[X \mid Y]$ is orthogonal to $E[X \mid Y]$ (which we know by another iterated-expectation argument). Then
\begin{align*}
E\big[(X - E[X \mid Y])(X - E[X \mid Y])^\top\big] &= E[X X^\top] - 2\, E\big[X (E[X \mid Y])^\top\big] + E\big[E[X \mid Y] (E[X \mid Y])^\top\big] \\
&= E[X X^\top] - E\big[E[X \mid Y] (E[X \mid Y])^\top\big] \\
&= \Sigma_{XX} - E\big[\Sigma_{XY} \Sigma_{YY}^{-1} Y Y^\top (\Sigma_{YY}^{-1})^\top \Sigma_{XY}^\top\big] \\
&= \Sigma_{XX} - \Sigma_{XY} \Sigma_{YY}^{-1} \Sigma_{YY} \Sigma_{YY}^{-1} \Sigma_{XY}^\top \\
&= \Sigma_{XX} - \Sigma_{XY} \Sigma_{YY}^{-1} \Sigma_{XY}^\top = D.
\end{align*}
\end{proof}
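The additivity of Lemma 6.2.3 can also be seen computationally: when $Z_1 \perp Z_2$, the joint covariance of $(Z_1, Z_2)$ is block diagonal, so the joint gain of Lemma 6.2.2 splits into the two marginal gains. A minimal sketch with an illustrative construction $X = A_1 Z_1 + A_2 Z_2$ (all matrices and the seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p1, p2 = 2, 3, 2

# Z1, Z2: independent (hence orthogonal) zero-mean Gaussians with covariances S1, S2.
L1 = rng.standard_normal((p1, p1)); S1 = L1 @ L1.T + np.eye(p1)
L2 = rng.standard_normal((p2, p2)); S2 = L2 @ L2.T + np.eye(p2)

# X = A1 Z1 + A2 Z2 gives cross-covariances E[X Z1^T] = A1 S1, E[X Z2^T] = A2 S2.
A1 = rng.standard_normal((n, p1))
A2 = rng.standard_normal((n, p2))
Sx1, Sx2 = A1 @ S1, A2 @ S2

# Joint covariance of Z = (Z1, Z2) is block diagonal by orthogonality.
Szz = np.block([[S1, np.zeros((p1, p2))],
                [np.zeros((p2, p1)), S2]])
Sxz = np.hstack([Sx1, Sx2])

z1, z2 = rng.standard_normal(p1), rng.standard_normal(p2)

# Joint conditional mean (Lemma 6.2.2) vs. sum of marginal conditional means (Lemma 6.2.3).
joint = Sxz @ np.linalg.solve(Szz, np.concatenate([z1, z2]))
split = Sx1 @ np.linalg.solve(S1, z1) + Sx2 @ np.linalg.solve(S2, z2)
assert np.allclose(joint, split)
```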
Now, consider the following system without control:
\[
x_{t+1} = A x_t + w_t, \qquad y_t = C x_t + v_t, \qquad w_t \overset{\text{iid}}{\sim} N(0, W), \quad v_t \overset{\text{iid}}{\sim} N(0, V),
\]
where $x \in \mathbb{R}^n$, $w \in \mathbb{R}^n$, $y \in \mathbb{R}^p$, $v \in \mathbb{R}^p$. Define the mean process $m_t$ and covariance process $\Sigma_{t \mid t-1}$ as
\[
m_t = E[x_t \mid y_{0:t-1}], \qquad \Sigma_{t \mid t-1} = E\big[(x_t - E[x_t \mid y_{0:t-1}])(x_t - E[x_t \mid y_{0:t-1}])^\top \mid y_{0:t-1}\big],
\]
and since the estimation error covariance does not depend on the realization $y_{0:t-1}$ (Lemma 6.2.2), we can rewrite it as $\Sigma_{t \mid t-1} = E[(x_t - E[x_t \mid y_{0:t-1}])(x_t - E[x_t \mid y_{0:t-1}])^\top]$.

Theorem (6.2.1)
The following holds:
\begin{align*}
m_{t+1} &= A m_t + A \Sigma_{t \mid t-1} C^\top \big(C \Sigma_{t \mid t-1} C^\top + V\big)^{-1} (y_t - C m_t), \\
\Sigma_{t+1 \mid t} &= A \Sigma_{t \mid t-1} A^\top + W - \big(A \Sigma_{t \mid t-1} C^\top\big)\big(C \Sigma_{t \mid t-1} C^\top + V\big)^{-1} \big(C \Sigma_{t \mid t-1} A^\top\big),
\end{align*}
with $m_0 = E[x_0]$, $\Sigma_{0 \mid -1} = E[x_0 x_0^\top]$.
\begin{proof}
\begin{align*}
m_{t+1} &= E[A x_t + w_t \mid y_{0:t}] = E[A x_t \mid y_{0:t}] \qquad \text{(since $w_t$ is zero mean and independent of $y_{0:t}$)} \\
&= E[A m_t + A (x_t - m_t) \mid y_{0:t}] \\
&= A m_t + E\big[A (x_t - m_t) \mid y_{0:t-1},\ y_t - E[y_t \mid y_{0:t-1}]\big] \\
&= A m_t + E\big[A (x_t - m_t) \mid y_{0:t-1}\big] + E\big[A (x_t - m_t) \mid y_t - E[y_t \mid y_{0:t-1}]\big] \qquad \text{(by Lemma 6.2.3)} \\
&= A m_t + E\big[A (x_t - m_t) \mid y_t - E[y_t \mid y_{0:t-1}]\big] \\
&= A m_t + E\big[A (x_t - m_t) \mid C x_t + v_t - E[C x_t + v_t \mid y_{0:t-1}]\big] \\
&= A m_t + E\big[A (x_t - m_t) \mid C (x_t - m_t) + v_t\big].
\end{align*}
Let $X = A (x_t - m_t)$ and $Y = y_t - E[y_t \mid y_{0:t-1}] = y_t - C m_t = C (x_t - m_t) + v_t$. Then, by Lemma 6.2.2 we have
\[
E[X \mid Y] = \Sigma_{XY} \Sigma_{YY}^{-1} Y,
\]
and thus,
\begin{align*}
m_{t+1} &= A m_t + A\, E\big[(x_t - m_t)(x_t - m_t)^\top\big] C^\top \Big(E\big[(C (x_t - m_t) + v_t)(C (x_t - m_t) + v_t)^\top\big]\Big)^{-1} (y_t - C m_t) \\
&= A m_t + A \Sigma_{t \mid t-1} C^\top \big(C\, E[(x_t - m_t)(x_t - m_t)^\top]\, C^\top + E[v_t v_t^\top]\big)^{-1} (y_t - C m_t) \\
&= A m_t + A \Sigma_{t \mid t-1} C^\top \big(C \Sigma_{t \mid t-1} C^\top + V\big)^{-1} (y_t - C m_t).
\end{align*}
Likewise,
\[
x_{t+1} - m_{t+1} = A (x_t - m_t) + w_t - A \Sigma_{t \mid t-1} C^\top \big(C \Sigma_{t \mid t-1} C^\top + V\big)^{-1} (y_t - C m_t),
\]
and computing $E[(x_{t+1} - m_{t+1})(x_{t+1} - m_{t+1})^\top]$ gives
\[
\Sigma_{t+1 \mid t} = A \Sigma_{t \mid t-1} A^\top + W - \big(A \Sigma_{t \mid t-1} C^\top\big)\big(C \Sigma_{t \mid t-1} C^\top + V\big)^{-1} \big(C \Sigma_{t \mid t-1} A^\top\big).
\]
\end{proof}
Define now
\[
\tilde m_t = E[x_t \mid y_{0:t}] = m_t + E[x_t - m_t \mid y_{0:t}].
\]
Following the analysis above we obtain
\[
\tilde m_t = m_t + E\big[x_t - m_t \mid y_{0:t-1}\big] + E\big[x_t - m_t \mid y_t - E[y_t \mid y_{0:t-1}]\big] = m_t + E\big[x_t - m_t \mid y_t - C m_t\big],
\]
since $E[x_t - m_t \mid y_{0:t-1}] = 0$. Note that we also have $m_t = A \tilde m_{t-1}$. The following then results:
Theorem (6.2.2)
The recursion for $\tilde m_t$ satisfies
\[
\tilde m_t = A \tilde m_{t-1} + \Sigma_{t \mid t-1} C^\top \big(C \Sigma_{t \mid t-1} C^\top + V\big)^{-1} \big(y_t - C A \tilde m_{t-1}\big),
\]
with $\tilde m_0 = E[x_0 \mid y_0]$.
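The recursions of Theorem 6.2.1 translate directly into code. Below is a minimal sketch of the predictor $m_t = E[x_t \mid y_{0:t-1}]$ run on a hypothetical two-state system; the matrices $A, C, W, V$, the horizon, and the seed are illustrative choices, not from the notes:

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative stable 2-state system with a scalar observation.
A = np.array([[0.9, 0.2], [0.0, 0.8]])
C = np.array([[1.0, 0.0]])
W = 0.1 * np.eye(2)    # process noise covariance
V = np.array([[0.5]])  # observation noise covariance

def kalman_predictor(ys, A, C, W, V, m0, S0):
    """Run the recursions of Theorem 6.2.1; returns the predictions
    m_{t+1} = E[x_{t+1} | y_{0:t}] and the final error covariance."""
    m, S = m0, S0
    ms = []
    for y in ys:
        innov_cov = C @ S @ C.T + V
        gain = A @ S @ C.T @ np.linalg.inv(innov_cov)
        m = A @ m + gain @ (y - C @ m)          # mean update
        S = A @ S @ A.T + W - gain @ (C @ S @ A.T)  # covariance update
        ms.append(m)
    return np.array(ms), S

# Simulate the control-free system x_{t+1} = A x_t + w_t, y_t = C x_t + v_t.
T = 500
x = np.zeros(2)
xs, ys = [], []
for _ in range(T):
    y = C @ x + rng.multivariate_normal(np.zeros(1), V)
    xs.append(x)
    ys.append(y)
    x = A @ x + rng.multivariate_normal(np.zeros(2), W)
xs = np.array(xs)

ms, S = kalman_predictor(ys, A, C, W, V, m0=np.zeros(2), S0=np.eye(2))

# The prediction-error covariance stays symmetric positive semidefinite.
assert np.allclose(S, S.T)
assert np.all(np.linalg.eigvalsh(S) >= -1e-10)
```

Since `ms[t]` predicts `xs[t + 1]`, the prediction error can be compared against the trivial zero predictor; the Riccati iteration for $\Sigma_{t+1 \mid t}$ runs independently of the observations, which is why it can also be precomputed offline.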