Let $X$ be a rv s.t. $X \in L^2$ and let $R$ be a positive definite matrix. The following holds:
$$\inf_{g \in M(Y)} E\big[(X - g(Y))^\top R (X - g(Y))\big] = E\big[(X - E[X \mid Y])^\top R (X - E[X \mid Y])\big]$$
where $M(Y)$ denotes the set of measurable functions of $Y$, and where the infimum is attained by $g^*(y) = E[X \mid Y = y]$ a.s.
\begin{proof} Not a fan of how they explain this, but for this proof it suffices to write $g(y) = E[X \mid Y = y] + h(y)$ and show that minimizing our expression forces $h(y) = 0$. So:
$$\begin{aligned}
E\big[(X - E[X \mid Y] - h(Y))^\top R (X - E[X \mid Y] - h(Y))\big]
&= E\big[(X - E[X \mid Y])^\top R (X - E[X \mid Y])\big] + E\big[h^\top(Y) R h(Y)\big] + 2 E\big[(X - E[X \mid Y])^\top R h(Y)\big] \\
&= E\big[(X - E[X \mid Y])^\top R (X - E[X \mid Y])\big] + E\big[h^\top(Y) R h(Y)\big] + 2 E\Big[E\big[(X - E[X \mid Y])^\top R h(Y) \mid Y\big]\Big] \\
&= E\big[(X - E[X \mid Y])^\top R (X - E[X \mid Y])\big] + E\big[h^\top(Y) R h(Y)\big] + 2 E\Big[E\big[(X - E[X \mid Y])^\top \mid Y\big] R h(Y)\Big] \\
&= E\big[(X - E[X \mid Y])^\top R (X - E[X \mid Y])\big] + E\big[h^\top(Y) R h(Y)\big] \\
&\geq E\big[(X - E[X \mid Y])^\top R (X - E[X \mid Y])\big]
\end{aligned}$$
So in order we: 1. multiply out; 2. apply the tower property of conditional expectation; 3. pull the $Y$-measurable factor $R h(Y)$ out of the inner conditional expectation; 4. note that $E[X - E[X \mid Y] \mid Y] = 0$ (the residual is orthogonal to any function of $Y$), so the cross term vanishes; 5. achieve the inequality, since $E[h^\top(Y) R h(Y)] \geq 0$ with equality iff $h(Y) = 0$ a.s. by positive definiteness of $R$. \end{proof}
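As a quick numerical sanity check (not from the original notes; the matrices $S$ and $R$ below are arbitrary illustrative choices), the following sketch samples a jointly Gaussian pair where $E[X \mid Y] = SY$ exactly, and confirms that perturbing the estimator by a nonzero $h(Y)$ strictly increases the weighted loss:

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 200_000

# Joint Gaussian pair: X = S @ Y + noise, so E[X | Y] = S @ Y exactly.
S = np.array([[0.8, -0.3], [0.2, 0.5]])
Y = rng.standard_normal((n_samples, 2))
X = Y @ S.T + 0.3 * rng.standard_normal((n_samples, 2))

R = np.array([[2.0, 0.5], [0.5, 1.0]])  # positive definite weight matrix

def weighted_loss(err):
    # Sample estimate of E[(X - g(Y))^T R (X - g(Y))]
    return np.mean(np.einsum('ti,ij,tj->t', err, R, err))

loss_opt = weighted_loss(X - Y @ S.T)             # g(Y) = E[X | Y]
loss_pert = weighted_loss(X - Y @ S.T - 0.1 * Y)  # h(Y) = 0.1 * Y != 0
print(loss_opt < loss_pert)                       # True: the conditional mean wins
```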
Now we consider the following system:
$$x_{t+1} = A x_t + B u_t + w_t, \qquad y_t = C x_t + v_t, \qquad w_t \sim N(0, W), \quad v_t \sim N(0, V)$$
where $x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$, $w \in \mathbb{R}^n$, $y \in \mathbb{R}^p$, $v \in \mathbb{R}^p$. The goal is to find the optimal cost
$$\inf_{\gamma \in \Gamma} J(\gamma, \mu_0)$$
where the cost is the quadratic cost of the state and the action,
$$J(\gamma, \mu_0) = E^{\gamma}_{\mu_0}\left[\sum_{t=0}^{N-1} x_t^\top Q x_t + u_t^\top R u_t + x_N^\top Q_N x_N\right]$$
with $R$ positive definite, $Q, Q_N$ positive semidefinite, and $\mu_0$ the prior on the state, which is assumed to be zero-mean Gaussian.
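To make the setup concrete, here is a minimal Monte Carlo sketch of evaluating $J$ under the (non-optimal) placeholder policy $u_t = 0$; all matrices are illustrative choices of mine, not values from the notes:

```python
import numpy as np

rng = np.random.default_rng(1)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
C = np.array([[1.0, 0.0]])
W, V = 0.01 * np.eye(2), 0.04 * np.eye(1)
Q, QN, R = np.eye(2), np.eye(2), np.eye(1)
N, runs = 50, 1000

total = 0.0
for _ in range(runs):
    x = rng.multivariate_normal(np.zeros(2), np.eye(2))      # zero-mean Gaussian prior
    for t in range(N):
        y = C @ x + rng.multivariate_normal(np.zeros(1), V)  # observation (unused by this policy)
        u = np.zeros(1)                                      # placeholder policy u_t = 0
        total += x @ Q @ x + u @ R @ u                       # stage cost
        x = A @ x + B @ u + rng.multivariate_normal(np.zeros(2), W)
    total += x @ QN @ x                                      # terminal cost
print("Monte Carlo estimate of J:", total / runs)
```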
Control-Free Setup
We consider the control-free setup here.
Let $X, Y$ be zero-mean Gaussian vectors. Then:
1. $E[X \mid Y = y]$ is linear in $y$:
$$E[X \mid Y = y] = \Sigma_{XY} \Sigma_{YY}^{-1} y$$
2. We have that
$$E\big[(X - E[X \mid Y])(X - E[X \mid Y])^\top\big] = \Sigma_{XX} - \Sigma_{XY} \Sigma_{YY}^{-1} \Sigma_{XY}^\top =: D$$
In particular, $E\big[(X - E[X \mid Y])(X - E[X \mid Y])^\top \mid Y = y\big]$ does not depend on the realization $y$ of $Y$ and is equal to $D$.
\begin{proof} $(X, Y)$ is jointly Gaussian and admits densities, with $p(x \mid y) = \frac{p(x, y)}{p(y)}$. With
$$K_{XY} := E\left[\begin{bmatrix} X \\ Y \end{bmatrix}\begin{bmatrix} X^\top & Y^\top \end{bmatrix}\right] = \begin{bmatrix} \Sigma_{XX} & \Sigma_{XY} \\ \Sigma_{YX} & \Sigma_{YY} \end{bmatrix}, \qquad K_{XY}^{-1} = \begin{bmatrix} \Psi_{XX} & \Psi_{XY} \\ \Psi_{YX} & \Psi_{YY} \end{bmatrix}$$
we have that
$$p(x, y) = \frac{1}{(2\pi)^{\frac{n+m}{2}} |K_{XY}|^{\frac{1}{2}}}\, e^{-\frac{1}{2} \begin{bmatrix} x^\top & y^\top \end{bmatrix} K_{XY}^{-1} \begin{bmatrix} x \\ y \end{bmatrix}}$$
Then, dividing by $p(y) = \frac{1}{(2\pi)^{\frac{m}{2}} |K_{YY}|^{\frac{1}{2}}}\, e^{-\frac{1}{2} y^\top K_{YY}^{-1} y}$, we have
$$\begin{aligned}
p(x \mid y) &= C\, e^{\frac{1}{2} y^\top K_{YY}^{-1} y}\, e^{-\frac{1}{2} \begin{bmatrix} x^\top & y^\top \end{bmatrix} \begin{bmatrix} \Psi_{XX} & \Psi_{XY} \\ \Psi_{YX} & \Psi_{YY} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}} \\
&= C\, e^{\frac{1}{2} y^\top K_{YY}^{-1} y}\, e^{-\frac{1}{2}\left(x^\top \Psi_{XX} x + 2 x^\top \Psi_{XY} y + y^\top \Psi_{YY} y\right)} \\
&= C\, e^{-\frac{1}{2}\left(x^\top \Psi_{XX} x + 2 x^\top \Psi_{XY} y + y^\top \Psi_{YY} y - y^\top K_{YY}^{-1} y\right)}
\end{aligned}$$
where $C$ absorbs the normalizing constants. Now, looking at the expression in the exponent, we can apply a completion-of-squares argument:
$$x^\top \Psi_{XX} x + 2 x^\top \Psi_{XY} y + y^\top \Psi_{YY} y - y^\top K_{YY}^{-1} y = (x + \Psi_{XX}^{-1} \Psi_{XY} y)^\top \Psi_{XX} (x + \Psi_{XX}^{-1} \Psi_{XY} y) + Q(y) = (x - H y)^\top D^{-1} (x - H y) + Q(y)$$
where $H := -\Psi_{XX}^{-1} \Psi_{XY}$, $Q(y)$ collects the remaining $y$-only terms, and $\Psi_{XX} = D^{-1}$ by the Schur-complement identity for block inverses, $\Psi_{XX}^{-1} = \Sigma_{XX} - \Sigma_{XY} \Sigma_{YY}^{-1} \Sigma_{YX}$. Then, we observe the following:
$$\begin{bmatrix} \Psi_{XX} & \Psi_{XY} \\ \Psi_{YX} & \Psi_{YY} \end{bmatrix} \begin{bmatrix} \Sigma_{XX} & \Sigma_{XY} \\ \Sigma_{YX} & \Sigma_{YY} \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix}$$
which gives us that $\Psi_{XX} \Sigma_{XY} + \Psi_{XY} \Sigma_{YY} = 0$, therefore
$$\Sigma_{XY} = -\Psi_{XX}^{-1} \Psi_{XY} \Sigma_{YY} \implies \Sigma_{XY} \Sigma_{YY}^{-1} = -\Psi_{XX}^{-1} \Psi_{XY}$$
allowing us to re-express $H = \Sigma_{XY} \Sigma_{YY}^{-1}$ and leaving us with the resultant conditional density
$$p(x \mid y) = C\, e^{-\frac{1}{2} Q(y)}\, e^{-\frac{1}{2}(x - \Sigma_{XY} \Sigma_{YY}^{-1} y)^\top \Psi_{XX} (x - \Sigma_{XY} \Sigma_{YY}^{-1} y)}$$
giving us the first claim: the conditional mean $E[X \mid Y = y] = \Sigma_{XY} \Sigma_{YY}^{-1} y$ is linear in $y$.
Finally, since $\int p(x \mid y)\, dx = 1$ we necessarily have that
$$C\, e^{-\frac{1}{2} Q(y)} = \frac{1}{(2\pi)^{\frac{n}{2}} |D|^{\frac{1}{2}}}$$
which is in fact independent of $y$. Hence the conditional error covariance equals $D$ regardless of the realization $y$:
$$E\big[(X - E[X \mid Y])(X - E[X \mid Y])^\top \mid Y = y\big] = E\big[(X - E[X \mid Y])(X - E[X \mid Y])^\top\big] = D$$
\end{proof}
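The two claims are easy to check numerically. Below is a small sketch (the joint covariance $K$ is an arbitrary illustrative choice of mine) that samples a jointly Gaussian $(X, Y)$ and compares the empirical residual covariance of $X - \Sigma_{XY}\Sigma_{YY}^{-1}Y$ against the closed form $D$:

```python
import numpy as np

rng = np.random.default_rng(2)
L = rng.standard_normal((4, 4))
K = L @ L.T + 0.5 * np.eye(4)       # joint covariance of (X, Y), each in R^2
Sxx, Sxy, Syy = K[:2, :2], K[:2, 2:], K[2:, 2:]

Z = rng.multivariate_normal(np.zeros(4), K, size=500_000)
X, Y = Z[:, :2], Z[:, 2:]

H = Sxy @ np.linalg.inv(Syy)        # gain of the conditional mean: E[X|Y=y] = H y
resid = X - Y @ H.T                 # X - E[X|Y]
D = Sxx - H @ Sxy.T                 # claimed error covariance
print(np.allclose(np.cov(resid, rowvar=False), D, atol=5e-2))  # True up to MC error
```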
To derive the Kalman filter the following two lemmas are required:
>[!lem|6.2.3]
>If $E[X] = 0$ and $Z_1, Z_2$ are orthogonal, zero-mean Gaussian processes (with $E[Z_1 Z_2^\top] = 0$), then
>$$E[X \mid Z_1 = z_1, Z_2 = z_2] = E[X \mid Z_1 = z_1] + E[X \mid Z_2 = z_2]$$
\begin{proof}
$$\begin{aligned}
E[X \mid (Z_1, Z_2) = (z_1, z_2)] &= E\left[X \begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix}^\top\right] \left(E\left[\begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix} \begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix}^\top\right]\right)^{-1} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} \\
&= \begin{bmatrix} E[X Z_1^\top] & E[X Z_2^\top] \end{bmatrix} \begin{bmatrix} E[Z_1 Z_1^\top] & E[Z_1 Z_2^\top] \\ E[Z_2 Z_1^\top] & E[Z_2 Z_2^\top] \end{bmatrix}^{-1} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} \\
&= E[X Z_1^\top] \big(E[Z_1 Z_1^\top]\big)^{-1} z_1 + E[X Z_2^\top] \big(E[Z_2 Z_2^\top]\big)^{-1} z_2 \\
&= E[X \mid Z_1 = z_1] + E[X \mid Z_2 = z_2]
\end{aligned}$$
where the third equality is due to orthogonality (the off-diagonal blocks vanish, so the inverse is block diagonal) and the final one is due to the linearity of the conditional mean for zero-mean Gaussians from the lemma above. \end{proof}
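A scalar toy check of the lemma (my own example, not from the notes): take independent, hence orthogonal, zero-mean $Z_1, Z_2$ and verify that the joint conditional-mean gain splits into the two marginal gains:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500_000
Z1 = rng.standard_normal(n)                 # independent => orthogonal, zero mean
Z2 = 2.0 * rng.standard_normal(n)
X = 0.5 * Z1 - 0.3 * Z2 + 0.1 * rng.standard_normal(n)

# Marginal gains: E[X | Zi = zi] = (E[X Zi] / E[Zi^2]) zi in the scalar case
g1 = np.mean(X * Z1) / np.mean(Z1**2)
g2 = np.mean(X * Z2) / np.mean(Z2**2)

# Joint gain: E[X | Z1, Z2] via E[X Z^T] (E[Z Z^T])^{-1}
Z = np.stack([Z1, Z2], axis=1)
joint = np.mean(X[:, None] * Z, axis=0) @ np.linalg.inv(Z.T @ Z / n)
print(np.allclose(joint, [g1, g2], atol=5e-3))  # True: the estimates add up
```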
and
$E\big[(X - E[X \mid Y])(X - E[X \mid Y])^\top\big]$ is given by $D$ from above.
\begin{proof} First note that
$$E\big[X (E[X \mid Y])^\top\big] = E\big[(X - E[X \mid Y] + E[X \mid Y]) (E[X \mid Y])^\top\big] = E\big[E[X \mid Y] (E[X \mid Y])^\top\big]$$
since $X - E[X \mid Y]$ is orthogonal to $E[X \mid Y]$ (which we know by another conditional expectation argument). Then
$$\begin{aligned}
E\big[(X - E[X \mid Y])(X - E[X \mid Y])^\top\big] &= E[X X^\top] - 2 E\big[X (E[X \mid Y])^\top\big] + E\big[E[X \mid Y] (E[X \mid Y])^\top\big] \\
&= E[X X^\top] - E\big[E[X \mid Y] (E[X \mid Y])^\top\big] \\
&= \Sigma_{XX} - E\big[\Sigma_{XY} \Sigma_{YY}^{-1} Y Y^\top (\Sigma_{YY}^{-1})^\top \Sigma_{XY}^\top\big] \\
&= \Sigma_{XX} - \Sigma_{XY} \Sigma_{YY}^{-1} \Sigma_{XY}^\top
\end{aligned}$$
where the last step uses $E[Y Y^\top] = \Sigma_{YY}$. \end{proof}
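The orthogonality step used above is also easy to test empirically. A minimal sketch (with an illustrative linear model of my choosing, where $E[X \mid Y] = YM$ row-wise):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500_000
Y = rng.standard_normal((n, 2))
M = np.array([[0.7, 0.1], [-0.2, 0.4]])
X = Y @ M + 0.2 * rng.standard_normal((n, 2))  # E[X | Y] = Y @ M by construction

E_hat = Y @ M                                  # conditional means, one per sample
lhs = X.T @ E_hat / n                          # sample E[X (E[X|Y])^T]
rhs = E_hat.T @ E_hat / n                      # sample E[E[X|Y] (E[X|Y])^T]
print(np.allclose(lhs, rhs, atol=1e-2))        # True: residual is orthogonal
```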
Now, consider the following system without control:
$$x_{t+1} = A x_t + w_t, \qquad y_t = C x_t + v_t, \qquad w_t \overset{\text{iid}}{\sim} N(0, W), \quad v_t \overset{\text{iid}}{\sim} N(0, V)$$
where $x \in \mathbb{R}^n$, $w \in \mathbb{R}^n$, $y \in \mathbb{R}^p$, $v \in \mathbb{R}^p$. Define the mean process $m_t$ and covariance process $\Sigma_{t \mid t-1}$ as
$$m_t = E[x_t \mid y_{0:t-1}], \qquad \Sigma_{t \mid t-1} = E\big[(x_t - E[x_t \mid y_{0:t-1}])(x_t - E[x_t \mid y_{0:t-1}])^\top \mid y_{0:t-1}\big]$$
and since the estimation error covariance does not depend on the realization $y_{0:t-1}$ we can rewrite it as $\Sigma_{t \mid t-1} = E\big[(x_t - E[x_t \mid y_{0:t-1}])(x_t - E[x_t \mid y_{0:t-1}])^\top\big]$.
>[!thm|6.2.1]
>The following holds:
>$$\begin{aligned} m_{t+1} &= A m_t + A \Sigma_{t \mid t-1} C^\top (C \Sigma_{t \mid t-1} C^\top + V)^{-1} (y_t - C m_t) \\ \Sigma_{t+1 \mid t} &= A \Sigma_{t \mid t-1} A^\top + W - (A \Sigma_{t \mid t-1} C^\top)(C \Sigma_{t \mid t-1} C^\top + V)^{-1}(C \Sigma_{t \mid t-1} A^\top) \end{aligned}$$
>with $m_0 = E[x_0]$ and $\Sigma_{0 \mid -1} = E[x_0 x_0^\top]$.
\begin{proof}
$$\begin{aligned}
m_{t+1} &= E[A x_t + w_t \mid y_{0:t}] = E[A x_t \mid y_{0:t}] \\
&= E[A m_t + A(x_t - m_t) \mid y_{0:t}] \\
&= A m_t + E\big[A(x_t - m_t) \mid y_{0:t-1},\, y_t - E[y_t \mid y_{0:t-1}]\big] \\
&= A m_t + E\big[A(x_t - m_t) \mid y_{0:t-1}\big] + E\big[A(x_t - m_t) \mid y_t - E[y_t \mid y_{0:t-1}]\big] && \text{by Lemma 6.2.3} \\
&= A m_t + E\big[A(x_t - m_t) \mid y_t - E[y_t \mid y_{0:t-1}]\big] && \text{since } E[x_t - m_t \mid y_{0:t-1}] = 0 \\
&= A m_t + E\big[A(x_t - m_t) \mid C x_t + v_t - E[C x_t + v_t \mid y_{0:t-1}]\big] \\
&= A m_t + E\big[A(x_t - m_t) \mid C(x_t - m_t) + v_t\big]
\end{aligned}$$
Let $X = A(x_t - m_t)$ and $Y = y_t - E[y_t \mid y_{0:t-1}] = y_t - C m_t = C(x_t - m_t) + v_t$. Then, by the Gaussian conditioning lemma above we have
$$E[X \mid Y] = \Sigma_{XY} \Sigma_{YY}^{-1} Y$$
and thus,
$$\begin{aligned}
m_{t+1} &= A m_t + A E\big[(x_t - m_t)(x_t - m_t)^\top\big] C^\top \Big(E\big[(C(x_t - m_t) + v_t)(C(x_t - m_t) + v_t)^\top\big]\Big)^{-1} (y_t - C m_t) \\
&= A m_t + A \Sigma_{t \mid t-1} C^\top \big(C E[(x_t - m_t)(x_t - m_t)^\top] C^\top + E[v_t v_t^\top]\big)^{-1} (y_t - C m_t) \\
&= A m_t + A \Sigma_{t \mid t-1} C^\top (C \Sigma_{t \mid t-1} C^\top + V)^{-1} (y_t - C m_t)
\end{aligned}$$
Likewise,
$$x_{t+1} - m_{t+1} = A(x_t - m_t) + w_t - A \Sigma_{t \mid t-1} C^\top (C \Sigma_{t \mid t-1} C^\top + V)^{-1} (y_t - C m_t)$$
and computing the covariance of this error,
$$\Sigma_{t+1 \mid t} = \ldots = A \Sigma_{t \mid t-1} A^\top + W - (A \Sigma_{t \mid t-1} C^\top)(C \Sigma_{t \mid t-1} C^\top + V)^{-1}(C \Sigma_{t \mid t-1} A^\top)$$
\end{proof}
Define now
$$\tilde{m}_t = E[x_t \mid y_{0:t}] = m_t + E[x_t - m_t \mid y_{0:t}].$$
Following the analysis above we obtain
$$\tilde{m}_t = m_t + E\big[x_t - m_t \mid y_{0:t-1}\big] + E\big[x_t - m_t \mid y_t - E[y_t \mid y_{0:t-1}]\big]$$
Note that we also have $m_t = A \tilde{m}_{t-1}$. The following then results:
The recursions for $\tilde{m}_t$ satisfy
$$\tilde{m}_t = A \tilde{m}_{t-1} + \Sigma_{t \mid t-1} C^\top (C \Sigma_{t \mid t-1} C^\top + V)^{-1} (y_t - C A \tilde{m}_{t-1})$$
with $\tilde{m}_0 = E[x_0 \mid y_0]$.
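The predictor recursions of Theorem 6.2.1 translate directly into code. Below is a minimal numpy sketch (the function name `kalman_predictor` and all system matrices are my own illustrative choices; the notes' $A, C, W, V$ come from whatever model is at hand), initialized with $m_0 = E[x_0]$ and $\Sigma_{0 \mid -1} = E[x_0 x_0^\top]$ as in the theorem:

```python
import numpy as np

def kalman_predictor(A, C, W, V, m0, S0, ys):
    """One-step predictor: returns m_t = E[x_t | y_{0:t-1}] and Sigma_{t|t-1}."""
    m, S = m0, S0
    for y in ys:
        K = A @ S @ C.T @ np.linalg.inv(C @ S @ C.T + V)  # predictor gain
        m = A @ m + K @ (y - C @ m)                       # mean recursion
        S = A @ S @ A.T + W - K @ (C @ S @ A.T)           # covariance recursion
    return m, S

# Example: simulate the control-free model, then predict one step ahead.
rng = np.random.default_rng(5)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
C = np.array([[1.0, 0.0]])
W, V = 0.01 * np.eye(2), 0.04 * np.eye(1)

x, ys = rng.multivariate_normal(np.zeros(2), np.eye(2)), []
for _ in range(100):
    ys.append(C @ x + rng.multivariate_normal(np.zeros(1), V))
    x = A @ x + rng.multivariate_normal(np.zeros(2), W)

m, S = kalman_predictor(A, C, W, V, np.zeros(2), np.eye(2), ys)
print(m, x)  # m = E[x_100 | y_{0:99}] should track the true x_100
```

The filter form $\tilde{m}_t$ above follows from the same loop by using the gain $\Sigma_{t \mid t-1} C^\top (C \Sigma_{t \mid t-1} C^\top + V)^{-1}$ without the leading $A$, consistent with $m_t = A \tilde{m}_{t-1}$.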