Kalman Filter

Lemma (6.2.1)

Let $X$ be a random vector such that $X\in L^{2}$ and let $R$ be a positive definite matrix. The following holds:
$$\inf_{g\in \mathbb{M}(\mathbb{Y})}\mathbb{E}\left[ (X-g(Y))^{\top}R(X-g(Y)) \right]=\mathbb{E}\left[ (X-\mathbb{E}[X\mid Y])^{\top}R(X-\mathbb{E}[X\mid Y]) \right]$$
where $\mathbb{M}(\mathbb{Y})$ denotes the set of measurable functions from $\mathbb{Y}$ to $\mathbb{R}^{n}$, and the infimum is attained by $g(y)=\mathbb{E}[X\mid Y=y]$ a.s.

\begin{proof} It suffices to write $g(y)=\mathbb{E}[X\mid Y=y]+h(y)$ and show that the expression is minimized only when $h(y)=0$. So:
$$\begin{align*} &\mathbb{E}[(X-\mathbb{E}[X\mid Y]-h(Y))^{\top}R(X-\mathbb{E}[X\mid Y]-h(Y))]\\ &= \mathbb{E}[(X-\mathbb{E}[X\mid Y])^{\top}R(X-\mathbb{E}[X\mid Y])]+\mathbb{E}[h(Y)^{\top}Rh(Y)]+2\mathbb{E}[(X-\mathbb{E}[X\mid Y])^{\top}Rh(Y)]\\ &= \mathbb{E}[(X-\mathbb{E}[X\mid Y])^{\top}R(X-\mathbb{E}[X\mid Y])]+\mathbb{E}[h(Y)^{\top}Rh(Y)]+2\mathbb{E}\left[\mathbb{E}[(X-\mathbb{E}[X\mid Y])^{\top}Rh(Y)\mid Y]\right]\\ &= \mathbb{E}[(X-\mathbb{E}[X\mid Y])^{\top}R(X-\mathbb{E}[X\mid Y])]+\mathbb{E}[h(Y)^{\top}Rh(Y)]+2\mathbb{E}\left[\mathbb{E}[(X-\mathbb{E}[X\mid Y])^{\top}\mid Y]Rh(Y)\right]\\ &= \mathbb{E}[(X-\mathbb{E}[X\mid Y])^{\top}R(X-\mathbb{E}[X\mid Y])]+\mathbb{E}[h(Y)^{\top}Rh(Y)]\\ &\ge \mathbb{E}[(X-\mathbb{E}[X\mid Y])^{\top}R(X-\mathbb{E}[X\mid Y])] \end{align*}$$
So in order we:

  1. Expand the quadratic form.
  2. Apply iterated expectation to the cross term.
  3. Pull the $Y$-measurable factor $Rh(Y)$ out of the conditional expectation.
  4. Note that $\mathbb{E}[X-\mathbb{E}[X\mid Y]\mid Y]=0$ (the residual is orthogonal to anything $Y$-measurable), so the cross term vanishes.
  5. Achieve the inequality, since $\mathbb{E}[h(Y)^{\top}Rh(Y)]\ge 0$ with equality iff $h(Y)=0$ a.s. ($R$ being positive definite). \end{proof}
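Lemma 6.2.1 can be illustrated numerically. The following is a minimal sketch (assuming NumPy; the joint covariance, the weight $R$, and the perturbation $h$ are arbitrary illustrative choices, not values from the text): for jointly Gaussian $(X,Y)$ the conditional mean $g(y)=\Sigma_{XY}\Sigma_{YY}^{-1}y$ minimizes the $R$-weighted loss, and any perturbation $h(Y)$ increases it by $\mathbb{E}[h(Y)^{\top}Rh(Y)]$.

```python
import numpy as np

# Monte Carlo sketch of Lemma 6.2.1: the conditional mean minimizes the
# R-weighted quadratic loss. All matrices below are arbitrary choices.
rng = np.random.default_rng(0)
n, m, N = 2, 2, 200_000

M = rng.normal(size=(n + m, n + m))
K = M @ M.T + (n + m) * np.eye(n + m)       # joint covariance of (X, Y)
S_xx, S_xy, S_yy = K[:n, :n], K[:n, n:], K[n:, n:]

Z = rng.multivariate_normal(np.zeros(n + m), K, size=N)
X, Y = Z[:, :n], Z[:, n:]

R = np.diag([1.0, 3.0])                     # positive definite weight
G = S_xy @ np.linalg.inv(S_yy)              # gain of E[X | Y = y] = G y

def loss(pred):
    E = X - pred
    # per-sample e^T R e, averaged over the samples
    return np.mean(np.einsum('ij,jk,ik->i', E, R, E))

opt = loss(Y @ G.T)                 # g(y) = E[X | Y = y]
pert = loss(Y @ G.T + 0.5 * Y)      # g(y) + h(y), with h(y) = 0.5 y

print(opt < pert)   # True: the perturbation strictly increases the loss
```

The optimal loss itself estimates $\operatorname{tr}(RD)$ with $D$ the conditional covariance from Lemma 6.2.2, which makes for a convenient cross-check.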

Now we consider the following system:
$$\begin{align*} x_{t+1}&= Ax_{t}+Bu_{t}+w_{t}, & w_{t}&\sim \mathcal{N}(0,W)\\ y_{t}&= Cx_{t}+v_{t}, & v_{t}&\sim \mathcal{N}(0,V) \end{align*}$$
where $x\in \mathbb{R}^{n}$, $u\in \mathbb{R}^{m}$, $w\in \mathbb{R}^{n}$, $y\in \mathbb{R}^{p}$, $v\in \mathbb{R}^{p}$. The goal is to find the optimal cost $\inf_{\gamma \in\Gamma}J(\gamma,\mu_{0})$, where the cost is the quadratic cost of the state and the action,
$$J(\mu_{0},\gamma)=\mathbb{E}_{\mu_{0}}^{\gamma}\left[ \sum_{t=0}^{N-1}\left(x_{t}^{\top}Qx_{t}+u_{t}^{\top}Ru_{t}\right)+x_{N}^{\top}Q_{N}x_{N} \right]$$
with $R$ positive definite, $Q,Q_{N}$ positive semidefinite, and $\mu_{0}$ the prior on the state, which is assumed to be zero-mean Gaussian.
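To make the cost concrete, $J(\mu_{0},\gamma)$ can be estimated by Monte Carlo for a fixed policy. The sketch below (assuming NumPy; $A$, $B$, $W$, $Q$, $Q_{N}$, $R$, the feedback gain $K$, and the horizon are all arbitrary illustrative choices) simulates the system under a linear policy $u_{t}=Kx_{t}$ and averages the realized quadratic cost.

```python
import numpy as np

# Monte Carlo estimate of the quadratic cost J for one fixed linear policy
# u_t = K x_t. All matrices and the horizon are arbitrary choices.
rng = np.random.default_rng(0)

A = np.array([[1.0, 0.2], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
W = 0.05 * np.eye(2)                  # process noise covariance
Q, QN = np.eye(2), np.eye(2)          # state cost weights
R = np.array([[1.0]])                 # action cost weight
K = np.array([[-0.5, -1.0]])          # a stabilizing feedback gain
N, trials = 50, 2000

costs = []
for _ in range(trials):
    x = rng.multivariate_normal(np.zeros(2), np.eye(2))   # zero-mean prior
    J = 0.0
    for t in range(N):
        u = K @ x
        J += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u + rng.multivariate_normal(np.zeros(2), W)
    J += x @ QN @ x                   # terminal cost
    costs.append(J)

print(np.mean(costs))   # Monte Carlo estimate of J(mu_0, gamma)
```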


Control-Free Setup

We consider the control-free setup here.

Remark

A Gaussian measure with mean $\mu$ and covariance matrix $\Sigma_{XX}$ has the following density:
$$p(x)= \frac{1}{(2\pi)^{n/2}|\Sigma_{XX}|^{1/2}}\exp\left( - \frac{1}{2}(x-\mu)^{\top}\Sigma_{XX}^{-1}(x-\mu) \right)$$

Lemma (6.2.2)

Let $X,Y$ be zero-mean Gaussian vectors. Then,

  1. $\mathbb{E}[X\mid Y=y]$ is linear in $y$: $\mathbb{E}[X\mid Y=y]=\Sigma_{XY}\Sigma_{YY}^{-1}y$; and
  2. we have that $\mathbb{E}[(X-\mathbb{E}[X\mid Y])(X-\mathbb{E}[X\mid Y])^{\top}]=\Sigma_{XX}-\Sigma_{XY}\Sigma_{YY}^{-1}\Sigma_{XY}^{\top}=:D$

In particular, $\mathbb{E}[(X-\mathbb{E}[X\mid Y])(X-\mathbb{E}[X\mid Y])^{\top}\mid Y=y]$ does not depend on the realization $y$ of $Y$ and is equal to $D$.

\begin{proof} $(X,Y)$ is jointly Gaussian and admits a density, with $p(x\mid y)= \frac{p(x,y)}{p(y)}$. With
$$K_{XY}:=\mathbb{E}\left[\begin{bmatrix}X\\Y\end{bmatrix}\begin{bmatrix}X^{\top}&Y^{\top}\end{bmatrix}\right]$$
we have that
$$K_{XY}=\begin{bmatrix}\Sigma_{XX} & \Sigma_{XY}\\\Sigma_{YX} & \Sigma_{YY}\end{bmatrix},\quad K_{XY}^{-1}=\begin{bmatrix}\Psi_{XX} & \Psi_{XY}\\\Psi_{YX} & \Psi_{YY}\end{bmatrix}$$
Then,
$$p(x,y)=\frac{1}{(2\pi)^{\frac{n+m}{2}}|K_{XY}|^{\frac{1}{2}}}\exp\left(-\frac{1}{2}\begin{bmatrix}x\\y\end{bmatrix}^{\top}K_{XY}^{-1}\begin{bmatrix}x\\y\end{bmatrix}\right)$$
and therefore
$$\begin{align*} p(x\mid y)&= \frac{1}{(2\pi)^{\frac{n+m}{2}}|K_{XY}|^{\frac{1}{2}}}e^{- \frac{1}{2}\left( \begin{bmatrix}x^{\top}&y^{\top}\end{bmatrix}K_{XY}^{-1}\begin{bmatrix}x\\y\end{bmatrix} \right) }\cdot \left( \frac{1}{(2\pi)^{\frac{m}{2}}|\Sigma_{YY}|^{\frac{1}{2}}}e^{-\frac{1}{2}y^{\top}\Sigma_{YY}^{-1}y} \right)^{-1} \\ &= C \frac{e^{-\frac{1}{2}\left( \begin{bmatrix}x^{\top}&y^{\top}\end{bmatrix}\begin{bmatrix}\Psi_{XX} & \Psi_{XY}\\\Psi_{YX} & \Psi_{YY}\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix} \right) }}{e^{-\frac{1}{2}y^{\top}\Sigma_{YY}^{-1}y}}\\ &= C \frac{e^{-\frac{1}{2}\left( x^{\top}\Psi_{XX}x+2x^{\top}\Psi_{XY}y+y^{\top}\Psi_{YY}y \right) }}{e^{-\frac{1}{2}y^{\top}\Sigma_{YY}^{-1}y}}\\ &= C e^{-\frac{1}{2}\left( x^{\top}\Psi_{XX}x+2x^{\top}\Psi_{XY}y+y^{\top}\Psi_{YY}y -y^{\top}\Sigma_{YY}^{-1}y\right) } \end{align*}$$
Now, looking at the expression in the exponent, we can apply a completion-of-squares argument:
$$\begin{align*} &x^{\top}\Psi_{XX}x+2x^{\top}\Psi_{XY}y+y^{\top}\Psi_{YY}y -y^{\top}\Sigma_{YY}^{-1}y\\ &= (x+\Psi_{XX}^{-1}\Psi_{XY}y)^{\top}\Psi_{XX}(x+\Psi_{XX}^{-1}\Psi_{XY}y)+Q(y)\\ &= (x-Hy)^{\top}D^{-1}(x-Hy)+Q(y) \end{align*}$$
with $H:=-\Psi_{XX}^{-1}\Psi_{XY}$, where $Q(y)$ collects the terms in $y$ alone and where $D^{-1}=\Psi_{XX}$ by the block-inverse (Schur complement) identity $\Psi_{XX}=(\Sigma_{XX}-\Sigma_{XY}\Sigma_{YY}^{-1}\Sigma_{YX})^{-1}$. Then, we observe the following:
$$\begin{bmatrix}\Psi_{XX} & \Psi_{XY}\\\Psi_{YX} & \Psi_{YY}\end{bmatrix}\cdot\begin{bmatrix}\Sigma_{XX} & \Sigma_{XY}\\\Sigma_{YX} & \Sigma_{YY}\end{bmatrix}=\begin{bmatrix}I&0\\0&I\end{bmatrix}$$
which gives us that $\Psi_{XX}\Sigma_{XY}+\Psi_{XY}\Sigma_{YY}=0$, therefore
$$\begin{align*} \Sigma_{XY}&=-\Psi_{XX}^{-1}\Psi_{XY}\Sigma_{YY}\\ \implies\Sigma_{XY}\Sigma_{YY}^{-1}&=-\Psi_{XX}^{-1}\Psi_{XY} \end{align*}$$
allowing us to re-express $H$ and leaving us with the resultant conditional density
$$p(x\mid y)=Ce^{-\frac{1}{2}Q(y)}e^{-\frac{1}{2}(x-\Sigma_{XY}\Sigma_{YY}^{-1}y)^{\top}\Psi_{XX}(x-\Sigma_{XY}\Sigma_{YY}^{-1}y)}$$
giving us the first claim.

Finally, since $\int p(x\mid y) \, dx=1$ we necessarily have that
$$Ce^{-\frac{1}{2}Q(y)}=\frac{1}{(2\pi)^{\frac{n}{2}}|D|^{\frac{1}{2}}}$$
which is in fact independent of $y$. Then, we finally have that $D$, which does not depend on $y$, satisfies
$$\mathbb{E}[(X-\mathbb{E}[X\mid Y])(X-\mathbb{E}[X\mid Y])^{\top}\mid Y=y]=\mathbb{E}[(X-\mathbb{E}[X\mid Y])(X-\mathbb{E}[X\mid Y])^{\top}]$$

\end{proof}
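The two matrix identities driving this proof are easy to check numerically. The sketch below (assuming NumPy; the joint covariance is an arbitrary positive definite choice) verifies that $\Sigma_{XY}\Sigma_{YY}^{-1}=-\Psi_{XX}^{-1}\Psi_{XY}$ and that $D=\Sigma_{XX}-\Sigma_{XY}\Sigma_{YY}^{-1}\Sigma_{XY}^{\top}$ equals $\Psi_{XX}^{-1}$.

```python
import numpy as np

# Check the block-inverse identities from the proof of Lemma 6.2.2 on a
# random positive definite joint covariance (an arbitrary choice).
rng = np.random.default_rng(1)
n, m = 3, 2

M = rng.normal(size=(n + m, n + m))
K = M @ M.T + (n + m) * np.eye(n + m)           # joint covariance K_XY
S_xx, S_xy, S_yy = K[:n, :n], K[:n, n:], K[n:, n:]

Psi = np.linalg.inv(K)                          # blocks Psi_XX, Psi_XY, ...
P_xx, P_xy = Psi[:n, :n], Psi[:n, n:]

H = S_xy @ np.linalg.inv(S_yy)                  # conditional-mean gain
D = S_xx - H @ S_xy.T                           # conditional covariance

print(np.allclose(H, -np.linalg.inv(P_xx) @ P_xy))   # True
print(np.allclose(D, np.linalg.inv(P_xx)))           # True
```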

To derive the Kalman filter the following two lemmas are required:

Lemma (6.2.3)

If $\mathbb{E}[X]=0$ and $Z_{1},Z_{2}$ are orthogonal, zero-mean Gaussian vectors (with $\mathbb{E}[Z_{1}Z_{2}^{\top}]=0$), then
$$\mathbb{E}[X\mid Z_{1}=z_{1},Z_{2}=z_{2}]=\mathbb{E}[X\mid Z_{1}=z_{1}]+\mathbb{E}[X\mid Z_{2}=z_{2}]$$

\begin{proof}
$$\begin{align*} &\mathbb{E}[X\mid(Z_{1},Z_{2})=(z_{1},z_{2})]\\ &= \mathbb{E}\left[X\begin{bmatrix}Z_{1}\\Z_{2}\end{bmatrix}^{\top }\right]\mathbb{E}\left[ \begin{bmatrix}Z_{1}\\Z_{2}\end{bmatrix}\begin{bmatrix}Z_{1}\\Z_{2} \end{bmatrix}^{\top} \right] ^{-1}\begin{bmatrix}z_{1}\\z_{2}\end{bmatrix}\\ &= \begin{bmatrix}\mathbb{E}[XZ_{1}^{\top}]&\mathbb{E}[XZ_{2}^{\top}]\end{bmatrix}\begin{bmatrix}\mathbb{E}[Z_{1}Z_{1}^{\top}] & \mathbb{E}[Z_{1}Z_{2}^{\top}]\\\mathbb{E}[Z_{2}Z_{1}^{\top}] & \mathbb{E}[Z_{2}Z_{2}^{\top}]\end{bmatrix}^{-1}\begin{bmatrix}z_{1}\\z_{2}\end{bmatrix}\\ &= \mathbb{E}[XZ_{1}^{\top}](\mathbb{E}[Z_{1}Z_{1}^{\top}])^{-1}z_{1}+\mathbb{E}[XZ_{2}^{\top}](\mathbb{E}[Z_{2}Z_{2}^{\top}])^{-1}z_{2}\\ &= \mathbb{E}[X\mid Z_{1}=z_{1}]+\mathbb{E}[X\mid Z_{2}=z_{2}] \end{align*}$$
where the first equality is due to Lemma 6.2.2, the third is due to orthogonality (the off-diagonal blocks vanish, making the covariance block diagonal), and the final one is due to Lemma 6.2.2 again. \end{proof}

and

Lemma (6.2.4)

$\mathbb{E}[(X-\mathbb{E}[X\mid Y])(X-\mathbb{E}[X\mid Y])^{\top}]$ is given by $D$ from above.

\begin{proof} First note that
$$\mathbb{E}[X(\mathbb{E}[X\mid Y])^{\top}]=\mathbb{E}[(X-\mathbb{E}[X\mid Y]+\mathbb{E}[X\mid Y])(\mathbb{E}[X\mid Y])^{\top}]=\mathbb{E}[\mathbb{E}[X\mid Y](\mathbb{E}[X\mid Y])^{\top}]$$
since $X-\mathbb{E}[X\mid Y]$ is orthogonal to $\mathbb{E}[X\mid Y]$ (which we know by another iterated-expectation argument). Then
$$\begin{align*} \mathbb{E}[(X-\mathbb{E}[X\mid Y])(X-\mathbb{E}[X\mid Y])^{\top}]&= \mathbb{E}[XX^{\top}]-2\mathbb{E}[X(\mathbb{E}[X\mid Y])^{\top}]+\mathbb{E}[\mathbb{E}[X\mid Y]\mathbb{E}[X\mid Y]^{\top}]\\ &= \mathbb{E}[XX^{\top}]-\mathbb{E}[\mathbb{E}[X\mid Y]\mathbb{E}[X\mid Y]^{\top}]\\ &= \Sigma_{XX}-\mathbb{E}[\Sigma_{XY}\Sigma_{YY}^{-1}YY^{\top}(\Sigma_{YY}^{-1})^{\top}\Sigma_{XY}^{\top}]\\ &= \Sigma_{XX}-\Sigma_{XY}\Sigma_{YY}^{-1}\Sigma_{XY}^{\top} \end{align*}$$
\end{proof}
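The key step in Lemma 6.2.3, that orthogonality makes the joint covariance block diagonal and so the joint estimator gain decouples, can be verified directly. This is a sketch with arbitrary illustrative matrices (assuming NumPy): $S_{1},S_{2}$ play the roles of $\operatorname{Cov}(Z_{1}),\operatorname{Cov}(Z_{2})$ and $C_{1},C_{2}$ the roles of $\mathbb{E}[XZ_{1}^{\top}],\mathbb{E}[XZ_{2}^{\top}]$.

```python
import numpy as np

# Numerical sketch of Lemma 6.2.3: with E[Z1 Z2^T] = 0 the covariance of
# (Z1, Z2) is block diagonal, so the joint gain splits into the two
# individual gains. All matrices are arbitrary choices.
rng = np.random.default_rng(2)
n, m1, m2 = 2, 2, 3

def rand_pd(k):
    A = rng.normal(size=(k, k))
    return A @ A.T + k * np.eye(k)

S1, S2 = rand_pd(m1), rand_pd(m2)      # Cov(Z1), Cov(Z2); cross-cov is 0
C1 = rng.normal(size=(n, m1))          # E[X Z1^T]
C2 = rng.normal(size=(n, m2))          # E[X Z2^T]

# Joint covariance of (Z1, Z2) is block diagonal by orthogonality.
S = np.block([[S1, np.zeros((m1, m2))], [np.zeros((m2, m1)), S2]])
G_joint = np.hstack([C1, C2]) @ np.linalg.inv(S)
G_split = np.hstack([C1 @ np.linalg.inv(S1), C2 @ np.linalg.inv(S2)])

z = rng.normal(size=m1 + m2)           # any realization (z1, z2)
print(np.allclose(G_joint @ z, G_split @ z))   # True
```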

Now, consider the following system without control:
$$\begin{align*} x_{t+1}&= Ax_{t}+w_{t}, & w_{t}&\overset{iid}{\sim} \mathcal{N}(0,W)\\ y_{t}&= Cx_{t}+v_{t}, & v_{t}&\overset{iid}{\sim} \mathcal{N}(0,V) \end{align*}$$
where $x\in \mathbb{R}^{n}$, $w\in \mathbb{R}^{n}$, $y\in \mathbb{R}^{p}$, $v\in \mathbb{R}^{p}$. Define the mean process, $m_{t}$, and covariance process, $\Sigma_{t|t-1}$, as
$$\begin{align*} m_{t}&= \mathbb{E}[x_{t}\mid y_{0:t-1}]\\ \Sigma_{t|t-1}&= \mathbb{E}[(x_{t}-\mathbb{E}[x_{t}\mid y_{0:t-1}])(x_{t}-\mathbb{E}[x_{t}\mid y_{0:t-1}])^{\top}\mid y_{0:t-1}] \end{align*}$$
Since the estimation error covariance does not depend on the realization $y_{0:t-1}$ (Lemma 6.2.2), we can rewrite it as
$$\Sigma_{t|t-1}=\mathbb{E}[(x_{t}-\mathbb{E}[x_{t}\mid y_{0:t-1}])(x_{t}-\mathbb{E}[x_{t}\mid y_{0:t-1}])^{\top}].$$

Theorem (6.2.1)

The following holds:
$$\begin{align*} m_{t+1}&= Am_{t}+A\Sigma_{t|t-1}C^{\top}(C\Sigma_{t|t-1}C^{\top}+V)^{-1}(y_{t}-Cm_{t})\\ \Sigma_{t+1|t}&= A\Sigma_{t|t-1}A^{\top}+W-(A\Sigma_{t|t-1}C^{\top})(C\Sigma_{t|t-1}C^{\top}+V)^{-1}(C\Sigma_{t|t-1}A^{\top}) \end{align*}$$
with $m_{0}=\mathbb{E}[x_{0}]$, $\Sigma_{0|-1}=\mathbb{E}[x_{0}x_{0}^{\top}]$.

\begin{proof}
$$\begin{align*} m_{t+1}&= \mathbb{E}[Ax_{t}+w_{t}\mid y_{0:t}]\\ &= \mathbb{E}[Ax_{t}\mid y_{0:t}]\\ &= \mathbb{E}[Am_{t}+A(x_{t}-m_{t})\mid y_{0:t}]\\ &= Am_{t}+\mathbb{E}[A(x_{t}-m_{t})\mid y_{0:t-1},y_{t}-\mathbb{E}[y_{t}\mid y_{0:t-1}]]\\ &= Am_{t}+\mathbb{E}[A(x_{t}-m_{t})\mid y_{0:t-1}]\\ &\quad\quad\quad\quad+\mathbb{E}[A(x_{t}-m_{t})\mid y_{t}-\mathbb{E}[y_{t}\mid y_{0:t-1}]]&\text{by Lemma 6.2.3}\\ &= Am_{t}+\mathbb{E}[A(x_{t}-m_{t})\mid y_{t}-\mathbb{E}[y_{t}\mid y_{0:t-1}]]\\ &= Am_{t}+\mathbb{E}[A(x_{t}-m_{t})\mid Cx_{t}+v_{t}-\mathbb{E}[Cx_{t}+v_{t}\mid y_{0:t-1}]]\\ &= Am_{t}+\mathbb{E}[A(x_{t}-m_{t})\mid C(x_{t}-m_{t})+v_{t}] \end{align*}$$
where $w_{t}$ drops out since it is zero-mean and independent of $y_{0:t}$, and $\mathbb{E}[A(x_{t}-m_{t})\mid y_{0:t-1}]=0$ since $m_{t}=\mathbb{E}[x_{t}\mid y_{0:t-1}]$. Let $X=A(x_{t}-m_{t})$ and $Y=y_{t}-\mathbb{E}[y_{t}\mid y_{0:t-1}]=y_{t}-Cm_{t}=C(x_{t}-m_{t})+v_{t}$.
Then, by Lemma 6.2.2 we have $\mathbb{E}[X\mid Y]=\Sigma_{XY}\Sigma_{YY}^{-1}Y$ and thus,
$$\begin{align*} m_{t+1}&= Am_{t}\\ & +A\mathbb{E}[(x_{t}-m_{t})(x_{t}-m_{t})^{\top}]C^{\top}(\mathbb{E}[(C(x_{t}-m_{t})+v_{t})(C(x_{t}-m_{t})+v_{t})^{\top}])^{-1}(y_{t}-Cm_{t})\\ &= Am_{t}+A\Sigma_{t|t-1}C^{\top}(C\mathbb{E}[(x_{t}-m_{t})(x_{t}-m_{t})^{\top}]C^{\top}+\mathbb{E}[v_{t}v_{t}^{\top}])^{-1}(y_{t}-Cm_{t})\\ &= Am_{t}+A\Sigma_{t|t-1}C^{\top}(C\Sigma_{t|t-1}C^{\top}+V)^{-1}(y_{t}-Cm_{t}) \end{align*}$$
Likewise,
$$\begin{align*} x_{t+1}-m_{t+1}&= A(x_{t}-m_{t})+w_{t}-A\Sigma_{t|t-1}C^{\top}(C\Sigma_{t|t-1}C^{\top}+V)^{-1}(y_{t}-Cm_{t})\\ &= \dots\\ \Sigma_{t+1|t}&= A\Sigma_{t|t-1}A^{\top}+W-(A\Sigma_{t|t-1}C^{\top})(C\Sigma_{t|t-1}C^{\top}+V)^{-1}(C\Sigma_{t|t-1}A^{\top}) \end{align*}$$

\end{proof}

Define now
$$\tilde{m}_{t}=\mathbb{E}[x_{t}\mid y_{0:t}]=m_{t}+\mathbb{E}[x_{t}-m_{t}\mid y_{0:t}].$$
Following the analysis above we obtain
$$\tilde{m}_{t}=m_{t}+\mathbb{E}[x_{t}-m_{t}\mid y_{0:t-1}]+\mathbb{E}\left[x_{t}-m_{t}\mid y_{t}-\mathbb{E}[y_{t}\mid y_{0:t-1}]\right]$$
Note that we also have $m_{t}=A\tilde{m}_{t-1}$. The following then results:

Theorem (6.2.2)

The recursion for $\tilde{m}_{t}$ satisfies
$$\tilde{m}_{t}=A\tilde{m}_{t-1}+\Sigma_{t|t-1}C^{\top}(C\Sigma_{t|t-1}C^{\top}+V)^{-1}(y_{t}-CA\tilde{m}_{t-1})$$
with $\tilde{m}_{0}=\mathbb{E}[x_{0}\mid y_{0}]$.
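The two recursions can be run side by side as a sanity check. The sketch below (assuming NumPy; $A$, $C$, $W$, $V$, the horizon, and the initial conditions are arbitrary illustrative choices) simulates the control-free system, runs the predictor of Theorem 6.2.1 and the filter of Theorem 6.2.2 on the same measurements, and verifies the relation $m_{t+1}=A\tilde{m}_{t}$ implied by $m_{t}=A\tilde{m}_{t-1}$.

```python
import numpy as np

# Predictor (Theorem 6.2.1) and filter (Theorem 6.2.2) on the same data;
# the gap m_{t+1} - A m~_t should be numerically zero at every step.
# A, C, W, V are arbitrary illustrative choices.
rng = np.random.default_rng(4)

A = np.array([[0.9, 0.2], [0.0, 0.8]])
C = np.array([[1.0, 1.0]])
W = 0.05 * np.eye(2)                   # process noise covariance
V = np.array([[0.2]])                  # measurement noise covariance

x = np.zeros(2)        # true state (x_0 = 0 for simplicity)
m = np.zeros(2)        # predictor mean m_t, with m_0 = E[x_0] = 0
mt = np.zeros(2)       # filter mean m~_{t-1}, initialized so A m~ = m_0
Sigma = np.eye(2)      # Sigma_{0|-1}

max_gap = 0.0
for t in range(100):
    y = C @ x + rng.multivariate_normal(np.zeros(1), V)
    S = C @ Sigma @ C.T + V                    # innovation covariance
    G = Sigma @ C.T @ np.linalg.inv(S)
    mt = A @ mt + G @ (y - C @ (A @ mt))       # m~_t      (Theorem 6.2.2)
    m = A @ m + (A @ G) @ (y - C @ m)          # m_{t+1}   (Theorem 6.2.1)
    Sigma = A @ Sigma @ A.T + W - (A @ G) @ (C @ Sigma @ A.T)
    x = A @ x + rng.multivariate_normal(np.zeros(2), W)
    max_gap = max(max_gap, np.linalg.norm(m - A @ mt))

print(max_gap)   # numerically zero
```

Note that the covariance recursion does not depend on the data at all, so $\Sigma_{t+1|t}$ could equally be precomputed offline.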
