Linear Quadratic Problem

Consider the following linear control system

$$x_{t+1}=Ax_{t}+Bu_{t}+w_{t}$$

where $x,w\in\mathbb{R}^{n}$ and $u\in\mathbb{R}^{m}$. Suppose that $\{ w_{t} \}_{t\ge 0}$ is i.i.d. with $E[w_{t}]=0$ and $E[w_{t}w_{t}^{T}]=W$ for all $t\ge 0$. The goal is to find the minimum cost

$$\inf_{\gamma\in\Gamma_{A}}J(x,\gamma)=\inf_{\gamma\in\Gamma_{A}}E_{x}^{\gamma}\left[\left( \sum_{k=0}^{N-1}x_{k}^{T}Qx_{k}+u_{k}^{T}Ru_{k} \right)+ \underbrace{ x_{N}^{T}Q_{N}x_{N} }_{ \text{terminal cost} } \right]$$

where $Q=Q^{T}\ge 0$ is positive semidefinite and $R=R^{T}>0$ is positive definite.
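
To make the cost functional concrete, here is a minimal sketch that evaluates the realized cost of a single trajectory for a given policy and a given noise sequence. The function name `rollout_cost`, the `policy(t, x)` calling convention, and the pre-sampled array `w_seq` are illustrative assumptions, not part of the original notes.

```python
import numpy as np

def rollout_cost(A, B, Q, R, Q_N, policy, x0, w_seq):
    """Realized cost of one trajectory of x_{t+1} = A x_t + B u_t + w_t.

    policy(t, x) returns the control u_t (hypothetical calling convention).
    w_seq has shape (N, n): one noise vector per time step.
    """
    x = np.asarray(x0, dtype=float)
    cost = 0.0
    for t, w in enumerate(w_seq):
        u = policy(t, x)
        cost += x @ Q @ x + u @ R @ u   # stage cost x^T Q x + u^T R u
        x = A @ x + B @ u + w           # system dynamics
    return float(cost + x @ Q_N @ x)    # add terminal cost x_N^T Q_N x_N
```

Averaging `rollout_cost` over many independently sampled noise sequences gives a Monte Carlo estimate of $J(x,\gamma)$. The notes only fix the mean and covariance of $w_t$, so any sampling distribution (for example Gaussian) is an extra assumption.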

Theorem (Solution to LQR using Riccati Equation)

For the LQ problem, let $J_{N}(x)=x^{T}Q_{N}x$ be the terminal cost-to-go and let $P_{t}$ be given by the backward Riccati recursion

$$P_{t}=Q+A^{T}P_{t+1}A-A^{T}P_{t+1}B(R+B^{T}P_{t+1}B)^{-1}B^{T}P_{t+1}A$$

with final condition $P_{N}=Q_{N}$. The optimal cost is

$$\inf_{\gamma\in\Gamma_{A}}J(x,\gamma)=J(x_{0})=x_{0}^{T}P_{0}x_{0}+\sum_{t=0}^{N-1}E[w_{t}^{T}P_{t+1}w_{t}]$$

with the optimal control being

$$\gamma_{t}^{*}(x_{t})=-(B^{T}P_{t+1}B+R)^{-1}B^{T}P_{t+1}Ax_{t}$$
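
The backward recursion in the theorem can be sketched in a few lines of NumPy. The names `riccati_backward` and `optimal_cost` are hypothetical helpers introduced for illustration. The noise term uses the identity $E[w^{T}Pw]=\text{Trace}(PW)$ for zero-mean $w$ with covariance $W$.

```python
import numpy as np

def riccati_backward(A, B, Q, R, Q_N, N):
    """Return ([P_0, ..., P_N], [K_0, ..., K_{N-1}]) with u_t = -K_t x_t."""
    P = [None] * (N + 1)
    K = [None] * N
    P[N] = Q_N                                   # final condition P_N = Q_N
    for t in range(N - 1, -1, -1):
        S = R + B.T @ P[t + 1] @ B               # invertible because R > 0
        K[t] = np.linalg.solve(S, B.T @ P[t + 1] @ A)
        # P_t = Q + A^T P_{t+1} A - A^T P_{t+1} B S^{-1} B^T P_{t+1} A
        P[t] = Q + A.T @ P[t + 1] @ A - A.T @ P[t + 1] @ B @ K[t]
    return P, K

def optimal_cost(P, x0, W):
    """x_0^T P_0 x_0 + sum_t Trace(P_{t+1} W)."""
    N = len(P) - 1
    x0 = np.asarray(x0, dtype=float)
    noise_term = sum(np.trace(P[t + 1] @ W) for t in range(N))
    return float(x0 @ P[0] @ x0 + noise_term)
```

For a scalar system with $A=B=Q=R=Q_{N}=1$ and $N=1$, the recursion gives $P_{1}=1$, $K_{0}=1/2$ and $P_{0}=3/2$, which can be checked by hand against the cost of applying $u_{0}=-x_{0}/2$.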

Theorem (Controllability & Observability w.r.t. Riccati)

Consider the system

$$x_{t+1}=Ax_{t}+Bu_{t},\quad y_{t}=Cx_{t}$$

  1. If $(A,B)$ is controllable, there exists a solution to the Riccati equation
     $$P=Q+A^{T}PA-A^{T}PB(R+B^{T}PB)^{-1}B^{T}PA$$
  2. If $(A,B)$ is controllable and, with $Q=C^{T}C$, $(A,C)$ is observable, then as $t\to-\infty$ (the recursion runs backward in time) the sequence of Riccati recursions
     $$P_{t}=Q+A^{T}P_{t+1}A-A^{T}P_{t+1}B(R+B^{T}P_{t+1}B)^{-1}B^{T}P_{t+1}A$$
     converges to some limit $P$ that satisfies
     $$P=Q+A^{T}PA-A^{T}PB(R+B^{T}PB)^{-1}B^{T}PA$$
     That is, convergence takes place for any initial condition $\bar{P}$. Furthermore, such a $P$ is unique and positive definite. Finally, under the optimal stationary control policy
     $$u_{t}=-(B^{T}PB+R)^{-1}B^{T}PAx_{t}$$
     the solution to $x_{t+1}=Ax_{t}+Bu_{t}$ is stable, i.e. $x_{t}\to 0$.
  3. Under the conditions of 2, the stationary policy $u_{t}=-(B^{T}PB+R)^{-1}B^{T}PAx_{t}$ minimizes
     $$\limsup_{ N \to \infty } \frac{1}{N}E_{x}^{\gamma}\left[ \sum_{t=0}^{N-1}x_{t}^{T}Qx_{t}+u_{t}^{T}Ru_{t} \right]$$
     for the system $x_{t+1}=Ax_{t}+Bu_{t}+w_{t}$, for every $x\in\mathbb{R}^{n}$. Furthermore, the optimal cost is $E[w^{T}Pw]=\text{Trace}(PW)$.
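
The three claims can be checked numerically on a small example. The sketch below iterates the Riccati map until it stops changing, then inspects the closed-loop matrix $A-BK$ and the average cost $\text{Trace}(PW)$. The helper name `riccati_limit`, the tolerance, and the double-integrator matrices are assumptions chosen for illustration. This is a plain fixed-point iteration, not a production algebraic Riccati solver.

```python
import numpy as np

def riccati_limit(A, B, Q, R, P_init, tol=1e-10, max_iter=100_000):
    """Iterate P <- Q + A^T P A - A^T P B (R + B^T P B)^{-1} B^T P A."""
    P = np.asarray(P_init, dtype=float)
    for _ in range(max_iter):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P_next = Q + A.T @ P @ A - A.T @ P @ B @ K
        if np.max(np.abs(P_next - P)) < tol:
            P = P_next
            break
        P = P_next
    else:
        raise RuntimeError("Riccati iteration did not converge")
    # Stationary gain for u_t = -K x_t, computed from the limit P
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K

# Illustrative double integrator: (A, B) controllable, (A, C) observable
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
Q = C.T @ C
R = np.array([[1.0]])
W = 0.1 * np.eye(2)          # assumed noise covariance

P_a, K_a = riccati_limit(A, B, Q, R, np.zeros((2, 2)))
P_b, K_b = riccati_limit(A, B, Q, R, 10.0 * np.eye(2))

spectral_radius = np.max(np.abs(np.linalg.eigvals(A - B @ K_a)))
average_cost = np.trace(P_a @ W)   # optimal long-run average cost
```

Starting the iteration from two different initial conditions and comparing `P_a` with `P_b` illustrates the uniqueness claim in item 2, and a `spectral_radius` below 1 corresponds to $x_{t}\to 0$ for the noise-free closed loop.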
