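Consider the linear control system
$$x_{t+1} = A x_t + B u_t + w_t,$$
where $x_t, w_t \in \mathbb{R}^n$ and $u_t \in \mathbb{R}^m$. Suppose that $\{w_t\}_{t \ge 0}$ is i.i.d. with $E[w_t] = 0$ and $E[w_t w_t^T] = W$ for all $t \ge 0$. The goal is to find the minimum cost
$$\inf_{\gamma \in \Gamma_A} J(x, \gamma) = \inf_{\gamma \in \Gamma_A} E_x^{\gamma}\left[\sum_{k=0}^{N-1} \left(x_k^T Q x_k + u_k^T R u_k\right) + \underbrace{x_N^T Q_N x_N}_{\text{terminal cost}}\right],$$
where $Q = Q^T \ge 0$ is positive semidefinite and $R = R^T > 0$ is positive definite.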
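To make the cost concrete, the following is a minimal sketch that evaluates the bracketed sum along one sample path of the system. The function name `rollout_cost`, the `policy(t, x)` callback, and the scalar numbers are illustrative assumptions, not part of the original notes.

```python
import numpy as np

def rollout_cost(A, B, Q, R, QN, x0, policy, noise):
    """Cost of one sample path of x_{t+1} = A x_t + B u_t + w_t.

    policy(t, x) returns the control u_t (hypothetical interface);
    noise has shape (N, n), and row t is the disturbance w_t.
    """
    x = np.asarray(x0, dtype=float)
    cost = 0.0
    for t, w in enumerate(noise):
        u = policy(t, x)
        cost += x @ Q @ x + u @ R @ u   # stage cost x_t' Q x_t + u_t' R u_t
        x = A @ x + B @ u + w           # system dynamics
    return cost + x @ QN @ x            # terminal cost x_N' Q_N x_N


# Hypothetical scalar example: n = m = 1, N = 2, no noise, zero control.
A = np.array([[2.0]])
B = np.array([[1.0]])
Q = np.array([[1.0]])
R = np.array([[1.0]])
QN = np.array([[1.0]])

def zero_policy(t, x):
    return np.zeros(1)

cost = rollout_cost(A, B, Q, R, QN, [1.0], zero_policy, np.zeros((2, 1)))
print(cost)  # 1 + 4 + 16 = 21.0
```

Averaging `rollout_cost` over many independently drawn noise sequences would give a Monte Carlo estimate of the expectation in the cost for a fixed policy.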
Theorem (Solution to LQR using Riccati Equation)
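For the LQ problem, let $J_N(x) = x^T Q_N x$ and define $P_t$ by the backward recursion
$$P_t = Q + A^T P_{t+1} A - A^T P_{t+1} B \left(R + B^T P_{t+1} B\right)^{-1} B^T P_{t+1} A,$$
with final condition $P_N = Q_N$. The optimal cost is
$$\inf_{\gamma \in \Gamma_A} J(x, \gamma) = J(x_0) = x_0^T P_0 x_0 + \sum_{t=0}^{N-1} E\left[w_t^T P_{t+1} w_t\right],$$
and the optimal control is
$$\gamma_t^*(x_t) = -\left(B^T P_{t+1} B + R\right)^{-1} B^T P_{t+1} A x_t.$$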
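The backward recursion in the theorem can be sketched in a few lines of NumPy. The function names `riccati_backward` and `optimal_cost` and the scalar example values are assumptions made for illustration; the noise term uses the identity $E[w^T P w] = \operatorname{Trace}(P W)$.

```python
import numpy as np

def riccati_backward(A, B, Q, R, QN, N):
    """Return lists P (t = 0..N) and K (t = 0..N-1), with u_t = -K[t] x_t."""
    P = [None] * (N + 1)
    K = [None] * N
    P[N] = QN                                          # final condition
    for t in range(N - 1, -1, -1):
        S = R + B.T @ P[t + 1] @ B
        K[t] = np.linalg.solve(S, B.T @ P[t + 1] @ A)  # optimal gain at time t
        P[t] = Q + A.T @ P[t + 1] @ A - A.T @ P[t + 1] @ B @ K[t]
    return P, K

def optimal_cost(P, W, x0):
    """x0' P_0 x0 plus the sum over t of Trace(P_{t+1} W)."""
    x0 = np.asarray(x0, dtype=float)
    return x0 @ P[0] @ x0 + sum(np.trace(Pt @ W) for Pt in P[1:])


# Hypothetical scalar example: A = 2, B = Q = R = Q_N = 1, W = 0.5, N = 2.
A = np.array([[2.0]])
B = np.array([[1.0]])
Q = np.array([[1.0]])
R = np.array([[1.0]])
QN = np.array([[1.0]])
W = np.array([[0.5]])

P, K = riccati_backward(A, B, Q, R, QN, N=2)
J0 = optimal_cost(P, W, [1.0])
print(P[1], P[0], K[0], J0)  # P_1 = 3, P_0 = 4, K_0 = 1.5, J = 4 + 1.5 + 0.5 = 6
```

Solving the linear system with `np.linalg.solve` instead of forming the inverse explicitly is a numerical design choice; the two are algebraically equivalent.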
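1. If $(A, B)$ is controllable, there exists a solution to the Riccati equation
$$P = Q + A^T P A - A^T P B \left(R + B^T P B\right)^{-1} B^T P A.$$
2. If $(A, B)$ is controllable and, with $Q = C^T C$, $(A, C)$ is observable, then the Riccati recursion
$$P_t = Q + A^T P_{t+1} A - A^T P_{t+1} B \left(R + B^T P_{t+1} B\right)^{-1} B^T P_{t+1} A$$
converges, as the number of iterations tends to infinity, to a limit $P$ that satisfies
$$P = Q + A^T P A - A^T P B \left(R + B^T P B\right)^{-1} B^T P A.$$
Convergence takes place for any initial condition $\bar{P}$. Furthermore, such a $P$ is unique and positive definite. Finally, under the optimal stationary control policy
$$u_t = -\left(B^T P B + R\right)^{-1} B^T P A x_t,$$
the solution to $x_{t+1} = A x_t + B u_t$ is stable, i.e. $x_t \to 0$.
3. Under the conditions of item 2, the stationary policy above minimizes the average cost
$$\limsup_{N \to \infty} \frac{1}{N} E_x^{\gamma}\left[\sum_{t=0}^{N-1} x_t^T Q x_t + u_t^T R u_t\right]$$
for the system $x_{t+1} = A x_t + B u_t + w_t$, for every initial state $x \in \mathbb{R}^n$. Furthermore, the optimal cost is $E[w^T P w] = \operatorname{Trace}(P W)$.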
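The infinite-horizon statements can be checked numerically by iterating the Riccati recursion until it stops changing. The sketch below assumes a double-integrator example with $Q = I$ (so $C = I$), $R = 1$, and $W = 0.1 I$; these matrices, the tolerance, and the function names are illustrative choices, not taken from the notes.

```python
import numpy as np

def riccati_step(P, A, B, Q, R):
    """One application of the Riccati map P -> Q + A'PA - A'PB (R + B'PB)^{-1} B'PA."""
    S = R + B.T @ P @ B
    return Q + A.T @ P @ A - A.T @ P @ B @ np.linalg.solve(S, B.T @ P @ A)

def solve_are_by_iteration(A, B, Q, R, P_init, tol=1e-10, max_iter=10_000):
    """Iterate the Riccati map from P_init until successive iterates agree."""
    P = P_init
    for _ in range(max_iter):
        P_next = riccati_step(P, A, B, Q, R)
        if np.max(np.abs(P_next - P)) < tol:
            return P_next
        P = P_next
    raise RuntimeError("Riccati iteration did not converge")


# Assumed example: double integrator, controllable and (with C = I) observable.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
W = 0.1 * np.eye(2)

# Two different initial conditions should reach the same limit P.
P = solve_are_by_iteration(A, B, Q, R, P_init=np.zeros((2, 2)))
P_alt = solve_are_by_iteration(A, B, Q, R, P_init=10.0 * np.eye(2))

K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # stationary gain, u_t = -K x_t
spectral_radius = np.max(np.abs(np.linalg.eigvals(A - B @ K)))
average_cost = np.trace(P @ W)                      # optimal average cost Trace(PW)
```

A spectral radius of `A - B @ K` below one is what stability of the noise-free closed loop amounts to in this example. For production use, a dedicated solver such as SciPy's `scipy.linalg.solve_discrete_are` would likely be preferable to plain fixed-point iteration.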