Markov Policy is Good Enough

🌱

Theorem
StochasticControl

Let $\{(X_k, U_k)\}$ be a controlled Markov chain with initial state $X_0 = x$. Consider the finite-horizon optimization problem

$$J_{N}(x,\gamma)=E_{x}^{\gamma}\left[ \sum_{k=0}^{N-1}c(X_{k},U_{k})+c_{N}(X_{N}) \right]$$

where we seek to minimize the cost over all admissible policies $\gamma$. A policy is Markov if the action $U_k$ depends on the history only through the current state $X_k$ (and the time $k$). Any admissible policy can be replaced by a Markov policy that is at least as good, i.e., there is no loss in restricting policies to be Markov.
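As a sanity check on a toy instance, the sketch below compares the cost of the Markov policy obtained by backward induction with the best cost found by brute force over all deterministic policies that may depend on the full state history. Everything concrete here is an assumption made for illustration: the two-state, two-action chain, the numbers in `P`, `c`, and `cN`, and the function names. It checks only a consequence of the theorem (equal optimal values over deterministic policies on one small example) and is not a proof.

```python
import itertools

# Hypothetical toy problem: 2 states, 2 actions, horizon N = 2.
nS, nA, N = 2, 2, 2
# P[x][u][y] = Pr(X_{k+1} = y | X_k = x, U_k = u); values are made up.
P = [[[0.9, 0.1], [0.2, 0.8]],
     [[0.5, 0.5], [0.3, 0.7]]]
c = [[1.0, 2.0], [0.5, 3.0]]   # stage cost c(x, u)
cN = [0.0, 4.0]                # terminal cost c_N(x)


def backward_induction():
    """Return value functions V[k][x] and a Markov policy pi[k][x]."""
    V = [[0.0] * nS for _ in range(N + 1)]
    pi = [[0] * nS for _ in range(N)]
    V[N] = list(cN)
    for k in reversed(range(N)):
        for x in range(nS):
            # q[u] = c(x, u) + E[V_{k+1}(X_{k+1}) | X_k = x, U_k = u]
            q = [c[x][u] + sum(P[x][u][y] * V[k + 1][y] for y in range(nS))
                 for u in range(nA)]
            pi[k][x] = min(range(nA), key=q.__getitem__)
            V[k][x] = q[pi[k][x]]
    return V, pi


def history_policy_cost(policy, x0):
    """Expected cost from x0 of a policy mapping state histories to actions."""
    def cost_to_go(hist):
        k, x = len(hist) - 1, hist[-1]
        if k == N:
            return cN[x]
        u = policy[hist]
        return c[x][u] + sum(P[x][u][y] * cost_to_go(hist + (y,))
                             for y in range(nS))
    return cost_to_go((x0,))


def brute_force_min(x0):
    """Minimum cost over all deterministic history-dependent policies."""
    # For deterministic policies, past actions are functions of past states,
    # so indexing by the state history (x_0, ..., x_k) is enough.
    histories = [h for k in range(N)
                 for h in itertools.product(range(nS), repeat=k + 1)]
    best = float("inf")
    for actions in itertools.product(range(nA), repeat=len(histories)):
        policy = dict(zip(histories, actions))
        best = min(best, history_policy_cost(policy, x0))
    return best


V, pi = backward_induction()
for x0 in range(nS):
    print(x0, V[0][x0], brute_force_min(x0))
```

For each initial state the two printed costs should agree up to floating-point rounding. The brute-force search grows doubly exponentially with the horizon, which is part of why the restriction to Markov policies matters in practice: backward induction only needs one table of size (number of states) per time step.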
