Value Iteration Algorithm

Theorem
StochasticControl

Theorem

Suppose the cost function is non-negative. Consider the successive iteration

$$v_{n}(x)=\min_{u}\left\{ c(x,u)+\beta \int\limits_{\mathbb{X}}v_{n-1}(y)\,\mathcal{T}(dy|x,u) \right\},\quad \forall x,\ n\ge 1,$$

with $v_{0}(x)=0$ for all $x\in\mathbb{X}$. Then $v_{n}$ is a monotonically non-decreasing sequence. Suppose this sequence converges pointwise to a function $v$ that satisfies

$$v(x)=c(x,f(x))+\beta \int\limits_{\mathbb{X}} v(y)\,\mathcal{T}(dy|x,f(x))$$

for some $f$. If, with $\gamma=\{ f,f,\dots \}$,

$$\lim_{n \to \infty}\beta^{n}E_{x}^{\gamma}[v(x_{n})]=0,$$

then $\gamma$ is optimal and $v$ is the value function.
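
The iteration in the theorem can be tried numerically. The following is a minimal sketch, assuming a finite state and action space so that the integral against $\mathcal{T}$ becomes a finite sum. The names `value_iteration`, `greedy_policy`, `cost`, `trans`, and the two-state example are illustrative assumptions and do not come from the note itself.

```python
def value_iteration(cost, trans, beta, n_iters):
    """Return the iterates [v_0, v_1, ..., v_n] of the successive iteration.

    cost[x][u]     : non-negative cost c(x, u)
    trans[x][u][y] : transition probability T(y | x, u)  (finite-state assumption)
    beta           : discount factor
    """
    n_states = len(cost)
    v = [0.0] * n_states              # v_0(x) = 0 for every state x
    history = [v]
    for _ in range(n_iters):
        v_next = []
        for x in range(n_states):
            # c(x, u) + beta * sum_y v_{n-1}(y) T(y | x, u), for each action u
            q = [
                cost[x][u] + beta * sum(p * v[y] for y, p in enumerate(trans[x][u]))
                for u in range(len(cost[x]))
            ]
            v_next.append(min(q))     # minimize over u
        v = v_next
        history.append(v)
    return history


def greedy_policy(cost, trans, beta, v):
    """Return f(x): an action attaining the minimum in the update at v."""
    policy = []
    for x in range(len(cost)):
        q = [
            cost[x][u] + beta * sum(p * v[y] for y, p in enumerate(trans[x][u]))
            for u in range(len(cost[x]))
        ]
        policy.append(q.index(min(q)))
    return policy


# Hypothetical two-state, two-action example with non-negative costs.
cost = [[1.0, 2.0], [0.0, 3.0]]
trans = [
    [[0.9, 0.1], [0.2, 0.8]],   # from state 0 under actions 0 and 1
    [[0.0, 1.0], [0.5, 0.5]],   # from state 1 under actions 0 and 1
]
beta = 0.9

history = value_iteration(cost, trans, beta, n_iters=100)
v = history[-1]
policy = greedy_policy(cost, trans, beta, v)

# Check that v_n is non-decreasing in n at every state.
monotone = all(
    b[x] >= a[x] for a, b in zip(history, history[1:]) for x in range(len(cost))
)
print(monotone, v, policy)
```

In this example the iterates increase toward $v(0)=2/0.82\approx 2.439$ and $v(1)=0$, and the minimizing actions are $f(0)=1$, $f(1)=0$. Because $v$ is bounded here and $\beta<1$, the condition $\lim_{n\to\infty}\beta^{n}E_{x}^{\gamma}[v(x_{n})]=0$ holds automatically; the condition only needs separate checking when $v$ can be unbounded.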