Controlled Markov Chain

Definition
StochasticControl

Let $\{(x_{k},u_{k})\}$ be a collection that satisfies the model

$$x_{k+1}=f(x_{k},u_{k},w_{k}),$$

where $x_{k}\in\mathbb{X}$ is the state variable, $u_{k}\in\mathbb{U}$ is the action variable, $\{w_{k}\}$ with $w_{k}\in\mathbb{W}$ is an i.i.d. noise process, and $f$ is a measurable function. We assume that $\mathbb{X},\mathbb{U},\mathbb{W}$ are Borel subsets of Polish spaces; such subsets are also called standard Borel. If $\{(x_{k},u_{k})\}$ also satisfies

$$P(x_{k+1}\in B\mid x_{[0,k]}=a_{[0,k]},u_{[0,k]}=b_{[0,k]})=P(x_{k+1}\in B\mid x_{k}=a_{k},u_{k}=b_{k})\quad\forall B\in\mathcal{B}(\mathbb{X}),\ k\in\mathbb{Z}_{+},$$

then we call $\{(x_{k},u_{k})\}$ a controlled Markov chain.
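As a short sketch of why the model and the Markov condition fit together, suppose (this is an added assumption, not stated above) that $w_{k}$ is independent of the history $(x_{[0,k]},u_{[0,k]})$. Substituting the model into the left-hand side gives

$$\begin{align*} P(x_{k+1}\in B\mid x_{[0,k]}=a_{[0,k]},u_{[0,k]}=b_{[0,k]})&=P(f(a_{k},b_{k},w_{k})\in B\mid x_{[0,k]}=a_{[0,k]},u_{[0,k]}=b_{[0,k]})\\ &=P(f(a_{k},b_{k},w_{k})\in B). \end{align*}$$

The last expression depends on the history only through $(a_{k},b_{k})$. The same computation, conditioning on $x_{k}=a_{k},u_{k}=b_{k}$ alone, yields the same value, so under this assumption both sides of the condition agree.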

# More information

Consider this model again:

$$x_{k+1}=f(x_{k},u_{k},w_{k})$$

^statespace

where $x_{k}\in\mathbb{X}$, $u_{k}\in\mathbb{U}$, $w_{k}\in\mathbb{W}$, $f$ is a measurable function, and $\mathbb{X},\mathbb{U},\mathbb{W}$ are standard Borel. We assume all random variables live in some probability space $(\Omega,\mathcal{F},P)$. The collection $\{(x_{k},u_{k})\}$ satisfying this model also satisfies

$$\begin{align*} P(x_{k+1}\in B\mid x_{[0,k]}=a_{[0,k]},u_{[0,k]}=b_{[0,k]})&= P(x_{k+1}\in B\mid x_{k}=a_{k},u_{k}=b_{k})\\ &=:\mathcal{T}(B\mid a_{k},b_{k}) \end{align*}$$
^property

for all $B\in\mathcal{B}(\mathbb{X})$ and $k\in\mathbb{Z}_{+}$, where $\mathcal{T}(\cdot\mid x,u)$ is a Stochastic Kernel from $\mathbb{X}\times\mathbb{U}$ to $\mathbb{X}$, so that:

> - For every $B\in\mathcal{B}(\mathbb{X})$, $\mathcal{T}(B\mid\cdot,\cdot)$ is a measurable function on $\mathbb{X}\times\mathbb{U}$; and
> - For every fixed $(a,b)\in\mathbb{X}\times\mathbb{U}$, $\mathcal{T}(\cdot\mid a,b)$ is a probability measure on $(\mathbb{X},\mathcal{B}(\mathbb{X}))$.
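To make the two kernel properties concrete, here is a minimal sketch on a finite toy example. The state, action, and noise sets, the dynamics `f`, and the uniform noise law are all hypothetical choices made for illustration; none of them come from the text above. The sketch builds the kernel induced by the model, $\mathcal{T}(B\mid x,u)=P(f(x,u,w)\in B)$, and checks that it is a probability measure for each fixed pair $(x,u)$. On finite spaces, measurability holds automatically.

```python
from fractions import Fraction

# Hypothetical finite spaces and noise law (illustrative assumptions only).
STATES = [0, 1, 2]
ACTIONS = [0, 1, 2]
NOISE_PMF = {0: Fraction(1, 3), 1: Fraction(1, 3), 2: Fraction(1, 3)}


def f(x, u, w):
    """Toy dynamics: add u, subtract w, clip the result into STATES."""
    return min(max(x + u - w, 0), 2)


def kernel(B, x, u):
    """T(B | x, u) = P(f(x, u, w) in B), with w drawn from NOISE_PMF."""
    return sum(p for w, p in NOISE_PMF.items() if f(x, u, w) in B)


# For each fixed (x, u), T(. | x, u) should be a probability measure.
for x in STATES:
    for u in ACTIONS:
        assert kernel(set(STATES), x, u) == 1             # total mass one
        assert kernel(set(), x, u) == 0                   # empty set has mass zero
        assert kernel({0}, x, u) + kernel({1, 2}, x, u) == 1  # additivity

print(kernel({2}, 1, 1))  # 1/3
```

Exact rational arithmetic via `Fraction` is used so that the equality checks are not affected by floating-point rounding.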

That is, all Stochastic Processes that satisfy the Markov property above admit a Stochastic Realization in the form of the model $x_{k+1}=f(x_{k},u_{k},w_{k})$ almost surely.

For $\{x_{t},u_{t}\}$ to define a Stochastic Process, in addition to a transition kernel and an initial measure on $x_{0}$ (i.e. a prior), we need to specify the dependence of $u_{t}$ on the history of the process. This dependence is called a control Policy. Once it is specified, one can construct a Stochastic Process $\{x_{t},u_{t}\}_{t\ge 0}$ through the Ionescu Tulcea Theorem.
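The three ingredients named above (a prior on $x_{0}$, a transition mechanism, and a policy that maps the history to an action) can be sketched as a simple rollout. Everything concrete here is a hypothetical choice for illustration: the dynamics `f`, the uniform prior on `{0, 1, 2}`, the noise values `{0, 1}`, and the rule inside `history_policy`.

```python
import random


def f(x, u, w):
    """Hypothetical dynamics: add u, subtract w, never go below zero."""
    return max(x + u - w, 0)


def history_policy(xs, us):
    """Hypothetical history-dependent policy.

    xs holds the states x_0..x_t and us holds the past actions u_0..u_{t-1}.
    It returns u_t = 1 when the average state seen so far is below 1, else 0.
    """
    return 1 if sum(xs) / len(xs) < 1 else 0


def rollout(policy, horizon, seed=0):
    """Generate (x_0..x_horizon, u_0..u_{horizon-1}) under the given policy."""
    rng = random.Random(seed)
    xs = [rng.choice([0, 1, 2])]  # x_0 drawn from an assumed uniform prior
    us = []
    for _ in range(horizon):
        u = policy(xs, us)          # action depends on the whole history
        w = rng.choice([0, 1])      # i.i.d. noise, independent of the history
        us.append(u)
        xs.append(f(xs[-1], u, w))  # x_{t+1} = f(x_t, u_t, w_t)
    return xs, us


xs, us = rollout(history_policy, horizon=5, seed=1)
print(list(zip(xs, us)))
```

Although the action here uses the full history, the next state is still generated from the current pair $(x_{t},u_{t})$ and fresh noise only, which is what the transition kernel captures.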
