
Partially Observable Markov Decision Process

🌱

StochasticControl

A Partially Observable Markov Decision Process (POMDP) is a seven-tuple $(\mathbb{X}, \mathbb{U}, \mathbb{Y}, \mathbb{K}, \mathcal{T}, Q, c)$ where:

- $\mathbb{X}$ is the state space, a subset of a Polish space.
- $\mathbb{U}$ is the action space, a subset of a Polish space.
- $\mathbb{Y}$ is the observation space, a subset of a Polish space.
- $\mathbb{K} = \{ (x,u) : u \in \mathbb{U}(x),\, x \in \mathbb{X} \}$ is the set of feasible state-control pairs, where $\mathbb{U}(x) \subseteq \mathbb{U}$ is the set of controls admissible in state $x$.
- $\mathcal{T}$ is the state transition kernel, a stochastic kernel on $\mathbb{X}$ given $\mathbb{X} \times \mathbb{U}$, i.e. $\mathcal{T}(A \mid x_t, u_t) = P(x_{t+1} \in A \mid x_t, u_t)$ for $A \in \mathcal{B}(\mathbb{X})$.
- $Q$ is the observation channel, a stochastic kernel on $\mathbb{Y}$ given $\mathbb{X}$, i.e. $Q(A \mid x_t) = P(y_t \in A \mid x_t)$ for $A \in \mathcal{B}(\mathbb{Y})$.
- $c : \mathbb{K} \to \mathbb{R}$ is the cost function.

A concrete finite-space instance of this tuple is sketched below.
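To make the tuple concrete, here is a minimal sketch in Python of a finite POMDP, using the classic two-door "tiger" setup. This is my own illustration, not part of the note: the state/action/observation names, the probabilities, and the NumPy representation are all assumptions. With finite $\mathbb{X}$, $\mathbb{U}$, $\mathbb{Y}$, the kernels $\mathcal{T}$ and $Q$ reduce to row-stochastic matrices and $c$ to an ordinary function on $\mathbb{K}$.

```python
# Minimal finite POMDP sketch (illustrative only; names and numbers assumed).
import numpy as np

# State, action, and observation spaces (finite, so the kernels are matrices).
X = ["tiger-left", "tiger-right"]          # state space  X
U = ["listen", "open-left", "open-right"]  # action space U
Y = ["hear-left", "hear-right"]            # observation space Y

# Feasible pairs K: here every action is admissible in every state.
K = [(x, u) for x in X for u in U]

# Transition kernel T(x' | x, u): listening leaves the state unchanged;
# opening a door resets the tiger's position uniformly at random.
T = {
    "listen":     np.eye(2),
    "open-left":  np.full((2, 2), 0.5),
    "open-right": np.full((2, 2), 0.5),
}

# Observation channel Q(y | x): the observation is correct with probability 0.85.
Q = np.array([
    [0.85, 0.15],   # tiger-left  -> P(hear-left), P(hear-right)
    [0.15, 0.85],   # tiger-right -> P(hear-left), P(hear-right)
])

# Cost function c(x, u): small cost to listen, large cost for opening the
# tiger's door, negative cost (reward) for opening the safe door.
def c(x: str, u: str) -> float:
    if u == "listen":
        return 1.0
    opened_tiger = (u == "open-left" and x == "tiger-left") or \
                   (u == "open-right" and x == "tiger-right")
    return 100.0 if opened_tiger else -10.0

# Sample one step of the dynamics: x_{t+1} ~ T(. | x_t, u_t), y ~ Q(. | x_{t+1}).
rng = np.random.default_rng(0)
i = 0                                   # start with the tiger on the left
j = rng.choice(2, p=T["listen"][i])     # next-state index
y = Y[rng.choice(2, p=Q[j])]            # observation drawn from the channel
print(X[j], y, c(X[i], "listen"))
```

Note that, matching the definition above, the observation here depends only on the current state through $Q(\cdot \mid x_t)$; variants in which the channel also depends on the last action would need a kernel $Q(\cdot \mid x_t, u_{t-1})$ instead.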
