Partially Observable Markov Decision Process

Definition (POMDP)

A Partially Observable Markov Decision Process (POMDP) is a seven-tuple $(\mathbb{X}, \mathbb{U}, \mathbb{Y}, \mathbb{K}, \mathcal{T}, Q, c)$ where:

  • $\mathbb{X}$ is the state space, a subset of a Polish space.
  • $\mathbb{U}$ is the action space, a subset of a Polish space.
  • $\mathbb{Y}$ is the observation space, a subset of a Polish space.
  • $\mathbb{K}=\{(x,u) : x\in\mathbb{X},\ u\in\mathbb{U}(x)\}$ is the set of feasible state-action pairs, where $\mathbb{U}(x)\subseteq\mathbb{U}$ is the set of actions admissible at state $x$.
  • $\mathcal{T}$ is the state transition kernel: for each $(x_t,u_t)\in\mathbb{K}$, $\mathcal{T}(\cdot\mid x_t,u_t)$ is a probability measure on $\mathcal{B}(\mathbb{X})$, i.e. $\mathcal{T}(A\mid x_t,u_t)=P(x_{t+1}\in A\mid x_t,u_t)$ for $A\in\mathcal{B}(\mathbb{X})$.
  • $Q$ is the observation channel: for each $x_t\in\mathbb{X}$, $Q(\cdot\mid x_t)$ is a probability measure on $\mathcal{B}(\mathbb{Y})$, i.e. $Q(A\mid x_t)=P(y_t\in A\mid x_t)$ for $A\in\mathcal{B}(\mathbb{Y})$.
  • $c:\mathbb{K}\to\mathbb{R}$ is the cost function.
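As a concrete illustration of the tuple above, a finite POMDP (where all seven components are finite, so kernels reduce to probability vectors) can be sketched as a small data structure. This is a minimal hypothetical sketch: the class name `FinitePOMDP`, the `step` method, and the two-state example are illustrative assumptions, not part of the definition.

```python
import random
from dataclasses import dataclass


@dataclass
class FinitePOMDP:
    """A minimal finite POMDP sketch (hypothetical, for illustration).

    States, actions, and observations are indexed 0..n-1, so the
    transition kernel T and observation channel Q become lookup
    tables of probability vectors.
    """
    T: dict  # T[(x, u)] -> list of P(x' | x, u) over next states
    Q: dict  # Q[x]      -> list of P(y | x) over observations
    c: dict  # c[(x, u)] -> real-valued cost of pair (x, u) in K

    def step(self, x, u, rng=random):
        """Sample one transition: next state, observation, and cost."""
        n = len(self.T[(x, u)])
        x_next = rng.choices(range(n), weights=self.T[(x, u)])[0]
        y = rng.choices(range(len(self.Q[x_next])), weights=self.Q[x_next])[0]
        return x_next, y, self.c[(x, u)]


# Two-state example with a deterministic transition and channel,
# so the sampled trajectory is predictable.
pomdp = FinitePOMDP(
    T={(0, 0): [0.0, 1.0]},   # action 0 in state 0 always moves to state 1
    Q={1: [0.0, 1.0]},        # state 1 always emits observation 1
    c={(0, 0): 2.5},
)
print(pomdp.step(0, 0))  # (1, 1, 2.5)
```

Only the pairs in the feasibility set $\mathbb{K}$ appear as keys of `T` and `c`, mirroring the constraint that $c$ and $\mathcal{T}$ are defined on feasible state-action pairs only.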
