Definition (Markov Decision Process)
A fully observed Markov control problem, otherwise known as an MDP, is a five-tuple (X, U, K, T, c) where:
- X is the state space, a subset of a Polish space.
- U is the action space, a subset of a Polish space.
- K = {(x,u) : x ∈ X, u ∈ U(x)} is the set of feasible state-control pairs, where U(x) ⊆ U denotes the controls admissible in state x.
- T is the state transition kernel, i.e. T(A ∣ x_t, u_t) = P(x_{t+1} ∈ A ∣ x_t, u_t) for measurable A ⊆ X.
- c : K → ℝ is the cost function.
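To make the five components concrete, here is a minimal sketch of a finite MDP in Python. The specific states, controls, kernel values, and costs are illustrative assumptions, not from the definition above; in the finite case the transition kernel T(· ∣ x, u) reduces to a probability vector over next states.

```python
import random

# Illustrative finite MDP (all names and numbers are assumptions for this sketch).
X = ["s0", "s1"]          # state space
U = ["stay", "move"]      # control space

# K: feasible state-control pairs; here every control is admissible everywhere.
K = {(x, u) for x in X for u in U}

# T: transition kernel, a distribution over next states for each (x, u) in K.
T = {
    ("s0", "stay"): {"s0": 0.9, "s1": 0.1},
    ("s0", "move"): {"s0": 0.2, "s1": 0.8},
    ("s1", "stay"): {"s0": 0.1, "s1": 0.9},
    ("s1", "move"): {"s0": 0.7, "s1": 0.3},
}

# c: K -> R; an arbitrary cost charging 1 per move and 0 for staying.
def c(x, u):
    assert (x, u) in K, "control must be feasible in state x"
    return 1.0 if u == "move" else 0.0

def step(x, u, rng=random):
    """Sample x_{t+1} ~ T(. | x_t, u_t); return (next_state, incurred_cost)."""
    dist = T[(x, u)]
    r, acc = rng.random(), 0.0
    for nxt, p in dist.items():
        acc += p
        if r < acc:
            return nxt, c(x, u)
    return nxt, c(x, u)  # guard against floating-point round-off
```

Each row of T sums to 1, as a transition kernel must; `step` simulates one stage of the controlled Markov chain.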