A fully observed Markov control problem, otherwise known as a Markov decision process (MDP), is a five-tuple (X, U, K, T, c) where:
- X is the state space, a subset of a Polish space.
- U is the action space, a subset of a Polish space.
- K = {(x, u) : u ∈ U(x), x ∈ X} is the set of feasible state-control pairs.
- T is the state transition kernel, i.e. T(A | xₜ, uₜ) = P(xₜ₊₁ ∈ A | xₜ, uₜ).
- c : K → ℝ is the cost function.
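The five components above can be made concrete in the finite case. The following is a minimal sketch, assuming a finite state and action space represented by strings; all names (`FiniteMDP`, the example states and actions) are illustrative, not from the source:

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple

@dataclass(frozen=True)
class FiniteMDP:
    # Illustrative finite version of the tuple (X, U, K, T, c).
    X: FrozenSet[str]                           # state space
    U: Dict[str, FrozenSet[str]]                # feasible actions U(x) for each state x
    T: Dict[Tuple[str, str], Dict[str, float]]  # transition kernel T(. | x, u)
    c: Dict[Tuple[str, str], float]             # cost function on K

    @property
    def K(self) -> FrozenSet[Tuple[str, str]]:
        # K = {(x, u) : u in U(x), x in X}, the feasible state-control pairs
        return frozenset((x, u) for x in self.X for u in self.U[x])

# Tiny two-state example: from s0 one may "stay" or "move"; s1 is absorbing.
mdp = FiniteMDP(
    X=frozenset({"s0", "s1"}),
    U={"s0": frozenset({"stay", "move"}), "s1": frozenset({"stay"})},
    T={("s0", "stay"): {"s0": 1.0},
       ("s0", "move"): {"s1": 0.9, "s0": 0.1},
       ("s1", "stay"): {"s1": 1.0}},
    c={("s0", "stay"): 1.0, ("s0", "move"): 0.5, ("s1", "stay"): 0.0},
)

# Each T(. | x, u) is a probability measure over next states.
assert abs(sum(mdp.T[("s0", "move")].values()) - 1.0) < 1e-9
```

In the general (Polish-space) setting T would be a measurable kernel rather than a dictionary of distributions, but the dictionary form captures the same structure for finite X and U.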