A deterministic stationary control policy $\gamma\in\Gamma_{S}$, with $\gamma:\mathbb{X}\to\mathbb{U}$, is a sequence of identical functions $\{\gamma,\gamma,\dots\}$ such that $u_{t}=\gamma(x_{t}), \ \forall t\in\mathbb{Z}_{+}$.
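As a rough illustration of the definition, the sketch below applies one fixed map $\gamma$ at every time step, so that $u_t=\gamma(x_t)$ for all $t$. The integer state and action spaces, the threshold rule inside `gamma`, the deterministic transition `step`, and the helper `rollout` are all hypothetical choices made for this example; the note itself does not specify any dynamics.

```python
from typing import Callable, List, Tuple

State = int   # stand-in for an element of X (assumption)
Action = int  # stand-in for an element of U (assumption)


def rollout(
    gamma: Callable[[State], Action],
    step: Callable[[State, Action], State],
    x0: State,
    horizon: int,
) -> Tuple[List[State], List[Action]]:
    """Apply the same map gamma at every time step: u_t = gamma(x_t)."""
    xs, us = [x0], []
    x = x0
    for _ in range(horizon):
        u = gamma(x)    # stationary: the rule does not depend on t
        x = step(x, u)  # hypothetical deterministic dynamics
        us.append(u)
        xs.append(x)
    return xs, us


def gamma(x: State) -> Action:
    # Hypothetical threshold policy: push the state up until it reaches 3.
    return 1 if x < 3 else 0


def step(x: State, u: Action) -> State:
    # Hypothetical dynamics: add the action, capped at 5.
    return min(x + u, 5)


xs, us = rollout(gamma, step, x0=0, horizon=5)
```

Because the policy is stationary and deterministic, the action at any time depends only on the current state, so revisiting a state always produces the same action.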
Markov Policy induces Markov Chain
Controllability & Observability w.r.t. Riccati
Q-Learning
Optimality of Q Learning Algorithm