Admissible Policy

Definition
StochasticControl

Let $\mathbb{H}_{0}:=\mathbb{X}$ and $\mathbb{H}_{t}=\mathbb{H}_{t-1}\times \mathbb{K}$ for $t=1,2,\dots$. We let $I_{t}$ denote an element of $\mathbb{H}_{t}$, where
$$I_{t}=\{ x_{[0,t]},u_{[0,t-1]} \}.$$
A deterministic admissible control policy $\gamma$ is a sequence of functions $\{\gamma_{t},t\in\mathbb{Z}_{+}\}$ such that $\gamma_{t}:\mathbb{H}_{t}\to \mathbb{U}$, with $u_{t}=\gamma_{t}(I_{t})$.
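As an illustration, the relation $u_{t}=\gamma_{t}(I_{t})$ can be sketched in Python for finite spaces. Everything concrete here is an assumption made for the example and not part of the definition: integer states and actions, the threshold rule inside `gamma_t`, and the noise-free dynamics `toy_step`. The only point is that the policy receives the full history $I_{t}$ (all states up to time $t$ and all actions up to time $t-1$) and returns a single action.

```python
from typing import Callable, List, Tuple

# I_t = (x_[0,t], u_[0,t-1]): all states so far and all past actions.
History = Tuple[Tuple[int, ...], Tuple[int, ...]]


def gamma_t(history: History) -> int:
    """Hypothetical deterministic policy map gamma_t: H_t -> U, with U = {0, 1}."""
    states, actions = history
    # An element of H_t holds exactly one more state than actions.
    assert len(states) == len(actions) + 1
    # Illustrative rule that uses the whole history, not only the current state.
    return 1 if sum(states) - sum(actions) >= 3 else 0


def toy_step(x: int, u: int) -> int:
    """Hypothetical noise-free dynamics: action 1 resets the state, action 0 increments it."""
    return 0 if u == 1 else x + 1


def rollout(
    policy: Callable[[History], int], x0: int, horizon: int
) -> Tuple[List[int], List[int]]:
    """Generate u_t = gamma_t(I_t) for t = 0, ..., horizon - 1."""
    states: List[int] = [x0]
    actions: List[int] = []
    for _ in range(horizon):
        # The policy sees the information I_t available at time t and nothing else.
        u = policy((tuple(states), tuple(actions)))
        actions.append(u)
        states.append(toy_step(states[-1], u))
    return states, actions


states, actions = rollout(gamma_t, x0=0, horizon=4)
print(states, actions)
```

Because the rule depends on sums over the history, two histories ending in the same state can produce different actions, which is exactly what distinguishes a general admissible policy from one that looks only at $x_{t}$.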

We can also state this as follows:

> [!remark|*] Alternate Definition
> Let $u_{t}$ be a realization of the action random variable $U_{t}$ under an admissible policy, and let us emphasize that $H_{t}$ is a random variable with realization $I_{t}$. We say that $\gamma_{t}$ is measurable on $\sigma(H_{t})$ in the sense that for every Borel subset $B\subset \mathbb{U}$ we have
> $$\{ \omega :U_{t}(\omega)\in B \}=U_{t}^{-1}(B)\in\sigma(H_{t}).$$

A randomized admissible control policy is a sequence $\gamma=\{ \gamma_{t},t\ge 0 \}$ such that $\gamma_{t}:\mathbb{H}_{t}\to \mathcal{P}(\mathbb{U})$, where $\mathcal{P}(\mathbb{U})$ is the set of probability measures on $\mathbb{U}$, so that for every realization $I_{t}$, $\gamma_{t}(I_{t})$ is a probability measure on $\mathbb{U}$. By Stochastic Realization arguments, this is equivalent to writing $u_{t}=\gamma_{t}(I_{t},r_{t})$ for some i.i.d. sequence of $[0,1]$-valued random variables $r_{t}$.
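For a finite action set, the equivalence between the kernel form $\gamma_{t}(I_{t})\in\mathcal{P}(\mathbb{U})$ and the realized form $u_{t}=\gamma_{t}(I_{t},r_{t})$ can be sketched with inverse-CDF sampling. This is a minimal sketch under stated assumptions, not the general construction: the three-action set, the probabilities in `gamma_t_kernel`, and the choice of $r_{t}$ uniform on $[0,1)$ are all hypothetical.

```python
import random
from typing import List, Tuple

# I_t = (x_[0,t], u_[0,t-1]): all states so far and all past actions.
History = Tuple[Tuple[int, ...], Tuple[int, ...]]


def gamma_t_kernel(history: History) -> List[float]:
    """Hypothetical kernel gamma_t(I_t): a probability vector over U = {0, 1, 2}."""
    states, _ = history
    return [0.5, 0.3, 0.2] if states[-1] % 2 == 0 else [0.1, 0.1, 0.8]


def gamma_t_realized(history: History, r: float) -> int:
    """Deterministic map (I_t, r_t) -> u_t obtained by inverting the CDF of gamma_t(I_t)."""
    probs = gamma_t_kernel(history)
    cumulative = 0.0
    for u, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return u
    # Guard against floating-point round-off when r is very close to 1.
    return len(probs) - 1


rng = random.Random(0)  # r_t drawn i.i.d. uniform on [0, 1), independent of the history
history: History = ((0,), ())
samples = [gamma_t_realized(history, rng.random()) for _ in range(20000)]
freq = [samples.count(u) / len(samples) for u in range(3)]
print(freq)
```

All the randomness sits in $r_{t}$: once `r` is fixed, `gamma_t_realized` is an ordinary deterministic function of its arguments, and the empirical action frequencies should approach the probabilities returned by `gamma_t_kernel`.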
