Required Conditions for Finite Memory Q-learning

🌱

In [1], two fundamental properties are required to determine a Lipschitz constant for the filter stochastic kernel $\eta$ under the bounded-Lipschitz distance. These are:

1. Filter Stability
2. Exponential Filter Stability

[2] defines property 2, exponential filter stability, as follows:

>[!def] Exponential Filter Stability
> If $(1-\delta(\mathcal{T}))(1-\delta(Q))<1$, where $\delta(\mathcal{T})$ and $\delta(Q)$ are the Dobrushin ergodic coefficients of the transition and observation kernels of the POMDP, then we say the filter is exponentially stable.
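
For a POMDP with finite state and observation spaces, the condition in the definition can be checked numerically. The sketch below is only an illustration and is not taken from [1] or [2]: it assumes each kernel is given as a row-stochastic matrix (a list of rows), and it uses the standard finite-space form of the Dobrushin coefficient, $\delta(K)=\min_{i,k}\sum_j \min(K(i,j),K(k,j))$. The function names and the example matrices `T` and `Q` are hypothetical.

```python
from itertools import combinations


def dobrushin_coefficient(kernel):
    """Dobrushin coefficient of a finite row-stochastic kernel.

    Computed as the smallest overlap sum_j min(K[i][j], K[k][j])
    over all pairs of distinct rows i, k.
    """
    if len(kernel) < 2:
        # A single row overlaps fully with itself.
        return 1.0
    return min(
        sum(min(a, b) for a, b in zip(row_i, row_k))
        for row_i, row_k in combinations(kernel, 2)
    )


def stability_condition(transition, observation):
    """Evaluate (1 - delta(T)) * (1 - delta(Q)) and test whether it is < 1.

    This is the condition exactly as written in the definition above.
    """
    alpha = (1 - dobrushin_coefficient(transition)) * (
        1 - dobrushin_coefficient(observation)
    )
    return alpha < 1, alpha


# Hypothetical two-state example (rows sum to 1).
T = [[0.9, 0.1], [0.2, 0.8]]  # transition kernel
Q = [[0.7, 0.3], [0.4, 0.6]]  # observation kernel

is_stable, alpha = stability_condition(T, Q)
print(is_stable, round(alpha, 4))
```

A coefficient close to 1 means every pair of rows of the kernel overlaps heavily, so the kernel forgets its input quickly; a coefficient of 0 means some pair of rows has disjoint support.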

Pretty much what we have is that

Bib

[1] A. Kara and S. Yüksel, “Near optimality of finite memory feedback policies in partially observed Markov decision processes,” Journal of Machine Learning Research, vol. 23, no. 11, pp. 1–46, 2022.

[2] C. McDonald and S. Yüksel, “Exponential filter stability via Dobrushin’s coefficient,” Electronic Communications in Probability, vol. 25, pp. 1–13, Jan. 2020, doi: 10.1214/20-ECP333.