🌱
Let be a controlled Markov chain. Consider the Finite Horizon Optimization problem: where we seek to minimize the cost over all admissible policies. Any such policy can be replaced with one which is Markov and which is at least as good as the original policy. i.e. there is no loss in restricting policies to be Markov.