A fully observed Markov control problem, otherwise known as a Markov decision process (MDP), is a five-tuple (X, U, K, T, c) where:
- X is the state space, a subset of a Polish space.
- U is the action space, a subset of a Polish space.
- K = {(x, u) : u ∈ U(x), x ∈ X} is the set of feasible state-control pairs.
- T is the state transition kernel, i.e. T(A | xₜ, uₜ) = P(xₜ₊₁ ∈ A | xₜ, uₜ).
- c : K → ℝ is the cost function.
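The five components above can be made concrete in the finite case. The following is a minimal sketch, assuming a finite state and action space represented by strings; all names (`FiniteMDP`, the example states and actions) are illustrative, not from the source:

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple

@dataclass(frozen=True)
class FiniteMDP:
    # Illustrative finite version of the tuple (X, U, K, T, c).
    X: FrozenSet[str]                           # state space
    U: Dict[str, FrozenSet[str]]                # feasible actions U(x) for each state x
    T: Dict[Tuple[str, str], Dict[str, float]]  # transition kernel T(. | x, u)
    c: Dict[Tuple[str, str], float]             # cost function on K

    @property
    def K(self) -> FrozenSet[Tuple[str, str]]:
        # K = {(x, u) : u in U(x), x in X}, the feasible state-control pairs
        return frozenset((x, u) for x in self.X for u in self.U[x])

# Tiny two-state example: from s0 one may "stay" or "move"; s1 is absorbing.
mdp = FiniteMDP(
    X=frozenset({"s0", "s1"}),
    U={"s0": frozenset({"stay", "move"}), "s1": frozenset({"stay"})},
    T={("s0", "stay"): {"s0": 1.0},
       ("s0", "move"): {"s1": 0.9, "s0": 0.1},
       ("s1", "stay"): {"s1": 1.0}},
    c={("s0", "stay"): 1.0, ("s0", "move"): 0.5, ("s1", "stay"): 0.0},
)

# Each T(. | x, u) is a probability measure over next states.
assert abs(sum(mdp.T[("s0", "move")].values()) - 1.0) < 1e-9
```

In the general (Polish-space) setting T would be a measurable kernel rather than a dictionary of distributions, but the dictionary form captures the same structure for finite X and U.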