Radner Krainak Theorem

Theorem (2.4.5)

Let

  • $\{ J;\Gamma^{i},i\in\mathcal{N} \}$ be a static stochastic team problem where $\mathbb{U}^{i}\equiv \mathbb{R}^{m_{i}},\ i\in\mathcal{N}$ (i.e. uncountable);
  • the loss function $L(\xi;\mathbf{u})$ is convex and continuously differentiable in $\mathbf{u}$ a.s.;
  • $J(\underline{\gamma})$ is bounded from below on $\mathbf{\Gamma}$;
  • $\underline{\gamma}^{*}$ be a policy $N$-tuple with a finite cost, and suppose that for every $\underline{\gamma}\in\mathbf{\Gamma}$ s.t. $J(\underline{\gamma})<\infty$,

$$\sum_{i\in\mathcal{N}}E\left[\nabla_{u^{i}}L(\xi;\underline{\gamma}^{*}(\mathbf{y}))[\gamma^{i}(y^{i})-\gamma^{i*}(y^{i})]\right]\ge 0 \tag{⭐}$$

    where $\nabla_{u^{i}}L(\xi;\underline{\gamma}^{*}(\mathbf{y}))$ stands for the partial derivatives of $L$ with respect to $u^{i}$, evaluated under the policy $\underline{\gamma}^{*}$.

Then $\underline{\gamma}^{*}$ is a team-optimal policy, and it is unique if $L$ is strictly convex in $\mathbf{u}$.
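
One way to see why (⭐) is enough is the first-order inequality for convex, differentiable functions. What follows is a sketch of that standard argument, not the textbook's full proof; it assumes every expectation written below is well defined. Since $L(\xi;\cdot)$ is convex and continuously differentiable a.s., for any $\underline{\gamma}\in\mathbf{\Gamma}$ with $J(\underline{\gamma})<\infty$,

$$L(\xi;\underline{\gamma}(\mathbf{y}))-L(\xi;\underline{\gamma}^{*}(\mathbf{y}))\ \ge\ \sum_{i\in\mathcal{N}}\nabla_{u^{i}}L(\xi;\underline{\gamma}^{*}(\mathbf{y}))[\gamma^{i}(y^{i})-\gamma^{i*}(y^{i})]\quad \text{a.s.}$$

Taking expectations on both sides and applying (⭐) gives

$$J(\underline{\gamma})-J(\underline{\gamma}^{*})\ \ge\ \sum_{i\in\mathcal{N}}E\left[\nabla_{u^{i}}L(\xi;\underline{\gamma}^{*}(\mathbf{y}))[\gamma^{i}(y^{i})-\gamma^{i*}(y^{i})]\right]\ \ge\ 0.$$

Policies with $J(\underline{\gamma})=\infty$ cannot do better than $\underline{\gamma}^{*}$, so $\underline{\gamma}^{*}$ is team-optimal. For uniqueness, assuming $\mathbf{\Gamma}$ is convex, two optimal policies that differ on a set of positive probability would have a midpoint policy with strictly lower cost when $L$ is strictly convex, which contradicts optimality.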

Remark

As noted in the textbook, this theorem is needed because we are now considering uncountable (but still finite-dimensional) measurement spaces. In this setting hypothesis one, $(c.1)$, no longer implies that our policy space $\mathbf{\Gamma}$ is compact.

Assumptions

$(c.5)$

For all $\underline{\gamma}\in\mathbf{\Gamma}$ s.t. $J(\underline{\gamma})<\infty$, the following RVs are integrable:

$$\nabla_{u^{i}}L(\xi;\underline{\gamma}^{*}(\mathbf{y}))[\gamma^{i}(y^{i})-\gamma^{i*}(y^{i})],\quad i\in\mathcal{N}$$

$(c.6)$

$\Gamma^{i}$ is a Hilbert space for each $i\in\mathcal{N}$, and $J(\underline{\gamma})<\infty$ for all $\underline{\gamma}\in\mathbf{\Gamma}$. Furthermore,

$$E_{\xi|y^{i}}\left[ \nabla_{u^{i}}L(\xi;\underline{\gamma}^{*}(\mathbf{y})) \right]\in\Gamma^{i},\quad i\in\mathcal{N}$$

>[!thm] Stationary Radner Krainak
>Let $\{ J;\Gamma^{i},i\in\mathcal{N} \}$ be a static stochastic team problem which satisfies all of the hypotheses of Theorem 2.4.5, with the exception of (⭐). Instead, let either $(c.5)$ or $(c.6)$ hold. Then, if $\underline{\gamma}^{*}\in\mathbf{\Gamma}$ is a stationary policy, it is also team-optimal. Such a policy is unique if $L(\xi;\mathbf{u})$ is strictly convex in $\mathbf{u}$ a.s.
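
The link between stationarity and (⭐) can be sketched under $(c.5)$. The sketch assumes that stationarity of $\underline{\gamma}^{*}$ is read as the conditional first-order condition $E\left[\nabla_{u^{i}}L(\xi;\underline{\gamma}^{*}(\mathbf{y}))\mid y^{i}\right]=0$ a.s. for each $i\in\mathcal{N}$. That reading presumes the gradient and the conditional expectation can be interchanged, which is not verified here. Because $\gamma^{i}(y^{i})-\gamma^{i*}(y^{i})$ is a function of $y^{i}$ alone, conditioning on $y^{i}$ and using the integrability in $(c.5)$ gives, for each $i$,

$$E\left[\nabla_{u^{i}}L(\xi;\underline{\gamma}^{*}(\mathbf{y}))[\gamma^{i}(y^{i})-\gamma^{i*}(y^{i})]\right]=E\left[E\left[\nabla_{u^{i}}L(\xi;\underline{\gamma}^{*}(\mathbf{y}))\mid y^{i}\right][\gamma^{i}(y^{i})-\gamma^{i*}(y^{i})]\right]=0.$$

Summing over $i\in\mathcal{N}$ shows that (⭐) holds with equality, so the conclusion of Theorem 2.4.5 applies.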
