Solutions for finite static teams

#StochasticControl

> [!thm] Existence for $N$-agent team
> For an $N$-agent static stochastic team problem satisfying the four hypotheses, there exists at least one team-optimal solution.

Corollary

We can relax $(c.1)$ into $(c.1')$ and the result still holds:

  1. $(c.1')$ Let $\mathcal{N}_{h}$ and $\mathcal{N}_{s}$ be two complementary subsets of $\mathcal{N}$ (i.e. $\mathcal{N}_{h}\cup \mathcal{N}_{s}=\mathcal{N}$ and $\mathcal{N}_{h}\cap \mathcal{N}_{s}=\emptyset$) s.t. $S^{i}$ is compact $\forall i\in\mathcal{N}_{h}$ and $S^{j}\equiv \mathbb{U}^{j}$ $\forall j\in\mathcal{N}_{s}$. Assume that, as $\sum_{j\in \mathcal{N}_{s}}|u^{j}|\to \infty$, we have $L(\xi;u^{1},\dots,u^{N})\to \infty$ a.s., for every fixed $u^{i}\in S^{i}$, $i\in \mathcal{N}_{h}$.

Theorem (2.4.3)

In addition to the four hypotheses, let

  1. $S^{i}$ be a convex set for each $i\in\mathcal{N}$, and
  2. $L(\xi;\cdot)$ be strictly convex on $\mathbf{U}$ a.s.

Then the stochastic team problem admits a unique team-optimal solution.

Lemma (2.4.1)

Let $L:\mathbb{R}^{m_{1}}\times\dots \times \mathbb{R}^{m_{N}}\to \mathbb{R}$ be a convex (deterministic) loss function, with person-by-person (pbp) optimal solution $\mathbf{u}^{\circ}:=(u^{1\circ},\dots,u^{N\circ})$. If $L$ is continuously differentiable at $\mathbf{u}^{\circ}$, then $\mathbf{u}^{\circ}$ is globally (team) optimal.

Intuition

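So, for a convex loss, all we need is differentiability at the pbp solution for that solution to also be team-optimal. Convexity alone is not enough: at a kink, each agent can be stuck individually while a joint move would still lower the loss.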
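To see why the differentiability assumption in the lemma matters, here is a small numerical sketch. Both losses are hypothetical examples picked for illustration (they are not from the notes): `nonsmooth_loss` is convex with a kink along $u^{1}=u^{2}$, while `smooth_loss` is convex and continuously differentiable. Unilateral deviations are only checked on a finite grid, so this is a sanity check rather than a proof.

```python
# Hypothetical two-agent deterministic losses, chosen only for illustration.

def nonsmooth_loss(u1, u2):
    """Convex, but not differentiable at any point with u1 == u2."""
    return abs(u1 - u2) + 0.5 * (u1 + u2) ** 2


def smooth_loss(u1, u2):
    """Convex and continuously differentiable everywhere."""
    return (u1 - u2) ** 2 + 0.5 * (u1 + u2) ** 2


# Candidate unilateral deviations: a uniform grid on [-2, 2] with step 0.001.
GRID = [k / 1000 for k in range(-2000, 2001)]


def is_pbp_optimal(loss, u1, u2):
    """True if no single agent can lower the loss by deviating alone (on GRID)."""
    here = loss(u1, u2)
    return all(loss(v, u2) >= here and loss(u1, v) >= here for v in GRID)


# With the kink, (0.25, 0.25) is pbp optimal although (0, 0) has a lower loss.
print(is_pbp_optimal(nonsmooth_loss, 0.25, 0.25))
print(nonsmooth_loss(0.25, 0.25), nonsmooth_loss(0.0, 0.0))

# Without the kink, (0.25, 0.25) is no longer pbp optimal, but (0, 0) is.
print(is_pbp_optimal(smooth_loss, 0.25, 0.25))
print(is_pbp_optimal(smooth_loss, 0.0, 0.0))
```

In the nonsmooth case each agent's one-sided slopes at $(0.25, 0.25)$ are $-0.5$ and $1.5$, so neither agent gains by moving alone, yet moving both controls to $(0,0)$ drops the loss from $0.125$ to $0$. This is exactly the situation the lemma rules out by requiring differentiability at $\mathbf{u}^{\circ}$.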

Now, using this lemma and the definition of a Stationary Team Policy we can show the following:

Theorem (2.4.4)

For an $N$-agent static stochastic team problem, let

  1. the hypotheses $(c.3)$ and $(c.4)$ be satisfied;
  2. $S^{i}$ be an open convex subset of a finite-dimensional vector space for each $i\in\mathcal{N}$;
  3. $L(\xi;\cdot)$ be convex and continuously differentiable on $\mathbf{S}:=S^{1}\times\dots \times S^{N}$.

Under these conditions, if the policy $\underline{\gamma}^{\circ}$, taking values in $\mathbf{S}$, is stationary, it is team-optimal.
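A finite toy instance can make this theorem concrete. The sketch below assumes that "stationary" means the conditional expectation of $\partial L/\partial u^{i}$ given agent $i$'s own observation vanishes for every $i$ (the definition itself lives in the Stationary Team Policy note, so treat this reading as an assumption). The joint pmf `P`, the state $x = y^{1}+y^{2}$, and the quadratic loss are all hypothetical choices, with $S^{i}=\mathbb{R}$.

```python
# Hypothetical toy team: binary observations y1, y2 with an assumed joint pmf,
# and a state x = y1 + y2 that both agents try to track.
P = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}


def loss(x, u1, u2):
    """Convex and continuously differentiable in (u1, u2)."""
    return (u1 + u2 - x) ** 2 + u1 ** 2 + u2 ** 2


def expected_cost(g1, g2):
    """J(gamma) for policies stored as lists: g1[y1], g2[y2]."""
    return sum(p * loss(y1 + y2, g1[y1], g2[y2]) for (y1, y2), p in P.items())


def stationarity_residuals(g1, g2):
    """E[dL/du_i * 1{y_i = v}] for each agent i and each observation v.

    Each entry is zero exactly when the conditional expectation of the
    partial derivative given y_i = v is zero (the masses are positive).
    """
    r1 = [sum(p * (4 * g1[y1] + 2 * g2[y2] - 2 * (y1 + y2))
              for (y1, y2), p in P.items() if y1 == v) for v in (0, 1)]
    r2 = [sum(p * (2 * g1[y1] + 4 * g2[y2] - 2 * (y1 + y2))
              for (y1, y2), p in P.items() if y2 == v) for v in (0, 1)]
    return r1 + r2


def solve_stationary(sweeps=100):
    """Alternate exact best responses; each update zeroes one residual."""
    g1, g2 = [0.0, 0.0], [0.0, 0.0]
    for _sweep in range(sweeps):
        for v in (0, 1):
            rows = [(y2, p) for (y1, y2), p in P.items() if y1 == v]
            mass = sum(p for _, p in rows)
            g1[v] = sum(p * (v + y2 - g2[y2]) for y2, p in rows) / (2 * mass)
        for v in (0, 1):
            rows = [(y1, p) for (y1, y2), p in P.items() if y2 == v]
            mass = sum(p for _, p in rows)
            g2[v] = sum(p * (y1 + v - g1[y1]) for y1, p in rows) / (2 * mass)
    return g1, g2


g1, g2 = solve_stationary()
print(g1, g2)
print(stationarity_residuals(g1, g2))
print(expected_cost(g1, g2))
```

Any joint perturbation of the resulting policy pair should not lower `expected_cost`, which is what the theorem predicts for a stationary policy under a convex, continuously differentiable loss. The alternating best-response loop is just one convenient way to reach a stationary point in this quadratic example, not a general-purpose solver.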
