Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm

TLDR

The paper designs a multiagent Q-learning method for general-sum stochastic games and proves that it converges to a Nash equilibrium under specified conditions.
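To make the idea concrete, here is a minimal sketch of a Nash-style Q-update for two agents. It is not the paper's exact algorithm: the function names (`pure_nash`, `nash_q_update`), the tabular array layout, and the default step size and discount are illustrative assumptions. The sketch also searches only for pure-strategy equilibria of the stage game; a full treatment would need mixed-strategy equilibria, since a pure one does not always exist.

```python
import itertools
import numpy as np


def pure_nash(q1, q2):
    """Return the first pure-strategy Nash equilibrium (a1, a2) of the
    bimatrix stage game (q1, q2), or None if no pure equilibrium exists.

    q1[a1, a2] and q2[a1, a2] are the payoffs of agents 1 and 2.
    Simplifying assumption: only pure strategies are searched.
    """
    n1, n2 = q1.shape
    for a1, a2 in itertools.product(range(n1), range(n2)):
        best_for_1 = q1[a1, a2] >= q1[:, a2].max()  # agent 1 cannot gain by deviating
        best_for_2 = q2[a1, a2] >= q2[a1, :].max()  # agent 2 cannot gain by deviating
        if best_for_1 and best_for_2:
            return a1, a2
    return None


def nash_q_update(Q1, Q2, s, a1, a2, r1, r2, s_next, alpha=0.1, gamma=0.9):
    """Update both agents' Q-tables in place toward reward plus the
    discounted equilibrium value of the stage game at s_next.

    Q1 and Q2 are float arrays of shape (n_states, n_actions_1, n_actions_2).
    Returns False, leaving the tables unchanged, when the stage game
    at s_next has no pure-strategy equilibrium.
    """
    eq = pure_nash(Q1[s_next], Q2[s_next])
    if eq is None:
        return False
    e1, e2 = eq
    # Read both equilibrium values before writing, in case s == s_next.
    v1 = Q1[s_next, e1, e2]
    v2 = Q2[s_next, e1, e2]
    Q1[s, a1, a2] = (1 - alpha) * Q1[s, a1, a2] + alpha * (r1 + gamma * v1)
    Q2[s, a1, a2] = (1 - alpha) * Q2[s, a1, a2] + alpha * (r2 + gamma * v2)
    return True
```

The main difference from single-agent Q-learning is the bootstrap target: each agent keeps Q-values over joint actions, and the maximum over its own actions is replaced by its payoff at an equilibrium of the stage game defined by both agents' Q-values at the next state.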