Oases of Cooperation: An Empirical Evaluation of Reinforcement Learning in the Iterated Prisoner’s Dilemma

Paper by John Burden, Peter Barnett
Published on 13 March 2022


When building safe AI systems, it is extremely important to ensure that those systems behave cooperatively, even when they have incentives to act selfishly. In many cases, even when game-theoretic solutions allow for cooperation, actually getting AI systems to converge on those solutions through training is difficult. In this paper we empirically evaluate how reinforcement learning agents can be encouraged to cooperate (without opening themselves up to exploitation) by selecting appropriate hyperparameters and environmental perceptions for the agent. Our results in the multi-agent scenario indicate that hyperparameter-space contains isolated “oases” of mutual cooperation, and that small changes to these hyperparameters can lead to sharp drops into non-cooperative behaviour.
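To make the setting concrete, here is a minimal sketch of the kind of experiment the abstract describes: two tabular Q-learning agents repeatedly playing the Prisoner's Dilemma, where each agent perceives the previous joint action as its state. This is an illustrative toy, not the authors' actual implementation; the payoff values, hyperparameters (`alpha`, `gamma`, `epsilon`), and memory-one state encoding are assumptions for the sake of the example.

```python
import random

# Standard Prisoner's Dilemma payoffs (row player, column player).
# Actions: 0 = cooperate, 1 = defect. Values are illustrative.
PAYOFFS = {
    (0, 0): (3, 3),  # mutual cooperation (reward R)
    (0, 1): (0, 5),  # sucker's payoff S vs temptation T
    (1, 0): (5, 0),
    (1, 1): (1, 1),  # mutual defection (punishment P)
}

class QAgent:
    """Tabular Q-learner whose state is the previous joint action."""

    def __init__(self, alpha=0.1, gamma=0.95, epsilon=0.05, seed=0):
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        self.q = {}  # state -> [Q(cooperate), Q(defect)]

    def _row(self, state):
        return self.q.setdefault(state, [0.0, 0.0])

    def act(self, state):
        # Epsilon-greedy action selection.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(2)
        row = self._row(state)
        return 0 if row[0] >= row[1] else 1

    def update(self, state, action, reward, next_state):
        # One-step Q-learning update.
        row = self._row(state)
        target = reward + self.gamma * max(self._row(next_state))
        row[action] += self.alpha * (target - row[action])

def play(agent_a, agent_b, rounds=5000):
    """Run the iterated game; return the fraction of mutually cooperative rounds."""
    state_a = state_b = None  # no history before the first round
    mutual_coop = 0
    for _ in range(rounds):
        act_a, act_b = agent_a.act(state_a), agent_b.act(state_b)
        r_a, r_b = PAYOFFS[(act_a, act_b)]
        next_a, next_b = (act_a, act_b), (act_b, act_a)  # each sees (own, other)
        agent_a.update(state_a, act_a, r_a, next_a)
        agent_b.update(state_b, act_b, r_b, next_b)
        state_a, state_b = next_a, next_b
        mutual_coop += (act_a, act_b) == (0, 0)
    return mutual_coop / rounds

coop_rate = play(QAgent(seed=1), QAgent(seed=2))
```

Sweeping `alpha`, `gamma`, and `epsilon` over a grid and plotting the resulting cooperation rate is one simple way to probe for the "oases" of cooperation the paper reports: whether the learners settle into mutual cooperation or mutual defection can change sharply with small hyperparameter shifts.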
