# Bifurcation Analysis for Reinforcement Learning Agents

*This project aims to use bifurcation analysis techniques to study the dynamics of reinforcement learning agents.*

## Introduction

Applying reinforcement learning algorithms to multiagent domains may produce
complex, non-convergent dynamics.
The replicator dynamics, commonly used in evolutionary game theory, has proved effective for
modeling the learning dynamics in normal form games. Nonetheless, it is often interesting to
study how robust the learning dynamics are when either learning or structural
parameters are perturbed.
This amounts to unfolding the catalog of learning dynamical scenarios that arise for all possible
parameter settings which, unfortunately, cannot be obtained through "brute force" simulation of the
replicator dynamics. What is required instead is the analysis of bifurcations, i.e.,
critical parameter combinations at which the learning behavior undergoes radical changes.
In this
work, we introduce a one-parameter bifurcation analysis of Selten's Horse game, in which the
learning process exhibits a set of complex dynamical scenarios even for relatively small perturbations
of the payoffs.
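
To make the idea concrete, here is a minimal sketch of a one-parameter study on a toy problem. It is not the Selten's Horse analysis from the paper: it assumes a single-population replicator dynamics for a symmetric two-strategy game, and the payoff values, the parameter `p`, and all function names (`game`, `simulate`, `fixed_points`) are hypothetical choices made only for illustration. The sketch compares forward simulation from one initial condition with a direct classification of the fixed points.

```python
def game(p):
    """Hypothetical 2x2 symmetric game; p perturbs a single payoff entry."""
    return ((p, 0.0), (1.0, 0.5))


def payoff_gap(x, payoff):
    """Expected payoff of strategy 0 minus that of strategy 1 when a
    fraction x of the population plays strategy 0."""
    (a, b), (c, d) = payoff
    return (a * x + b * (1 - x)) - (c * x + d * (1 - x))


def replicator_rhs(x, payoff):
    """Right-hand side of the replicator equation dx/dt = x (1 - x) gap(x)."""
    return x * (1 - x) * payoff_gap(x, payoff)


def simulate(payoff, x0, dt=0.01, steps=20000):
    """Forward Euler integration; returns the final share of strategy 0."""
    x = x0
    for _ in range(steps):
        x += dt * replicator_rhs(x, payoff)
        x = min(1.0, max(0.0, x))  # keep the state inside [0, 1]
    return x


def fixed_points(payoff):
    """Fixed points in [0, 1], each labeled by the sign of d(rhs)/dx there."""
    (a, b), (c, d) = payoff
    # (x*, derivative of the right-hand side evaluated at x*)
    points = [(0.0, b - d), (1.0, c - a)]
    slope = (a - c) - (b - d)  # slope of payoff_gap as a function of x
    if slope != 0:
        x_int = (d - b) / slope
        if 0 < x_int < 1:
            points.append((x_int, x_int * (1 - x_int) * slope))

    def label(dg):
        if dg < 0:
            return "stable"
        if dg > 0:
            return "unstable"
        return "non-hyperbolic"  # linearization is inconclusive here

    return [(x, label(dg)) for x, dg in sorted(points)]


if __name__ == "__main__":
    for p in (0.5, 1.0, 1.25, 2.0):
        print(p, fixed_points(game(p)), round(simulate(game(p), 0.5), 3))
```

In this toy game the fixed point at `x = 1` changes stability at `p = 1`, where an interior fixed point enters the unit interval. A simulation started from `x0 = 0.5` only switches its long-run outcome at a larger value of `p`, because the unstable interior point must first fall below the initial condition. This is a small-scale version of why simulating from a few initial conditions does not reveal the critical parameter values directly.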

## Ongoing work

We are now investigating how to apply bifurcation analysis to the study of the dynamics of simple reinforcement learning systems.

### Selten's Horse Game

### Changing the Replicator Dynamics

## Resources

## People involved

- Alessandro Lazaric, PhD Student

- Josè Enrique Munoz, PhD Student

- Fabio Dercole, PhD

- Marcello Restelli, PhD

## Publications

*A. Lazaric, E. Munoz de Cote, F. Dercole, M. Restelli*

**Bifurcation Analysis of Reinforcement Learning Agents**

- Adaptive and Learning Agents and Multi-Agent Systems (ALAMAS) pp. 111--125, Maastricht, The Netherlands, April, 2007
- Bibtex
  - **Author:** A. Lazaric, E. Munoz de Cote, F. Dercole, M. Restelli
  - **Title:** Bifurcation Analysis of Reinforcement Learning Agents
  - **In:** Adaptive and Learning Agents and Multi-Agent Systems (ALAMAS)
  - **Address:** Maastricht, The Netherlands
  - **Date:** April 2007