Google Research Football (GRF)

Evaluation Metrics:

Approaches:

Original paper baselines
- IMPALA (Importance Weighted Actor-Learner Architectures) [7.8 on Easy]
- Ape-X DQN (Distributed Prioritized Experience Replay with DQN) [6.5 on Easy]
- PPO [2.8 on Easy]
MARL
- TiZero
  - Self-play and JRPO (Joint-Ratio Policy Optimization)
- TiKick
  - MAPPO; trained on self-play data from an expert (WeKick, winner of the Kaggle GRF competition)
- JiDi_3rd
- Multi-Agent Reinforcement Learning is a Sequence Modeling Problem
  
  Uses the Multi-Agent Transformer (MAT)
- Multi-Agent Reinforcement Learning is a Sequence Modeling Problem
  
  Survey of MARL techniques: IPPO, MAPPO (Multi-Agent PPO), HAPPO (Heterogeneous-Agent PPO), A2PO, MAT
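
Several of the methods listed above (PPO, IPPO, MAPPO) share PPO's clipped surrogate objective. As a reminder of what that objective computes, here is a minimal pure-Python sketch. The function name `ppo_clip_loss`, its argument names, and the default `clip_eps=0.2` are illustrative assumptions and are not taken from any of the papers or codebases above. In the multi-agent variants the same loss would typically be applied per agent, with advantages coming from either a local critic (IPPO) or a centralized one (MAPPO).

```python
import math


def ppo_clip_loss(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean clipped surrogate loss (to be minimized) over a batch.

    new_logps / old_logps: log-probabilities of the taken actions under
    the current policy and the data-collecting policy.
    advantages: advantage estimates for the same samples.
    clip_eps: clipping range (0.2 is an assumed, commonly used default).
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s).
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic bound: take the smaller of the two surrogates.
        total += min(ratio * adv, clipped * adv)
    # Negate because the surrogate is maximized but a loss is minimized.
    return -total / len(advantages)
```

When the new and old log-probabilities are identical, the ratio is 1 for every sample and the loss reduces to the negative mean advantage. Clipping only changes the result once the policy has moved far enough for the ratio to leave the `[1 - clip_eps, 1 + clip_eps]` band.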