Evolutionary Computation

Cartesian Genetic Programming


Evolving simple programs for playing Atari games


Dennis G Wilson, Sylvain Cussat-Blanc, Hervé Luga

Overview

  • Arcade Learning Environment
  • Cartesian Genetic Programming
  • Atari playing with CGP
  • Example programs from results

Arcade Learning Environment

[Bellemare et al., 2013]

  • HyperNEAT [Hausknecht, 2012]
  • DQN [Mnih et al., 2015]
  • Dueling DQN [Wang et al., 2015]
  • Prioritized experience replay [Schaul et al., 2015]
  • Double DQN [van Hasselt et al., 2016]
  • A3C [Mnih et al., 2016]
  • Tangled Program Graphs [Kelly and Heywood, 2017]

Deep reinforcement learning

[Mnih et al., 2015]

HyperNEAT: Multiple representations

[Hausknecht, 2012]

TPG: Decimal Feature Grid

[Kelly and Heywood, 2017]

TPG: Multi-Task Learning

[Kelly and Heywood, 2017]

Cartesian Genetic Programming

[Miller, 2011]

Floating Point Cartesian Genetic Programming

[Wilson, 2018]

CGP: Junk nodes

[Miller, 2001]
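
A minimal sketch of how junk (inactive) nodes can be identified, assuming a hypothetical genome layout in which every node is a (function_id, input_a, input_b) triple and every function reads both inputs. The names active_nodes and junk_nodes are illustrative and do not come from the cited work.

```python
from typing import List, Set, Tuple

# Hypothetical layout: indices 0..n_inputs-1 address program inputs,
# and node i of the genome has index n_inputs + i.
Node = Tuple[int, int, int]  # (function_id, input_a, input_b)


def active_nodes(nodes: List[Node], outputs: List[int], n_inputs: int) -> Set[int]:
    """Return the indices of nodes reachable backwards from the outputs."""
    active: Set[int] = set()
    stack = [o for o in outputs if o >= n_inputs]
    while stack:
        idx = stack.pop()
        if idx in active:
            continue
        active.add(idx)
        _, a, b = nodes[idx - n_inputs]
        # Assumption: every function uses both inputs; program inputs are skipped.
        stack.extend(src for src in (a, b) if src >= n_inputs)
    return active


def junk_nodes(nodes: List[Node], outputs: List[int], n_inputs: int) -> Set[int]:
    """Return the indices of nodes that never influence any output."""
    all_nodes = set(range(n_inputs, n_inputs + len(nodes)))
    return all_nodes - active_nodes(nodes, outputs, n_inputs)


# Two program inputs (0, 1) and three nodes (2, 3, 4); only node 4 is an output.
example = [(0, 0, 1), (1, 0, 0), (0, 2, 1)]
print(junk_nodes(example, outputs=[4], n_inputs=2))
```

In this example node 3 is never read by the output path, so a mutation to its genes leaves the program's behaviour unchanged until a later mutation connects it.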

Mixed-Type CGP

[Harding et al., 2012]

Image-Processing CGP

[Harding et al., 2013]

Cartesian Genetic Programming

[Harding et al., 2013]

Cartesian Genetic Programming

[Paris et al., 2015]

CGP for Atari playing


Evolution: 1+λ evolution strategy (ES)
Fitness: total reward over one game
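
The loop can be sketched as follows, under stated assumptions: genomes are lists of floats in [0, 1], mutation resamples genes uniformly, and toy_fitness is a hypothetical stand-in for the total reward of one game. The parameter values (lam, rate, generations) are illustrative, not the settings used in the experiments.

```python
import random
from typing import Callable, List, Tuple

Genome = List[float]


def mutate(parent: Genome, rate: float, rng: random.Random) -> Genome:
    """Resample each gene uniformly in [0, 1] with probability `rate`."""
    return [rng.random() if rng.random() < rate else g for g in parent]


def one_plus_lambda(
    fitness: Callable[[Genome], float],
    n_genes: int,
    lam: int = 4,
    rate: float = 0.1,
    generations: int = 100,
    seed: int = 0,
) -> Tuple[Genome, float]:
    """Keep one parent; each generation, create `lam` mutants of it."""
    rng = random.Random(seed)
    parent = [rng.random() for _ in range(n_genes)]
    parent_fit = fitness(parent)
    for _ in range(generations):
        offspring = [mutate(parent, rate, rng) for _ in range(lam)]
        scored = [(fitness(child), child) for child in offspring]
        best_fit, best = max(scored, key=lambda pair: pair[0])
        # ">=" lets an equally fit child replace the parent (neutral drift).
        if best_fit >= parent_fit:
            parent, parent_fit = best, best_fit
    return parent, parent_fit


def toy_fitness(genome: Genome) -> float:
    """Hypothetical stand-in for a game score: highest when all genes are 0.5."""
    return -sum((g - 0.5) ** 2 for g in genome)


best, score = one_plus_lambda(toy_fitness, n_genes=5, generations=50, seed=1)
print(score)
```

For Atari playing, toy_fitness would be replaced by a function that decodes the genome into a program, plays one game with it, and returns the accumulated reward.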

CGP for Atari playing

Mathematical functions

Statistical functions

Array functions
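
One function from each family can be sketched as below, assuming every function takes two arguments that may each be a scalar or a list of floats. The specific functions (f_add, f_mean, f_first) and their behaviour on mixed inputs are illustrative assumptions, not the function set used in the experiments.

```python
from typing import Callable, List, Union

Value = Union[float, List[float]]


def elementwise(op: Callable[[float, float], float], x: Value, y: Value) -> Value:
    """Apply `op` to scalars, broadcast a scalar over a list, or zip two lists."""
    if isinstance(x, list) and isinstance(y, list):
        return [op(a, b) for a, b in zip(x, y)]  # truncates to the shorter list
    if isinstance(x, list):
        return [op(a, y) for a in x]
    if isinstance(y, list):
        return [op(x, b) for b in y]
    return op(x, y)


def f_add(x: Value, y: Value) -> Value:
    """Mathematical: halved sum, so inputs in [-1, 1] stay in [-1, 1]."""
    return elementwise(lambda a, b: (a + b) / 2.0, x, y)


def f_mean(x: Value, y: Value) -> Value:
    """Statistical: mean of a list; a scalar is returned unchanged. `y` is ignored."""
    if isinstance(x, list):
        return sum(x) / len(x) if x else 0.0
    return x


def f_first(x: Value, y: Value) -> Value:
    """Array: first element of a list; a scalar is returned unchanged. `y` is ignored."""
    if isinstance(x, list):
        return x[0] if x else 0.0
    return x


FUNCTIONS = {"mathematical": [f_add], "statistical": [f_mean], "array": [f_first]}

print(f_add(0.5, [0.0, 1.0]))
print(f_mean([0.0, 1.0], 0.3))
```

Giving every function the same two-argument signature lets any node connect to any other node, whatever type its inputs turn out to have.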

Results

Centipede


Human Double DQN Prioritized A3C:FF A3C:LSTM TPG HyperNEAT CGP
11963 3853.5 4881 3421.9 3755.8 1997 34731.7 25275.2 24708

Kung Fu Master



Human Double DQN Prioritized A3C:FF A3C:LSTM TPG HyperNEAT CGP
22736 30207 24288 31244 28819 40835 7720 57400

Boxing



Human Double DQN Prioritized A3C:FF A3C:LSTM TPG HyperNEAT CGP
4.3 73.5 77.3 68.6 59.8 37.3 16.4 38.4

Comparison

Game Human Double DQN Prioritized A3C:FF HyperNEAT CGP
Asteroids 13157 1193.2 2035.4 1654 4474.5 1694 9412
Defender 27510 33996 21093.5 56533 14620 993010
Gravitar 2672 200.5 297 218 303.5 370 2350
JamesBond 406.7 573 835.5 3511.5 541 5660 6130
Kangaroo 3035 11204 10334 10241 94 800 1400
Krull 2395 6796.1 8051.6 7406.5 5560 12601.4 9086.8
Ms. Pacman 15693 1241.3 2250.6 1824.6 653.7 3408 2568
Private Eye 69571 -575.5 292.6 179 206.9 10747.4 12702.2
Skiing -11490.4 -11928 -10852.8 -10911.1 -7983.6 -9011
Solaris 810 1768.4 2238.2 1956 160 8324
YarsRevenge 6270.6 25976.5 5965.1 7157.5 24096.4 28838.2


Two years later: Agent57 outperforms many of these scores