Flappy Bird: UCB, Bootstrapped DQN

Other project members:
Sachin Vernekar,
Jian Deng,
Hamidreza Shahidi

Original Code:
github.com/yanpanlau/Keras-FlappyBird

Code:
github.com/djian2017/flappy-bird

The original code demonstrates DQN with Keras and uses a pygame implementation of Flappy Bird.

The main idea was to see the benefits of incorporating uncertainty estimates with bootstrapped DQN. We did get faster convergence with UCB1 estimates:

Surprisingly, however, vanilla bootstrap outperformed bootstrap with UCB1 in terms of max score, although bootstrap with UCB1 still did better than plain DQN. One possible reason is that the action space is too small (two actions: jump or not). UCB1 may prove more effective with larger action spaces.
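For readers unfamiliar with UCB1, here is a minimal sketch of the textbook bandit rule: pick the action that maximizes the value estimate plus an exploration bonus that shrinks as the action is tried more often. How the project wires this into DQN is not described here, so the function name `ucb1_action`, the global per-action counts, and the scaling constant `c` are all assumptions made for illustration.

```python
import math


def ucb1_action(q_values, action_counts, total_steps, c=1.0):
    """Return the index maximizing Q(a) + c * sqrt(2 * ln(t) / N(a)).

    Hypothetical helper: q_values are the current value estimates,
    action_counts[a] is how often action a has been taken so far,
    and total_steps is the total number of actions taken.
    """
    # Take every action once first, so the bonus term is well defined.
    for action, count in enumerate(action_counts):
        if count == 0:
            return action

    scores = [
        q + c * math.sqrt(2.0 * math.log(total_steps) / n)
        for q, n in zip(q_values, action_counts)
    ]
    # Ties resolve to the lowest action index.
    return max(range(len(scores)), key=scores.__getitem__)
```

With only two actions, both get tried often and the bonus terms quickly become similar, which fits the guess above that UCB1 has little room to help in this game.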

Method                                     Max score achieved
DQN                                        67
bootstrap                                  243
bootstrap with majority voting             740
UCB1                                       147
bootstrap with UCB1                        81
bootstrap with UCB1 and majority voting    197
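The "majority voting" rows refer to combining the bootstrap heads when choosing an action. A rough sketch of one common way to do this, assuming each head outputs one Q-value per action: every head votes for its own greedy action and the most-voted action is played. The function name `majority_vote_action` and the array layout are assumptions, not taken from the project code.

```python
import numpy as np


def majority_vote_action(head_q_values):
    """Pick the action chosen by the most bootstrap heads.

    head_q_values: array-like of shape (n_heads, n_actions), assumed
    to hold each head's Q-value estimates for the current state.
    """
    head_q_values = np.asarray(head_q_values)
    # Greedy action of each head.
    votes = head_q_values.argmax(axis=1)
    # Number of votes per action; ties go to the lowest action index.
    counts = np.bincount(votes, minlength=head_q_values.shape[1])
    return int(counts.argmax())
```

Voting is not the same as averaging the heads: for Q-values `[[1, 0], [1, 0], [0, 5]]`, two of three heads prefer action 0, so voting picks action 0, while the mean Q-value would favor action 1.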

References

  1. Osband, Ian, et al. “Deep exploration via bootstrapped DQN.” Advances in neural information processing systems. 2016.