Flappy Bird: UCB, Bootstrapped DQN
The original code demonstrates DQN with Keras and uses a pygame implementation of Flappy Bird.
Objective

The goal was to measure the benefit of incorporating uncertainty estimates, obtained with bootstrapped DQN \cite{osband2016deep}, into an agent playing Flappy Bird.
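
As a rough illustration of where the uncertainty estimate comes from, the sketch below treats the output of a bootstrapped DQN as K rows of Q-values, one row per head, and uses the spread across heads as a per-action uncertainty signal. The function name `head_statistics` and the example numbers are hypothetical and not taken from this repository; the actual network here is built with Keras and may expose its heads differently.

```python
import numpy as np


def head_statistics(q_heads):
    """Summarize K bootstrap heads (hypothetical helper, not from the repo).

    q_heads: array of shape (K, n_actions), one row of Q-values per head.
    Returns the per-action mean and standard deviation across heads.
    """
    q_heads = np.asarray(q_heads, dtype=float)
    mean_q = q_heads.mean(axis=0)  # ensemble value estimate per action
    std_q = q_heads.std(axis=0)    # disagreement between heads per action
    return mean_q, std_q


# Made-up Q-values for 3 heads and the 2 Flappy Bird actions.
q_heads = np.array([
    [1.0, 0.2],
    [0.8, 0.9],
    [1.2, 0.1],
])
mean_q, std_q = head_statistics(q_heads)
```

In this example the heads disagree more about the second action than about the first, so an uncertainty-aware policy would regard the second action as less well explored.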
Results
We did get faster convergence with UCB1 estimates:

Surprisingly, however, vanilla bootstrap outperformed the UCB1 variants in terms of maximum score, although bootstrap with UCB1 still beat plain DQN. One possible reason is that the action space is very small (two actions: jump or do nothing). UCB1 may prove more effective with larger action spaces.

| Method | Max score achieved |
|---|---|
| DQN | 67 |
| bootstrap | 243 |
| bootstrap with majority voting | 740 |
| UCB1 | 147 |
| bootstrap with UCB1 | 81 |
| bootstrap with UCB1 and majority voting | 197 |
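
The two action-selection rules named in the table can be sketched as follows. This is a minimal sketch under assumptions, not the code used to produce the scores above: `majority_vote` lets each head vote for its greedy action, and `ucb1_action` applies the classic bandit form of UCB1, a value estimate plus a bonus of sqrt(2 ln t / n). How this repository counts action visits and combines UCB1 with the bootstrap heads is not stated here, so the `counts`, `t`, and `c` parameters are illustrative.

```python
import numpy as np


def majority_vote(q_heads):
    """Each bootstrap head votes for its greedy action; return the winner."""
    q_heads = np.asarray(q_heads, dtype=float)
    votes = np.argmax(q_heads, axis=1)  # greedy action of each head
    tally = np.bincount(votes, minlength=q_heads.shape[1])
    return int(np.argmax(tally))        # ties resolve to the lowest index


def ucb1_action(mean_q, counts, t, c=2.0):
    """Classic UCB1: pick argmax of mean_q + sqrt(c * ln(t) / counts).

    counts[a] is how often action a has been taken and t is the total
    number of steps so far (assumed bookkeeping, not from the repo).
    """
    mean_q = np.asarray(mean_q, dtype=float)
    counts = np.asarray(counts, dtype=float)
    untried = np.flatnonzero(counts == 0)
    if untried.size > 0:
        return int(untried[0])          # try every action once first
    bonus = np.sqrt(c * np.log(t) / counts)
    return int(np.argmax(mean_q + bonus))


q_heads = np.array([[1.0, 0.2], [0.8, 0.9], [1.2, 0.1]])
voted = majority_vote(q_heads)                             # two of three heads prefer action 0
explored = ucb1_action([1.0, 0.4], counts=[100, 1], t=101)  # rarely tried action gets a large bonus
greedy = ucb1_action([1.0, 0.4], counts=[50, 50], t=100)    # equal counts, so the higher mean wins
```

With only two actions, the exploration bonus shrinks quickly once both have been tried many times, which is consistent with the suggestion above that UCB1 may matter more in larger action spaces.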