Last active
June 14, 2017 20:29
This is a submission from the second training epoch of a Flappy Bird agent trained with Deep Q-Learning, built with Python and TensorFlow.
The first epoch was trained with the following settings:
First buffer with 1000 random iterations
training episodes: 20000
learning rate for Adam Optimizer: 1e-5
we used our sinusoidal epsilon function with parameters:
  starting epsilon: 1.0
  epsilon decay rate: 0.9998
  number of epsilon cycles: 7
replay memory size: 75000
batch size: 32
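
As a rough illustration of how the replay memory size and batch size settings interact, here is a minimal sketch of a fixed-capacity replay buffer with uniform minibatch sampling. The class name `ReplayMemory`, its method names, and the transition tuple layout are assumptions made for this example; they are not taken from the submission's code.

```python
import random
from collections import deque


class ReplayMemory:
    """Hypothetical fixed-capacity replay buffer (not the authors' code)."""

    def __init__(self, capacity=75000, seed=None):
        # A deque with maxlen silently drops the oldest transition when full.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        # Assumed transition layout: (s, a, r, s', done).
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        # Uniform sampling without replacement; never ask for more than is stored.
        k = min(batch_size, len(self.buffer))
        return self.rng.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)
```

Under these assumptions, the initial buffer would be filled by calling `add` once per step before any gradient update, and each training step would then draw a batch with `sample(32)`.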
The second epoch was trained with the following settings:
First buffer with 5000 forward passes through the network without updates
training episodes: 20000
learning rate for Adam Optimizer: 9e-7
we used our sinusoidal epsilon function with parameters:
  starting epsilon: 0.2
  epsilon decay rate: 0.9997
  number of epsilon cycles: 4
replay memory size: 75000
batch size: 32
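
The exact form of the sinusoidal epsilon function is not given here, so the following is only one plausible reading of the three parameters: an exponentially decaying envelope multiplied by a cosine wave that completes the stated number of cycles over the run. The function name `sinusoidal_epsilon`, the `total_steps` argument, and the formula itself are assumptions for illustration and may differ from the authors' schedule.

```python
import math


def sinusoidal_epsilon(step, total_steps, start_eps=1.0, decay=0.9998, cycles=7):
    """Hypothetical schedule: exponential envelope times a cosine oscillation."""
    # Envelope shrinks geometrically from start_eps (assumed: one decay per step).
    envelope = start_eps * decay ** step
    # Wave oscillates in [0, 1], completing `cycles` full periods over total_steps.
    wave = 0.5 * (1.0 + math.cos(2.0 * math.pi * cycles * step / total_steps))
    return envelope * wave
```

With this sketch, `sinusoidal_epsilon(0, 20000)` returns the starting epsilon, and exploration periodically drops toward zero and recovers to a progressively lower peak. Whether `step` counts episodes or environment steps is also an assumption left open here.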
More details about the network architecture, the sinusoidal decay function, and other settings can be found in our paper (coming soon).
Link to Poster: <http://web.stanford.edu/~chuchro3/projects/openaigym/OpenAiGymDeepLearningPoster.pdf>