@pyliaorachel
Created June 15, 2018 05:12
OpenAI Gym CartPole - Deep Q-Learning (modify reward)
...
next_state, reward, done, info = env.step(action)
# Modify the reward to speed up training
x, v, theta, omega = next_state
r1 = (env.x_threshold - abs(x)) / env.x_threshold - 0.8 # the closer the cart is to the center, the better
r2 = (env.theta_threshold_radians - abs(theta)) / env.theta_threshold_radians - 0.5 # the more upright the pole, the better
reward = r1 + r2
dqn.store_transition(state, action, reward, next_state)
...
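
The reward shaping above can be tried in isolation, without an environment or agent. The following is a minimal sketch, not part of the original gist: the function name `shaped_reward` and the default thresholds are assumptions. The defaults mirror the limits Gym's CartPole uses (cart position 2.4, pole angle 12 degrees), which the snippet reads from `env.x_threshold` and `env.theta_threshold_radians`.

```python
import math

# Assumed defaults mirroring Gym's CartPole limits; the snippet above
# reads these from the environment instead.
X_THRESHOLD = 2.4
THETA_THRESHOLD_RADIANS = 12 * 2 * math.pi / 360


def shaped_reward(next_state,
                  x_threshold=X_THRESHOLD,
                  theta_threshold=THETA_THRESHOLD_RADIANS):
    """Hypothetical helper: same formula as the snippet, as a pure function."""
    x, _v, theta, _omega = next_state
    # Position term: 0.2 at the center, -0.8 at the track limit.
    r1 = (x_threshold - abs(x)) / x_threshold - 0.8
    # Angle term: 0.5 when upright, -0.5 at the angle limit.
    r2 = (theta_threshold - abs(theta)) / theta_threshold - 0.5
    return r1 + r2


if __name__ == "__main__":
    # Centered and upright: the best case, roughly 0.7.
    print(shaped_reward((0.0, 0.0, 0.0, 0.0)))
    # At both limits: the worst case before termination, roughly -1.3.
    print(shaped_reward((X_THRESHOLD, 0.0, THETA_THRESHOLD_RADIANS, 0.0)))
```

With these offsets the reward is positive only while the cart stays within about 20% of the track half-width from the center and the pole stays reasonably upright, so the agent gets a graded signal instead of a constant reward per step.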