@pyliaorachel
Created June 15, 2018 05:07
OpenAI Gym CartPole - Deep Q-Learning (dqn choose action)
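import numpy as np
import torch

def choose_action(self, state):
    # Add a batch dimension so the state can be fed to the network
    x = torch.unsqueeze(torch.FloatTensor(state), 0)

    # epsilon-greedy
    if np.random.uniform() < self.epsilon:  # explore: pick a random action
        action = np.random.randint(0, self.n_actions)
    else:  # exploit: make the best choice under the current policy
        actions_value = self.eval_net(x)  # score each action with the current eval net
        action = torch.max(actions_value, 1)[1].data.numpy()[0]  # pick the highest-scoring action
    return action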
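
The selection rule in choose_action can be checked without a network. The following is a minimal sketch using only the Python standard library. The names epsilon_greedy, q_values, and rng are hypothetical and are not part of the gist. As in the snippet above, epsilon is taken to be the probability of picking a random action.

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action index with probability epsilon, else the argmax.

    Hypothetical helper for illustration; q_values stands in for the
    per-action scores that eval_net would produce for one state.
    """
    if rng.random() < epsilon:  # explore
        return rng.randrange(len(q_values))
    # exploit: index of the highest score
    return max(range(len(q_values)), key=lambda a: q_values[a])

# With epsilon = 0 the rule never explores, so it returns the argmax.
greedy_action = epsilon_greedy([0.1, 0.7], epsilon=0.0)

# With epsilon = 1 the rule always explores, so both actions get sampled.
rng = random.Random(0)
counts = [0, 0]
for _ in range(1000):
    counts[epsilon_greedy([0.1, 0.7], epsilon=1.0, rng=rng)] += 1
```

Values of epsilon between these two extremes mix the behaviors: a fraction epsilon of the calls pick uniformly at random and the rest follow the current scores.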