Articles for the month starting 2020-01-01
Previous post ↓ ryosuke-okubo.hatenablog.com 96 PPO(2017) Original paper: Proximal Policy Optimization Algorithms Abstract: We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through intera…
Previous post ↓ ryosuke-okubo.hatenablog.com 91 Ape-X(2018) Original paper: Distributed Prioritized Experience Replay Abstract: We propose a distributed architecture for deep reinforcement learning at scale, that enables agents to learn effectively from o…
Previous post ↓ ryosuke-okubo.hatenablog.com Nos. 86 to 100 cover reinforcement learning. Reference: https://qiita.com/shionhonda/items/ec05aade07b5bea78081 86 Deep Q-Network(2013) Original paper: Playing Atari with Deep Reinforcement Learning Abstract: We present the firs…