Deep reinforcement learning hands-on : apply modern RL methods, with deep Q-networks, value iteration, policy gradients, TRPO, AlphaGo Zero and more / Maxim Lapan.