This document presents formulas for computing gradients and parameter updates in reinforcement learning: the gradient of a value function with respect to its parameters, the gradient of a policy based on the reward and value, and the gradient of a parameter vector that is a weighted combination of its pre
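
The summary above does not reproduce the formulas themselves, so the following is only a minimal sketch of what the first two could look like under stated assumptions. It assumes a linear value function V(s) = w · φ(s) and a softmax policy over action preferences, with the value used as a baseline for the reward. The function names (`value_gradient`, `policy_gradient`) and these modelling choices are hypothetical and are not taken from the source.

```python
import math


def value(w, phi):
    """Linear value estimate V(s) = w . phi(s) (assumed form, not from the source)."""
    return sum(wi * xi for wi, xi in zip(w, phi))


def value_gradient(w, phi):
    """Gradient of V(s) = w . phi(s) with respect to w.

    For a linear value function this is simply the feature vector phi(s).
    """
    return list(phi)


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def policy_gradient(prefs, action, reward, baseline):
    """Gradient estimate (reward - baseline) * d/d(prefs) log pi(action).

    For a softmax policy, d/d(prefs[i]) log pi(action) is
    (1 if i == action else 0) - pi(i). Using the value as the
    baseline is an assumption made for this sketch.
    """
    probs = softmax(prefs)
    advantage = reward - baseline
    return [
        advantage * ((1.0 if i == action else 0.0) - p)
        for i, p in enumerate(probs)
    ]


if __name__ == "__main__":
    w = [0.5, -1.0]
    phi = [1.0, 2.0]
    print(value(w, phi))
    print(value_gradient(w, phi))
    print(policy_gradient([0.0, 0.0], action=0, reward=1.0, baseline=0.0))
```

With two equally preferred actions, a reward above the baseline pushes the preference of the chosen action up and the other down by the same amount, so the gradient components sum to zero.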