Tips for Training Recurrent Neural Networks

Some practical tricks for training recurrent neural networks:

Optimization Setup

Adaptive learning rate. We usually use adaptive optimizers such as Adam (Kingma14) because they handle the complex training dynamics of recurrent networks better than plain gradient descent.

Gradient clipping. Print or plot the gradient norm to see its usual range, then