Notes on AdaGrad

Chris Dyer
School of Computer Science
Carnegie Mellon University
5000 Forbes Ave., Pittsburgh, PA, 15213
cdyer@cs.cmu.edu

Abstract

These are some notes on the adaptive (sub)gradient methods proposed by Duchi et al. (2011), a family of easy-to-implement techniques for online parameter learning with strong theoretical guarantees and widely attested empirical success. These notes ar