Optimization for Deep Learning Highlights in 2017

Many gradient descent optimization algorithms have been proposed in recent years, but Adam is still the most commonly used. This post discusses the most exciting highlights and most promising recent approaches that may shape the way we optimize our models in the future.
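For reference, the Adam update the post takes as its baseline can be sketched in a few lines of NumPy. This is a minimal illustration on a toy quadratic, not the post's own code; the function name `adam_step` and the hyperparameter defaults (lr=0.1 here, chosen for the toy problem) are assumptions for this example.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    # Exponential moving averages of the gradient and its square
    m = beta1 * m + (1 - beta1) * grad           # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2      # second-moment estimate
    # Bias correction for the zero-initialized moments
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Parameter update scaled by the adaptive per-parameter step size
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy example: minimize f(theta) = theta^2, whose gradient is 2*theta
theta, m, v = 5.0, 0.0, 0.0
for t in range(1, 201):
    grad = 2 * theta
    theta, m, v = adam_step(theta, grad, m, v, t)
print(theta)  # close to 0 after 200 steps
```

The bias-correction terms matter early in training, when the zero-initialized moment estimates would otherwise be biased toward zero; later sections of the post discuss refinements to exactly this style of adaptive update.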