Ryobot's Bookmarks / April 24, 2020 - Hatena Bookmark

Ryobot id:Ryobot

Bookmarks for April 24, 2020 (1 item)

Scaling Laws for Neural Language Models
We study empirical scaling laws for language model performance on the cross-entropy loss. The loss scales as a power-law with model size, dataset size, and the amount of compute used for training, with some trends spanning more than seven orders of magnitude. Other architectural details such as network width or depth have minimal effects within a wide range. Simple equations govern the dependence
Ryobot 2020/04/24
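Excellent research. The number of parameters is what matters; width and depth do not. I was impressed by the plots showing that the loss follows a power law in each variable. “The loss scales as a power-law with model size, dataset size, and the amount of compute used for training”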
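
As a rough illustration of why such plots are striking: a power law of the form L(N) = (N_c / N)^alpha becomes a straight line on log-log axes, so the exponent can be read off as the negative slope. The sketch below is only an illustration of that idea, not the paper's fitting procedure. The functional form is one common way to write a power law, and the constants `ALPHA` and `N_C`, the synthetic model sizes, and the helper names `power_law_loss` and `fit_power_law` are all hypothetical.

```python
import math

# Hypothetical constants chosen for illustration only (not values from the paper).
ALPHA = 0.1
N_C = 1e13


def power_law_loss(n, n_c, alpha):
    """Assumed power-law form: L(N) = (N_c / N) ** alpha."""
    return (n_c / n) ** alpha


def fit_power_law(xs, losses):
    """Ordinary least-squares line fit in log-log space.

    log L = alpha * log N_c - alpha * log N, so the slope is -alpha
    and the intercept is alpha * log N_c.
    Returns the estimated (alpha, n_c).
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in losses]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    alpha = -slope
    n_c = math.exp(intercept / alpha)
    return alpha, n_c


# Synthetic, noise-free "model sizes" spanning several orders of magnitude.
sizes = [10 ** k for k in range(5, 10)]
losses = [power_law_loss(n, N_C, ALPHA) for n in sizes]

alpha_hat, n_c_hat = fit_power_law(sizes, losses)
print(alpha_hat, n_c_hat)
```

With noise-free synthetic data the fit recovers the constants it was generated from. Real measurements would scatter around the line, and how closely they follow it over many orders of magnitude is what the plots show.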

GPT-2

Scaling Law

Transformer
