Ryobot's Bookmarks / January 15, 2021 - Hatena Bookmark

Ryobot id:Ryobot

Bookmarks from January 15, 2021 (1 item)

Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
In deep learning, models typically reuse the same parameters for all inputs. Mixture of Experts (MoE) defies this and instead selects different parameters for each incoming example. The result is a sparsely-activated model -- with outrageous numbers of parameters -- but a constant computational cost. However, despite several notable successes of MoE, widespread adoption has been hindered by comple
Ryobot 2021/01/15
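Looking at Table 1 and Table 10, Switch-Base/XXL needs about half the compute of T5-Large/XXL to reach the same performance, but the payoff still seems too thin to outweigh the added complexity and memory requirements. It is also disappointing that there is no gain at 1.5T parameters.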

Mixtures of Experts

Transformer
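
As a rough illustration of the abstract's point that an MoE layer "selects different parameters for each incoming example" at a roughly constant per-example cost, here is a minimal pure-Python sketch of top-1 routing. It is not the paper's implementation: the class name `Top1MoE`, the single-matrix experts, the random initialization, and the choice to scale the output by the router probability are all assumptions made for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def make_matrix(rows, cols, rng):
    """Small random matrix; the initialization scale is arbitrary."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


class Top1MoE:
    """Hypothetical toy layer: a router sends each token to one expert."""

    def __init__(self, d_model, num_experts, seed=0):
        rng = random.Random(seed)
        # Router: one score per expert for a given token.
        self.router = make_matrix(num_experts, d_model, rng)
        # Each expert is a single d_model x d_model matrix (an assumption;
        # real experts are usually feed-forward blocks).
        self.experts = [
            make_matrix(d_model, d_model, rng) for _ in range(num_experts)
        ]

    def forward(self, token):
        probs = softmax(matvec(self.router, token))
        # Top-1 routing: only the highest-scoring expert runs.
        idx = max(range(len(probs)), key=probs.__getitem__)
        out = matvec(self.experts[idx], token)
        # Scaling by the router probability is assumed here.
        return idx, [probs[idx] * y for y in out]


layer = Top1MoE(d_model=4, num_experts=8)
idx, out = layer.forward([1.0, -0.5, 0.25, 2.0])
print("routed to expert", idx, "output length", len(out))
```

Adding experts grows the parameter count (`len(layer.experts)` matrices), but each token still passes through a single expert matrix, so only the small router cost grows with the number of experts.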
