dann's bookmarks / June 14, 2024 - Hatena Bookmark

dann id:dann

Bookmarks for June 14, 2024 (4 items)

Physics of Language Models: Part 3.1, Knowledge Storage and Extraction
dann 2024/06/14
llm
DeepSpeed Meetup in Japan on May 23, 2024
dann 2024/06/14
deepspeed

deeplearning
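SSII2024 [SS1] Diffusion Models Today: Research Trends in 2024
Estimating the congestion-relief effect of improved bus service levels and fare policies in the Kumamoto metropolitan area: rapid scenario analysis based on sensitivity and aggregate QV, toward public investment in public transport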
dann 2024/06/14
ai

genai
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations
Scale has become a main ingredient in obtaining strong machine learning models. As a result, understanding a model's scaling properties is key to effectively designing both the right training setup as well as future generations of architectures. In this work, we argue that scale and training research has been needlessly complex due to reliance on the cosine schedule, which prevents training across
dann 2024/06/14
llm
