Large language models require huge amounts of GPU memory. Is it possible to run inference on a single GPU? If so, what is the minimum GPU memory required? A 70B large language model has a parameter size of 130GB. Just loading the model into the GPU requires 2 A100 GPUs with 100GB of memory each. During inference, the entire input sequence also needs to be loaded into memory for the complex “attention” calculations.
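As a rough back-of-the-envelope check on that 130GB figure, here is a minimal Python sketch. It assumes the weights are stored in FP16 (2 bytes per parameter) and ignores everything else a real inference run needs (KV cache, activations, framework overhead); the function name is just illustrative.

```python
# Estimate the GPU memory needed just to hold a model's weights.
# Assumption: FP16 weights, i.e. 2 bytes per parameter.

def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Return the memory required to hold the weights, in GB."""
    return num_params * bytes_per_param / 1024**3

params_70b = 70e9  # 70 billion parameters

print(f"FP16 weights: {weight_memory_gb(params_70b):.0f} GB")     # ~130 GB
print(f"FP32 weights: {weight_memory_gb(params_70b, 4):.0f} GB")  # ~261 GB
```

At FP16 this works out to about 130GB for the weights alone, which is why even two 100GB A100s are only just enough to load the model before any inference begins.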