Quantization is a technique to reduce the computational and memory costs of evaluating deep learning models by representing their weights and activations with low-precision data types, such as 8-bit integer (int8), instead of the usual 32-bit floating point (float32). Reducing the number of bits means the resulting model requires less memory storage, which is crucial for deploying Large Language Models.
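To make the idea concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in plain NumPy. This is an illustration of the general technique, not Quanto's actual implementation: the function names and the choice of a single per-tensor scale are assumptions for the example.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor quantization: map float32 values to int8."""
    scale = np.abs(x).max() / 127.0  # largest magnitude maps to 127
    q = np.clip(np.round(x / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float32 values."""
    return q.astype(np.float32) * scale

x = np.random.randn(16).astype(np.float32)
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)
# q stores the same information in 1 byte per value instead of 4,
# at the cost of a rounding error of at most scale / 2 per element.
```

Storing `q` and a single float `scale` takes roughly a quarter of the memory of `x`, which is exactly the saving that matters when serving large models.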
![Quanto: a pytorch quantization toolkit](https://huggingface.co/blog/assets/169_quanto_intro/thumbnail.png)