Attention, as a core layer of the ubiquitous Transformer architecture, is a bottleneck for large language models and long-context applications. FlashAttention (and FlashAttention-2) pioneered an approach to speed up attention on GPUs by minimizing memory reads/writes, and is now used by most libraries to accelerate Transformer training and inference. This has contributed to a massive increase in LLM context length over the last two years.
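As a concrete point of reference (not from the original post), the sketch below shows how most users already reach a FlashAttention kernel today: through PyTorch's `scaled_dot_product_attention`, optionally pinning dispatch to the FlashAttention backend. The tensor shapes and the explicit `sdpa_kernel` selection are illustrative assumptions, not a prescribed recipe.

```python
# Minimal sketch: calling attention through PyTorch so it can dispatch to a
# FlashAttention kernel. Shapes here are illustrative; fp16/bf16 on GPU is
# required for the FlashAttention backend.
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

# (batch, heads, seq_len, head_dim) -- assumed example sizes
q, k, v = (torch.randn(2, 8, 4096, 64, device="cuda", dtype=torch.float16)
           for _ in range(3))

# Restrict dispatch to the FlashAttention kernel for this call.
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)

print(out.shape)  # torch.Size([2, 8, 4096, 64])
```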
FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision