[B! performance] dann's bookmarks

dann id:dann

dann's bookmarks about performance (1,136)

2.5. Recommended practices for etcd | Red Hat Product Documentation
dann 2024/10/10
etcd

performance
Link
GitHub - wolfpld/tracy: Frame profiler
dann 2024/09/30
performance
Link
Pytorch Conference
dann 2024/09/19
pytorch

llm

performance
Link
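A Netflix analysis case study on improving instance load - FPGA開発日記 (FPGA Development Diary)
An interesting microarchitecture case I found while browsing LinkedIn articles. It describes which tools were used, and how, to solve a problem that required understanding the CPU well below the surface of its microarchitecture. netflixtechblog.com To optimize its workloads, Netflix migrated AWS instance sizes (from 16 vCPUs to 48 vCPUs), aiming to improve the performance of CPU-bound workloads. The team assumed the migration would scale performance almost linearly and expected throughput to roughly triple. In practice, the migration did not deliver the expected performance. https://netflixtechblog.com/seeing-through-hardware-counters-a-journey-to-threefold-pe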
dann 2024/08/22
netflix

performance
Link
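The scaling expectation in the Netflix entry above (16 vCPUs to 48 vCPUs, so roughly 3x throughput) can be written down as a quick calculation. This is a minimal sketch; the function names and the measured speedup of 1.5x are hypothetical and are not taken from the Netflix post.

```python
def expected_speedup(old_vcpus: int, new_vcpus: int) -> float:
    """Ideal speedup if a CPU-bound workload scaled linearly with vCPU count."""
    return new_vcpus / old_vcpus


def scaling_efficiency(measured_speedup: float, old_vcpus: int, new_vcpus: int) -> float:
    """Fraction of the ideal linear speedup that was actually achieved."""
    return measured_speedup / expected_speedup(old_vcpus, new_vcpus)


ideal = expected_speedup(16, 48)  # the 16 -> 48 vCPU migration from the entry

# Hypothetical measurement, for illustration only.
measured = 1.5
efficiency = scaling_efficiency(measured, 16, 48)
print(f"ideal speedup: {ideal:.1f}x, efficiency: {efficiency:.0%}")
```

An efficiency well below 100% on a CPU-bound workload is the kind of signal that motivates a closer look with hardware counters.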
CUTLASS Tutorial: Mastering the NVIDIA® Tensor Memory Accelerator (TMA)
dann 2024/06/25
gpu

hopper

performance
Link
NVIDIA Performance Libraries (NVPL)
dann 2024/05/09
nvpl

nvidia

performance
Link
JVM Profiling in Action
dann 2024/04/21
java

performance
Link
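"Taurus", a tool that makes load testing easy, is amazing
modules:
  jmeter:
    version: 5.4.1  # the version written here is downloaded automatically
    properties:
      log_level.JMeter: WARN
      log_level.JMeter.threads: WARN
    system-properties:
      org.apache.commons.logging.simplelog.log.org.apache.http: WARN
Works as a wrapper around existing tools. By default, JMeter is run internally, but scripts created with tools such as the following can be reused: JMeter, Gatling, Locust, Selenium, Vegeta. In other words, although I said earlier that scenarios can be written in YAML, you can of course reuse existing scripts as well. The scripts you have built up so far and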
dann 2024/04/03
taurus

performance
Link
Optimize TensorFlow performance using the Profiler | TensorFlow Core
This guide explains how to use the tools provided with the TensorFlow Profiler to track the performance of your TensorFlow models. You will also see how your model performs on the host (CPU), the device (GPU), or a combination of both. Profiling helps you understand the hardware resource consumption (time and memory) of the various TensorFlow operations (ops) in your model, resolve performance bottlenecks, and ultimately make the model run faster. This guide covers how to install the Profiler, the various tools available, the Profiler's different modes for collecting performance data, and
dann 2024/02/01
tensorflow

profiling

performance
Link
A guide to LLM inference and performance | Baseten Blog
We want to use the full power of our GPU during LLM inference. To do that, we need to know if our inference is compute bound or memory bound so that we can make optimizations in the right area. Calculating the operations per byte possible on a given GPU and comparing it to the arithmetic intensity of our model’s attention layers reveals where the bottleneck is: compute or memory. We can use this i
dann 2024/01/31
llm

performance
Link
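The compute-bound versus memory-bound check described in the Baseten entry above can be sketched as a comparison of two ratios. The hardware figures and the arithmetic intensity below are hypothetical placeholders, not numbers from the Baseten post; substitute the values for your own GPU and model.

```python
def ops_per_byte(peak_flops: float, mem_bandwidth_bytes_per_s: float) -> float:
    """Operations the GPU can perform per byte it moves from memory."""
    return peak_flops / mem_bandwidth_bytes_per_s


def bottleneck(arithmetic_intensity: float, hw_ops_per_byte: float) -> str:
    """Classify a workload by comparing its arithmetic intensity to the hardware ratio."""
    if arithmetic_intensity < hw_ops_per_byte:
        return "memory bound"
    return "compute bound"


# Hypothetical GPU: 100 TFLOPS of compute, 500 GB/s of memory bandwidth.
hw_ratio = ops_per_byte(100e12, 500e9)

# Hypothetical arithmetic intensity (ops per byte) of a model's attention layers.
attention_intensity = 60.0

print(f"hardware ops:byte = {hw_ratio:.0f}")
print(f"attention is {bottleneck(attention_intensity, hw_ratio)}")
```

When the model's arithmetic intensity falls below the hardware ratio, the GPU waits on memory, so optimizations that reduce bytes moved are more likely to help than ones that reduce arithmetic.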
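Speeding up LLMs with PyTorch
This is the day-11 article of the Advent calendar 「ほぼ横浜の民」 ("Almost Yokohama Residents"). This year I am writing about implementations for speeding up LLMs. I am not an LLM expert, but I have been interested in the topic for a while, so I studied it a little. What you will learn from this article: how LLMs generate text; how torch.compile speeds up LLMs; what Speculative Decoding is. Background: a while ago I came across an excellent blog post, "Accelerating Generative AI with PyTorch II: GPT, Fast". Published by the PyTorch team, it showed that LLM inference can be made 10x faster using plain PyTorch alone. I wanted to know how a 10x speedup is achieved, so I am writing this article partly as a personal study exercise.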
dann 2024/01/31
pytorch

performance
Link
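The entry above lists Speculative Decoding as one of its topics. As a rough illustration of the idea in its greedy form, the toy sketch below lets a cheap draft model propose several tokens and has the target model keep the longest prefix it agrees with. The two "models" are hypothetical stand-in functions, not PyTorch code and not the implementation from the article or from gpt-fast. A real implementation verifies all proposed tokens in one batched forward pass of the target model; here the target is simply called once per position.

```python
from typing import Callable, List

# A "model" here is a hypothetical function: token prefix -> next token (greedy).
Model = Callable[[List[int]], int]


def greedy_decode(model: Model, prompt: List[int], n_new: int) -> List[int]:
    """Plain greedy decoding: one model call per generated token."""
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(model(tokens))
    return tokens


def speculative_decode(target: Model, draft: Model, prompt: List[int],
                       n_new: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding: the draft proposes, the target verifies."""
    tokens = list(prompt)
    goal = len(prompt) + n_new
    while len(tokens) < goal:
        # 1. The draft model proposes up to k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(tokens))):
            proposal.append(draft(tokens + proposal))
        # 2. The target checks each proposed token in order.
        for i, guess in enumerate(proposal):
            correct = target(tokens + proposal[:i])
            if guess != correct:
                # Keep the agreed prefix, then take the target's own token.
                tokens.extend(proposal[:i])
                tokens.append(correct)
                break
        else:
            tokens.extend(proposal)  # every proposed token was accepted
    return tokens


# Toy stand-ins: the draft agrees with the target except after token 5.
def target_model(tokens: List[int]) -> int:
    return (tokens[-1] * 3 + 1) % 7


def draft_model(tokens: List[int]) -> int:
    return 0 if tokens[-1] == 5 else (tokens[-1] * 3 + 1) % 7


fast = speculative_decode(target_model, draft_model, [1], n_new=10)
slow = greedy_decode(target_model, [1], n_new=10)
print(fast == slow)
```

Because every accepted token is one the target model would have chosen itself, the output matches plain greedy decoding of the target; the potential gain comes from verifying several draft tokens per target pass.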
Accelerating Generative AI with PyTorch: Segment Anything 2 - Fast and furious inference with low latency and fast cold starts
dann 2024/01/24
llm

inference

performance
Link
Accelerating Generative AI Part III: Diffusion, Fast
by Sayak Paul and Patrick von Platen (Hugging Face 🤗) This post is the third part of a multi-series blog focused on how to accelerate generative AI models with pure, native PyTorch. We are excited to share a breadth of newly released PyTorch performance features alongside practical examples to see how far we can push PyTorch native performance. In part one, we showed how to accelerate Segment Any
dann 2024/01/05
performance

inference

pytorch
Link
GitHub - pytorch-labs/segment-anything-fast: A batched offline inference oriented version of segment-anything
dann 2023/12/03
pytorch

performance
Link
Fireworks - Fastest Inference for Generative AI
Faster, more efficient DeepSeek on the Fireworks AI Developer Cloud. Discover how Fireworks AI Developer Cloud accelerates AI innovation with faster, optimized DeepSeek R1 deployments. Learn about new GPU options, improved speed, and enhanced developer tools for efficient, scalable AI solutions.
dann 2023/12/02
performance

gpu

cudagraph
Link
Accelerating Generative AI with PyTorch: Segment Anything, Fast
by Team PyTorch This post is the first part of a multi-series blog focused on how to accelerate generative AI models with pure, native PyTorch. We are excited to share a breadth of newly released PyTorch performance features alongside practical examples of how these features can be combined to see how far we can push PyTorch native performance. As announced during the PyTorch Developer Conference
dann 2023/12/02
pytorch

performance
Link
Perfetto - System profiling, app tracing and trace analysis
Linux kernel tracing: Capture high frequency ftrace data: scheduling activity, task switching latency, CPU frequency and much more. Userspace profilers and extra probes: Native heap profiling, Java heap profiling, pollers for /proc stat files
dann 2023/12/02
performance
Link
gpt-fast/eval.py at main · pytorch-labs/gpt-fast
dann 2023/12/02
pytorch

llm

performance
Link
Understanding Performance — Dask documentation
dann 2023/10/15
dask

performance
Link
GitHub - gradle/gradle-profiler: A tool for gathering profiling and benchmarking information for Gradle builds
dann 2023/09/17
gradle

java

performance
Link