saitodevel01's Bookmarks - Hatena Bookmark

saitodevel01 id:saitodevel01

saitodevel01's Bookmarks (1,814)

VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models
Scaling model size significantly challenges the deployment and inference of Large Language Models (LLMs). Due to the redundancy in LLM weights, recent research has focused on pushing weight-only quantization to extremely low-bit (even down to 2 bits). It reduces memory requirements, optimizes storage costs, and decreases memory bandwidth needs during inference. However, due to numerical representa
saitodevel01 2024/10/06
LLM
GitHub - rasbt/LLMs-from-scratch: Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step
saitodevel01 2024/10/06
LLM
Full-Order Sampling-Based MPC for Torque-Level Locomotion Control via Diffusion-Style Annealing
- 1 user
- arxiv.org
- Learning
saitodevel01 2024/09/27
Robotics
Adjoint Matching: Fine-tuning Flow and Diffusion Generative Models with Memoryless Stochastic Optimal Control
saitodevel01 2024/09/20
Diffusion models
FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision
saitodevel01 2024/09/14
LLM
vAttention: Dynamic Memory Management for Serving LLMs without PagedAttention
saitodevel01 2024/09/14
LLM
Efficient Memory Management for Large Language Model Serving with PagedAttention
saitodevel01 2024/09/14
LLM
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
- 1 user
- arxiv.org
- Learning
saitodevel01 2024/09/08
structured matrices
Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers
- 1 user
- arxiv.org
- Learning
saitodevel01 2024/09/08
structured matrices
Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture
- 1 user
- arxiv.org
- Learning
saitodevel01 2024/09/08
structured matrices
Monarch: Expressive Structured Matrices for Efficient and Accurate Training
saitodevel01 2024/09/08
structured matrices
MoRe Fine-Tuning with 10x Fewer Parameters
- 2 users
- arxiv.org
- Learning
Parameter-efficient fine-tuning (PEFT) techniques have unlocked the potential to cheaply and easily specialize large pretrained models. However, the most prominent approaches, like low-rank adapters (LoRA), depend on heuristics or rules-of-thumb for their architectural choices -- potentially limiting their performance for new models and architectures. This limitation suggests that techniques from
saitodevel01 2024/09/08
structured matrices
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B
This paper introduces the MCT Self-Refine (MCTSr) algorithm, an innovative integration of Large Language Models (LLMs) with Monte Carlo Tree Search (MCTS), designed to enhance performance in complex mathematical reasoning tasks. Addressing the challenges of accuracy and reliability in LLMs, particularly in strategic and mathematical reasoning, MCTSr leverages systematic exploration and heuristic s
saitodevel01 2024/07/14
LLM
Simplified and Generalized Masked Diffusion for Discrete Data
- 1 user
- arxiv.org
- Learning
saitodevel01 2024/06/19
Diffusion models
HOT3D Dataset
saitodevel01 2024/06/19
Robotics
Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution
- 1 user
- arxiv.org
- Learning
saitodevel01 2024/06/13
Diffusion models
Robust Training of Vector Quantized Bottleneck Models
- 1 user
- arxiv.org
- Learning
saitodevel01 2024/06/10
VAE
Diffusion bridges vector quantized Variational AutoEncoders
- 1 user
- arxiv.org
- Learning
saitodevel01 2024/06/10
VAE
A Continuous Time Framework for Discrete Denoising Models
- 1 user
- arxiv.org
- Learning
saitodevel01 2024/06/09
Diffusion models
Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design
- 2 users
- arxiv.org
- Learning
Combining discrete and continuous data is an important capability for generative models. We present Discrete Flow Models (DFMs), a new flow-based model of discrete data that provides the missing link in enabling flow-based generative models to be applied to multimodal continuous and discrete data problems. Our key insight is that the discrete equivalent of continuous space flow matching can be rea
saitodevel01 2024/06/09
Diffusion models