[B! moe] dann's Bookmarks

A Survey on Mixture of Experts

dann 2024/07/10

llm
moe

Perfecting Merge-kit MoE's

Perfecting Merge-kit MoE’s: We don't need Mixtral-instructs “secret sauce” A write up by: Rombodawg Introduction: (Note: Please stop suggesting edits, Comments enabled for AI merging using MergKit, MoE creation, and paper feedback.) First of all I’d like to point out the elephant in the room… ...

dann 2024/04/25

Scaling Laws for Fine-Grained Mixture of Experts

dann 2024/02/13

moe
llm

A Technique for Bundling Pretrained LLMs into a Mixture of Experts

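Introduction: I came across a post on Twitter saying, "We built a Mixture of Experts (MoE) out of several Phi-2-based models and achieved better performance than any single model." Techniques for merging pretrained LLMs have come up from time to time recently, but an MoE should need additional parameters for its gating part, and I wondered how that was being handled. When I looked inside, I found that a method for setting the gating parameters in a few-shot manner was used, which I found interesting, so I will write about it here. Inference-time processing in a Sparse Mixture of Experts (Sparse MoE): As its name and word art suggest, Phixtral follows Mixtral's Sparse MoE, so I will first describe how that works at inference time.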
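
The inference-time routing mentioned in this excerpt can be illustrated with a minimal sketch. It is not taken from the bookmarked article or from any Mixtral or Phixtral code; it assumes a Mixtral-style router that scores every expert with a linear gate, keeps the `top_k` highest-scoring experts, and mixes their outputs with a softmax over the selected logits. The names `sparse_moe_forward`, `gate_weights`, and `experts` are hypothetical, and the toy experts simply scale their input.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_moe_forward(x, gate_weights, experts, top_k=2):
    """Route one token vector x through the top_k highest-scoring experts.

    gate_weights: one weight vector per expert (hypothetical linear gate).
    experts: list of callables mapping a vector to a vector of the same size.
    """
    # Router logits: one dot product per expert.
    logits = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in gate_weights]
    # Keep only the top_k experts; the others are never evaluated.
    chosen = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Softmax over the selected logits only, so the mixing weights sum to 1.
    weights = softmax([logits[i] for i in chosen])
    out = [0.0] * len(x)
    for w, i in zip(weights, chosen):
        y = experts[i](x)
        out = [o + w * y_j for o, y_j in zip(out, y)]
    return out, chosen


# Toy usage: four experts that scale the input by 1, 2, 3 and 4 (assumed values).
toy_experts = [lambda v, s=s: [s * t for t in v] for s in (1.0, 2.0, 3.0, 4.0)]
toy_gate = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
output, selected = sparse_moe_forward([2.0, 1.0], toy_gate, toy_experts)
```

Because only the selected experts are called, the per-token cost depends on `top_k` rather than on the total number of experts.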

dann 2024/02/04

llm
moe

Techniques for Creating an MoE (Mixture of Experts) with Mergekit | はち

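Introduction: My earlier attempt at building a Japanese MoE with Mergekit (see the article below) went reasonably well, but I felt I had been largely feeling my way through it. I then learned that volunteers had compiled a set of best practices, so I am summarizing them here for my own reference. If you are short on time, reading only "4. Summary" should give you the gist. 1. Overview: The chapters are organized as follows. Overview What makes a perfect MoE: The secret formula Using the same exact model together 4x or 8x or (etc) times is pointless Why is a proper merge considered a base model, and how do we distinguish them from a FrankenMoE? Wh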

dann 2024/02/04

moe
llm

Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy

dann 2024/01/17

llm
moe

Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer

The capacity of a neural network to absorb information is limited by its number of parameters. Conditional computation, where parts of the network are active on a per-example basis, has been proposed in theory as a way of dramatically increasing model capacity without a proportional increase in computation. In practice, however, there are significant algorithmic and performance challenges. In this
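
The claim that conditional computation raises capacity without a proportional increase in computation can be made concrete with a small parameter count. The sketch below is only an illustration and is not from the paper: it assumes each expert is a two-matrix feed-forward block with biases ignored, and the function name `moe_ffn_params` and all layer sizes are hypothetical.

```python
def moe_ffn_params(d_model, d_ff, num_experts, top_k):
    """Return (total, active-per-token) parameter counts for one MoE layer.

    Assumes each expert is an up-projection plus a down-projection
    (biases ignored) and the router is a single linear gate.
    """
    per_expert = 2 * d_model * d_ff   # up- and down-projection matrices
    router = d_model * num_experts    # one gate logit per expert
    total = num_experts * per_expert + router
    active = top_k * per_expert + router
    return total, active


# Hypothetical sizes: 8 experts, 2 of them active for each token.
total, active = moe_ffn_params(d_model=1024, d_ff=4096, num_experts=8, top_k=2)
ratio = total / active
```

Under these assumed sizes the layer stores roughly four times as many parameters as it uses for any single token, which is the gap between capacity and computation that the abstract describes.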

dann 2024/01/15

llm
moe

dann's bookmarks on moe (7)