","eos_token":"<|end_of_text|>"}},"createdAt":"2024-05-01T07:53:45.000Z","discussionsDisabled":false,"downloads":1359,"downloadsAllTime":1359,"id":"rinna/llama-3-youko-8b","isLikedByUser":false,"isWatchedByUser":false,"inference":"ExplicitOptOut","lastModified":"2024-05-07T01:59:47.000Z","likes":28,"pipeline_tag":"text-generation","library_name":"transformers","librariesOther":[],"model-index":nul
This dataset was created by automatically translating "databricks-dolly-15k" into Japanese. It is licensed under CC-BY-SA-3.0. Last update: 2023-05-11.

databricks-dolly-15k-ja: https://github.com/kunishou/databricks-dolly-15k-ja
databricks-dolly-15k: https://github.com/databrickslabs/dolly/tree/master/data
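A minimal loading sketch with the 🤗 Datasets library, assuming the dataset is mirrored on the Hugging Face Hub under the id kunishou/databricks-dolly-15k-ja (the GitHub repository above is the canonical source; the Hub id is an assumption):

```python
# Sketch: load the Japanese Dolly dataset with 🤗 Datasets.
# The Hub id below is an assumption; adjust it if you use the GitHub copy.
from datasets import load_dataset

ds = load_dataset("kunishou/databricks-dolly-15k-ja", split="train")

# Inspect the schema and one record; field names follow the Dolly format.
print(ds.column_names)
print(ds[0])
```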
","chat_template":"{% set loop_messages = messages %}{% for message in loop_messages %}{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' %}{% if loop.index0 == 0 %}{% set content = bos_token + content %}{% endif %}{{ content }}{% endfor %}{% if add_generation_prompt %}{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }
Japanese Stable LM Base Gamma 7B

Model Description: This is a 7B-parameter decoder-only language model with a focus on maximizing Japanese language modeling performance and Japanese downstream task performance. We conducted continued pretraining using Japanese data on the English language model Mistral-7B-v0.1 to transfer the model's knowledge and capabilities to Japanese. If you are looking for an instruction-following model, check Japanese Stable LM Instruct Gamma 7B.
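A minimal loading sketch, assuming the model is published on the Hugging Face Hub under stabilityai/japanese-stablelm-base-gamma-7b (the id is an assumption; any Mistral-compatible causal LM loads the same way):

```python
# Sketch: load the model for text generation with 🤗 Transformers.
# The Hub id below is an assumption; substitute the actual repository id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/japanese-stablelm-base-gamma-7b"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("日本の首都は", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```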
Please note that both Megatron-LM and DeepSpeed have Pipeline Parallelism and BF16 Optimizer implementations, but we used the ones from DeepSpeed as they are integrated with ZeRO. Megatron-DeepSpeed implements 3D Parallelism to allow huge models to train in a very efficient way. Let's briefly discuss the 3D components; a configuration sketch follows the list.

DataParallel (DP) - the same setup is replicated multiple times, and each replica is fed a slice of the data. The processing is done in parallel, and all setups are synchronized at the end of each training step.
TensorParallel (TP) - each tensor is split into multiple chunks, so instead of the whole tensor residing on a single GPU, each shard resides on its designated GPU. The shards are processed separately and in parallel on different GPUs, and the results are synced at the end of the step.
PipelineParallel (PP) - the model is split vertically (layer-level) across multiple GPUs, so that only one or several layers of the model are placed on a single GPU. Each GPU processes a different stage of the pipeline in parallel, working on a small chunk of the batch.
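A hedged sketch of how the DeepSpeed side of such a setup is typically configured; the ZeRO stage, optimizer, and batch sizes below are illustrative assumptions, not the values of any particular training run:

```python
# Sketch: initialize a model with DeepSpeed using a BF16 + ZeRO config.
# All numbers here are illustrative assumptions, not a real training recipe.
import deepspeed
import torch.nn as nn

model = nn.Linear(1024, 1024)  # stand-in for a real Transformer model

ds_config = {
    "train_micro_batch_size_per_gpu": 2,
    "gradient_accumulation_steps": 8,
    "bf16": {"enabled": True},          # DeepSpeed's BF16 optimizer path
    "zero_optimization": {"stage": 1},  # ZeRO stage 1: shard optimizer states
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# deepspeed.initialize wraps the model in its training engine; in a
# Megatron-DeepSpeed launch, the TP/PP degrees are set on the Megatron side
# (e.g. --tensor-model-parallel-size, --pipeline-model-parallel-size).
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```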
🤗 Welcome to the course! This course will teach you about natural language processing (NLP) using libraries from the Hugging Face ecosystem: 🤗 Transformers, 🤗 Datasets, 🤗 Tokenizers, and 🤗 Accelerate, as well as the Hugging Face Hub. It is completely free and without ads.

What can you learn? Here is a brief overview of the course: Chapters 1 to 4 introduce the main concepts of the 🤗 Transformers library. By the end of this part, you will understand how Transformer models work and know how to use a model from the Hugging Face Hub, fine-tune it on a dataset, and share the results on the Hub! Chapters 5 to 8 teach the basics of 🤗 Datasets and 🤗 Tokenizers before diving into classic NLP tasks.