[B! model] manboubird's Bookmarks

manboubird id:manboubird

manboubird's bookmarks tagged "model" (100)

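Meta builds AI infrastructure on an unprecedented scale, aiming for "full general intelligence" - Nikkei (日本経済新聞)
US-based Meta has begun expanding its investment in IT infrastructure for artificial intelligence (AI) again. It cut capital expenditures (CAPEX) in 2023 as part of its restructuring, but in 2024 it will raise them again and spend $30 billion to $37 billion (about 4.4 trillion to 5.5 trillion yen). CEO Mark Zuckerberg has stated that the company "aims to realize full general intelligence." Although lower than the year before, capital spending in 2023 was itself enormous. The company ...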
manboubird 2024/03/03
meta, generativeAi, ad, llm, nvidia, oss, model

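The Importance of the Tokenizer in Japanese LLMs | Data Analytics Lab (データアナリティクスラボ)
ELYZA-Japanese-Llama-2-7b: ELYZA-Japanese-Llama-2-7b is a Japanese-specialized LLM released by ELYZA. The following models were released: ELYZA-japanese-Llama-2-7b, ELYZA-japanese-Llama-2-7b-fast, ELYZA-japanese-Llama-2-7b-instruct, ELYZA-japanese-Llama-2-7b-fast-instruct. instruct: a model further trained with instruction tuning. fast: a model that speeds up processing by adding Japanese vocabulary. Model overview: ELYZA-japanese-Llama-2-7b is based on Llama 2 and was further trained to improve its Japanese-language performance. By carrying over the language ability of an LLM already trained on English, ...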
manboubird 2024/02/12
llm, tokenizer, generativeAi, nlp, model, sentencePiece

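A note on the "fast" variant in the excerpt above: adding Japanese vocabulary shortens the token sequence for the same text, which is why processing gets faster. The toy below is only a sketch of that effect. It uses a simple greedy longest-match tokenizer with UTF-8 byte fallback, not the actual SentencePiece algorithm or the ELYZA vocabulary, and `tokenize`, `base_vocab`, and `extended_vocab` are hypothetical names invented for illustration.

```python
def tokenize(text, vocab):
    """Greedy longest-match tokenization with UTF-8 byte fallback (toy model)."""
    tokens = []
    max_len = max((len(piece) for piece in vocab), default=1)
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in vocab:
                tokens.append(piece)
                i += length
                break
        else:
            # No vocabulary entry matched: emit one token per UTF-8 byte.
            tokens.extend(f"<0x{byte:02X}>" for byte in text[i].encode("utf-8"))
            i += 1
    return tokens


# Hypothetical vocabularies: an English-only one, and one with Japanese entries added.
base_vocab = {"L", "M", "LLM"}
extended_vocab = base_vocab | {"日本語", "の"}

text = "日本語のLLM"
print(len(tokenize(text, base_vocab)), tokenize(text, base_vocab))
print(len(tokenize(text, extended_vocab)), tokenize(text, extended_vocab))
```

With the English-only vocabulary each Japanese character falls back to three byte tokens, so the same string costs many more tokens than with the extended vocabulary.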
A Comprehensive Survey of Compression Algorithms for Language Models
manboubird 2024/02/02
llm, generativeAi, compression, model, paper, survey

google/siglip-base-patch16-256-multilingual · Hugging Face
manboubird 2024/01/31
siglip, google, model, llm, generativeAi, clip

Grok-1 Model Card by xAI
- 2 users
- x.ai
- Society
Grok-1 is an autoregressive Transformer-based model pre-trained to perform next-token prediction. The model was then fine-tuned using extensive feedback from both humans and the early Grok-0 models. The initial Grok-1 has a context length of 8,192 tokens and is released in Nov 2023. Grok-1 is intended to be used as the engine behind Grok for natural language processing tasks including question ans
manboubird 2023/11/11
grok, xAI, modelCard, model

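Foundation technology for generative AI released free of charge by the lab of University of Tokyo Professor Matsuo - Nikkei (日本経済新聞)
The laboratory of Professor Yutaka Matsuo of the University of Tokyo, known as a leading figure in artificial intelligence (AI) research, announced on the 18th that it has developed a "large language model (LLM)," the foundation technology of generative AI. The model supports Japanese and English and has been released free of charge so that outside researchers and others can use it. The aim is to boost the use of generative AI in Japan. Because LLMs such as US-based OpenAI's "ChatGPT" are trained mainly on English text, text generated in English, compared with Japanese, ...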
manboubird 2023/09/02
llm, model, tokyoUniversity

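Released "Japanese InstructBLIP Alpha," a Japanese vision-language model - Stability AI Japan
Stability AI has publicly released "Japanese InstructBLIP Alpha," a vision-language model for Japanese. In addition to an image-captioning function that generates a text description of an input image, it can answer questions about an image that are entered as text. Japanese InstructBLIP Alpha: "Japanese InstructBLIP Alpha" is a model that generates text from images, built by extending the recently released Japanese instruction-following language model "Japanese StableLM Instruct Alpha 7B." It uses the model architecture of InstructBLIP, a vision-language model for which high performance has been reported. To build a high-performance model from a small Japanese dataset, part of the model ...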
manboubird 2023/08/21
llm, generativeAi, stabilityAi, model

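I tried running "Llama 2," which rivals ChatGPT (3.5), on a local PC
The front-runner in generative AI is ChatGPT, which uses GPT-4 and other models from US-based OpenAI, but the expected challenger is "Llama 2," the large language model from US-based Meta. Llama 2's selling point is performance on par with the March 1 version of GPT-3.5. GPT-3.5 is the model used in the free version of ChatGPT. Having that released as open source is a shock. Beyond high performance, it is also notable for its small model size. GPT-3 has 175 billion (175B) parameters, and GPT-3.5, though undisclosed, is estimated at 355 billion (355B). Llama 2, by contrast, claims GPT-3.5-level performance with 70 billion (70B) parameters. Fewer parameters also mean less GPU memory is required. GPT-3.5 ...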
manboubird 2023/08/21
llama, model, llm

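A note on the excerpt's claim that fewer parameters need less GPU memory: a rough lower bound is parameter count times bytes per parameter. The sketch below assumes 16-bit weights (2 bytes each) and counts weights only, ignoring activations, the KV cache, and framework overhead. `weight_memory_gb` is a hypothetical helper name, and the 355B figure is the estimate quoted in the excerpt, not a confirmed number.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for the model weights alone, in GB (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Parameter counts as quoted in the excerpt above.
models = {
    "GPT-3 (175B)": 175e9,
    "GPT-3.5 (355B, estimate quoted in the excerpt)": 355e9,
    "Llama 2 (70B)": 70e9,
}

for name, num_params in models.items():
    print(f"{name}: about {weight_memory_gb(num_params):.0f} GB of weights at 16-bit")
```

Under these assumptions the 70B model needs roughly 140 GB for weights versus roughly 350 GB for a 175B model; passing `bytes_per_param=1` shows how 8-bit quantization would halve both figures.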
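Releasing a 1.4-billion-parameter Japanese LLM that is also knowledgeable about recent topics
This is Takahiro Omi (近江崇宏) of the Research division. Stockmark is releasing, as open source, a 1.4-billion-parameter Japanese LLM (large language model) that is based on GPT-NeoX and is also knowledgeable about recent topics. The model can be downloaded from the Hugging Face Hub. https://huggingface.co/stockmark/gpt-neox-japanese-1.4b Our company operates services that support information gathering and analysis in business, and for that purpose we collect the latest web data every day. For this pretraining we used not only the Common Crawl-derived data commonly used for LLM pretraining but also our own proprietary web data (up to June 2023), which let us develop a model that is also knowledgeable about recent topics. Specifically, the dataset used for pretraining is CC100's ...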
manboubird 2023/08/09
stockmark, llm, model, japanese

The Falcon has landed in the Hugging Face ecosystem
Falcon is a new family of state-of-the-art language models created by the Technology Innovation Institute in Abu Dhabi, and released under the Apache 2.0 license. Notably, Falcon-40B is the first “truly open” model with capabilities rivaling many current closed-source models. This is fantastic news for practitioners, enthusiasts, and industry, as it opens the door for many exciting use cases. Note
manboubird 2023/08/07
llm, falcon, model, huggingface

GitHub - karpathy/llama2.c: Inference Llama 2 in one file of pure C
Have you ever wanted to inference a baby Llama 2 model in pure C? No? Well, now you can! Train the Llama 2 LLM architecture in PyTorch then inference it with one simple 700-line C file (run.c). You might think that you need many billion parameter LLMs to do anything useful, but in fact very small LLMs can have surprisingly strong performance if you make the domain narrow enough (ref: TinyStories p
manboubird 2023/07/27
llama, model

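University of Tokyo startup releases a 6.7-billion-parameter Japanese LLM as OSS
Lightblue, a startup that originated at the University of Tokyo, has developed a 6.7-billion-parameter Japanese large language model, the largest in Japan among publicly released models, and released it as open-source software. The license is Apache 2.0. The language model is based on "MPT-7B," a multilingual large language model released by US-based MosaicML. It was further trained on the Japanese portion of the per-language subsets that the Allen Institute for AI made available from "MC4," the multilingual dataset developed by Google. Lightblue will offer the released model to corporate customers. By training and tuning it for industry jargon, department-specific terminology, customs, and the like, it will meet needs that differ by company and department. It also plans to offer its own services. (Sasada)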
manboubird 2023/07/26
oss, llm, model, startup

lightblue/japanese-mpt-7b · Hugging Face
Dataset Japanese subset of the mC4 dataset Training Trained for 3000 steps on top of the MPT 7b checkpoint mosaicml/mpt-7b How to load Before running this model, please install the following pip package: pip install einops To load the model, run the following command. from transformers import AutoModelForCausalLM model_name = "lightblue/japanese-mpt-7b" model = AutoModelForCausalLM.from_pretrained
manboubird 2023/07/26
llm, japanese, model

Llama
Llama is the next generation of our open source large language model, available for free for research and commercial use.
manboubird 2023/07/22
llama, model, meta, artificialIntelligence, llm

GitHub - shi3z/peft_pretraining
manboubird 2023/07/16
llm, training, model, llamaIndex, LoRA

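Is this the revolution!? ReLORA arrives, letting you train a large language model from scratch (updated 7/18) | shi3z
Introduction: is it really a revolutionary technology? While we were busy being amazed by the film "君たちはどう生きるか" (The Boy and the Heron), a remarkable paper was stealing the world's attention. Its name is "ReLORA." Put simply, it "uses LoRA for pretraining." This may truly be a revolutionary discovery, so I will explain it carefully, including my own hypotheses. First, as a basic premise, about the technique called "LoRA": LoRA stands for "Low Rank Adaptation" (which might be rendered in Japanese as 低階適応) and has so far been used mainly for fine-tuning. Fine-tuning means giving an already trained neural network additional training to emphasize concepts or teach it new ones. For example, if I fine-tune StableDiffusion on my own face, it keeps producing pictures of faces like mine. LoRA in language models works the same way: new concepts and "this kind of exchange ...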
manboubird 2023/07/16
LoRA, model, generativeAi, llm, paper

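A note on the "Low Rank Adaptation" idea in the excerpt above: in LoRA the pretrained weight matrix W (d x k) stays frozen and only a low-rank update B·A is trained, with B of shape d x r and A of shape r x k. The sketch below is a minimal pure-Python illustration of that update and of the parameter savings. It is not the ReLORA procedure from the bookmarked article, and the function names and the `scale` argument (standing in for the usual alpha/r factor) are hypothetical.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def lora_effective_weight(w, b, a, scale=1.0):
    """Return W + scale * (B @ A); W is frozen, only B and A would be trained."""
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def lora_param_counts(d, k, r):
    """Trainable parameters: full fine-tuning (d*k) versus LoRA (r*(d+k))."""
    return d * k, r * (d + k)


# Tiny example with d=2, k=3, r=1.
w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
b = [[1.0], [2.0]]
a = [[0.5, 0.0, -0.5]]
print(lora_effective_weight(w, b, a))

# Savings for one 4096 x 4096 layer at rank 8 (sizes chosen only for illustration).
full, lora = lora_param_counts(4096, 4096, 8)
print(full, lora, full // lora)
```

For the 4096 x 4096 example, the rank-8 update trains 65,536 values instead of 16,777,216, a 256-fold reduction for that layer.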
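NEC develops and begins offering generative AI for the Japanese market
In response to the industrial changes brought about by generative AI, and to support Japanese companies' efforts to create new corporate value, NEC has developed generative AI that can be customized for each customer. Starting this month it will roll out "NEC Generative AI Service," which covers everything from licenses for its LLM (Large Language Model) to dedicated hardware, software, and consulting services tailored to the needs of the Japanese market. Together with about 10 companies and universities, NEC has also launched the "NEC Generative AI Advanced Customer Program," a customer program that combines NEC's expertise with customers' own knowledge and, working with each customer, comprehensively supports building models for that customer, preparing software for LLM use, setting up organizations, and more. In addition, researchers and ...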
manboubird 2023/07/07
generativeAi, model, llm

https://www.databricks.com/blog/category/generative-ai/mosaic-research
manboubird 2023/06/26
llm, model, mpt30b

sonoisa/sentence-t5-base-ja-mean-tokens · Hugging Face
manboubird 2023/06/22
sentenceT5, model

LLaMA: Open and Efficient Foundation Language Models
We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is co
manboubird 2023/06/19
llama, model, llm, meta