Search results for "language-model": 1 - 40 of 121

  • LMQL (Language Model Query Language) overview | mah_lab / 西見 公宏

    Trying queries in the LMQL Playground: LMQL comes with a Playground that makes it easy to check how queries behave, and the Playground can also be launched locally. First, run the following query from the Getting Started guide: argmax "Hello[WHO]" from "openai/text-ada-001" where len(WHO) < 10. Clicking the "Run" button prompts for an OpenAI API key; enter it, run the query, and the result appears in the Model Response pane. The basic structure of LMQL: notationally, LMQL resembles SQL and is built from the following parts. Decoder clause: specifies the decoding algorithm used for text generation; LMQL lets you choose among a variety of decoding algorithms
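
    One common way to lay out the Getting Started query quoted in the excerpt is clause by clause, as LMQL examples typically format it: the decoder keyword, the prompt template with its [WHO] hole, the model, and the constraint.

        argmax
            "Hello[WHO]"
        from
            "openai/text-ada-001"
        where
            len(WHO) < 10
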
  • Introducing Code Llama, a state-of-the-art large language model for coding

    Today, we are releasing Code Llama, a large language model (LLM) that can use text prompts to generate code. Code Llama is state-of-the-art for publicly available LLMs on code tasks, and has the potential to make workflows faster and more efficient for current developers and lower the barrier to entry for people who are learning to code. Code Llama has the potential to be used as a productivity an

  • "Visual Studio Code" version 1.91 released; the "Chat API" and "Language Model API" for streamlining extension development are now available

    On July 4, 2024 (US time), Microsoft released version 1.91 (June 2024) of "Visual Studio Code" (VS Code), its editor for Windows, Linux, and macOS. Version 1.91 strengthens features related to source control, the workbench, languages, and extensions. The main updates are as follows. Source control: visualize changes as a graph (preview). An experimental feature that visualizes changes as a graph has been introduced; the graph includes the current branch, its upstream branch, and an optional base branch, and the root of the graph is the common ancestor of these branches.

  • What is a large language model (LLM: Large Language Model)?

    A large language model (LLM) is a natural language processing model trained on huge amounts of text data. In general, by fine-tuning a large language model it can be adapted to a variety of natural language processing (NLP: Natural Language Processing) tasks such as text classification, sentiment analysis, information extraction, summarization, text generation, and question answering (Figure 1). Representative examples of large language models include "BERT," announced by Google in 2018, and "GPT-3," announced by OpenAI in 2020. "ChatGPT," announced in December 2022, is based on the "GPT-3.5 series" trained in early 2022

  • GitHub - yandex/YaLM-100B: Pretrained language model with 100B parameters

    YaLM 100B is a GPT-like neural network for generating and processing text. It can be used freely by developers and researchers from all over the world. The model leverages 100 billion parameters. It took 65 days to train the model on a cluster of 800 A100 graphics cards and 1.7 TB of online texts, books, and countless other sources in both English and Russian. Training details and best practices o

  • GitHub - jart/emacs-copilot: Large language model code completion for Emacs

  • GitHub - BlinkDL/ChatRWKV: ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.

  • Meta releases "Meta Large Language Model Compiler," a commercially usable large language model that can compile and optimize code

    Meta has released "Meta Large Language Model Compiler," a large language model that compiles and optimizes code. The models can be used commercially and are hosted on Hugging Face. Meta Large Language Model Compiler: Foundation Models of Compiler Optimization | Research - AI at Meta https://ai.meta.com/research/publications/meta-large-language-model-compiler-foundation-models-of-compiler-optimization/ Today we’re announcing Meta LLM Compiler, a family of models

  • OWASP Top 10 for Large Language Model Applications | OWASP Foundation

    The OWASP Top 10 for Large Language Model Applications project aims to educate developers, designers, architects, managers, and organizations about the potential security risks when deploying and managing Large Language Models (LLMs). The project provides a list of the top 10 most c

  • Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrou

  • How to get meaning from text with language model BERT | AI Explained

    In this video, we give a step-by-step walkthrough of self-attention, the mechanism powering the deep learning model BERT, and other state-of-the-art transformer models for natural language processing (NLP). More on attention and BERT: https://bit.ly/38vpOyW How to solve a text classification problem with BERT with this tutorial: https://bit.ly/2Ij6tGa 0:00 Introduction of NLP 0:39 Text tokenizati

  • GitHub - Hannibal046/Awesome-LLM: Awesome-LLM: a curated list of Large Language Model

    If you're interested in the field of LLM, you may find the above list of milestone papers helpful to explore its history and state-of-the-art. However, each direction of LLM offers a unique set of insights and contributions, which are essential to understanding the field as a whole. For a detailed list of papers in various subfields, please refer to the following link: Awesome-LLM-hallucination -

  • How to train a new language model from scratch using Transformers and Tokenizers

    Over the past few months, we made several improvements to our transformers and tokenizers libraries, with the goal of making it easier than ever to train a new language model from scratch. In this post we’ll demo how to train a “small” model (84 M parameters = 6 layers, 768 hidden size, 12 attention heads) – that’s th
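
    As a rough sketch of the "small" model described in the excerpt (6 layers, 768 hidden size, 12 attention heads, about 84M parameters), the snippet below builds an untrained RoBERTa-style configuration with the transformers library. The vocabulary size of 52,000 is an assumption; the excerpt does not state it, and the parameter count only comes out near 84M under that assumption.

        # Hypothetical sketch: a "small" masked-language-model config roughly
        # matching the excerpt (6 layers, 768 hidden size, 12 attention heads).
        from transformers import RobertaConfig, RobertaForMaskedLM

        config = RobertaConfig(
            vocab_size=52_000,            # assumed tokenizer size, not given in the excerpt
            max_position_embeddings=514,
            num_hidden_layers=6,
            hidden_size=768,
            num_attention_heads=12,
        )
        model = RobertaForMaskedLM(config)
        print(f"{model.num_parameters():,} parameters")  # roughly 84M with the assumed vocabulary
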
  • Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

    We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone. The innovation lies entirely in our dataset

  • How trying LayoutLM (Layout Language Model) massively improved accuracy | Cinnamon Inc. (Cinnamon AI)

    Hello, this is the Cinnamon AI PR team. Cinnamon AI provides Aurora Clipper, a product built on natural language processing that is used for a wide range of purposes, such as extracting dates that carry a specific context (event dates, contract dates, and so on) and person names (the parties to a contract), pulling key points out of long documents, and classifying text. This time, Fujii, who leads the development of Aurora Clipper, introduces the results of experimenting with an algorithm called LayoutLM as a base model for Aurora Clipper. What is LayoutLM, which uses the position of text as a feature? LayoutLM (Layout Language Model) is a model from Microsoft Re

  • BloombergGPT: A Large Language Model for Finance

    The use of NLP in the realm of financial technology is broad and complex, with applications ranging from sentiment analysis and named entity recognition to question answering. Large Language Models (LLMs) have been shown to be effective on a variety of tasks; however, no LLM specialized for the financial domain has been reported in literature. In this work, we present BloombergGPT, a 50 billion pa

  • GitHub - hiroshi-matsuda-rit/NLP2024-tutorial-3: NLP2024 チュートリアル3 作って学ぶ日本語大規模言語モデル - 環境構築手順とソースコード / NLP2024 Tutorial 3: Practicing how to build a Japanese large-scale language model - Environment construction and experimental source codes

  • GitHub - XiongjieDai/GPU-Benchmarks-on-LLM-Inference: Multiple NVIDIA GPUs or Apple Silicon for Large Language Model Inference?

  • GitHub - tanreinama/GPTSAN: General-purpose Swich transformer based Japanese language model

  • GitHub - SJTU-IPADS/PowerInfer: High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

  • TextPruner による大規模言語モデルの軽量化 / Large language model pruning using TextPruner

    Slides from a lightning talk (LT) given at NLP Hacks on 2022/05/13.

  • The Nikkei announces the development of the "NIKKEI Language Model," a large language model specialized for economic information and refined on 40 years of newspaper articles | Ledge.ai

  • The Rise and Potential of Large Language Model Based Agents: A Survey

    For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing the human level, with AI agents considered a promising vehicle for this pursuit. AI agents are artificial entities that sense their environment, make decisions, and take actions. Many efforts have been made to develop intelligent agents, but they mainly focus on advancement in algorithms or training stra

  • Cramming: Training a Language Model on a Single GPU in One Day

    Recent trends in language modeling have focused on increasing performance through scaling, and have resulted in an environment where training language models is out of reach for most researchers and practitioners. While most in the community are asking how to push the limits of extreme computation, we ask the opposite question: How far can we get with a single GPU in just one day? We investigate t

  • RAFT: Adapting Language Model to Domain Specific RAG

    Pretraining Large Language Models (LLMs) on large corpora of textual data is now a standard paradigm. When using these LLMs for many downstream applications, it is common to additionally bake in new knowledge (e.g., time-critical news, or private domain knowledge) into the pretrained model either through RAG-based-prompting, or fine-tuning. However, the optimal methodology for the model to gain su

  • PaLM-E: An Embodied Multimodal Language Model

    Danny Driess, Fei Xia, Mehdi S. M. Sajjadi, Corey Lynch, Aakanksha Chowdhery, Brian Ichter, Ayzaan Wahid, Jonathan Tompson, Quan Vuong, Tianhe Yu, Wenlong Huang, Yevgen Chebotar, Pierre Sermanet, Daniel Duckworth, Sergey Levine, Vincent Vanhoucke, Karol Hausman, Marc Toussaint, Klaus Greff, Andy Zeng, Igor Mordatch, Pete Florence. Abstract: Large language models have been demonstrated to pe

  • ScreenAI: A visual language model for UI and visually-situated language understanding

  • GitHub - pfnet-research/japanese-lm-fin-harness: Japanese Language Model Financial Evaluation Harness

  • OpenAI's GPT-3 Language Model: A Technical Overview

    Notice: GPT-2 1.5B is trained with 40GB of Internet text, which is roughly 10 Billion tokens (conversely assuming the average token size is 4 characters). So GPT-3 175B has a lower data compression ratio 300 / 175 = 1.71 in comparison to GPT-2 1.5B 10 / 1.5 = 6.66. This raises the question of whether, with this many parameters, the model functions by memorizing the data in the training and p
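
    Spelling out the ratios quoted in the excerpt: they are training tokens divided by parameters, both in billions. The 300-billion-token figure for GPT-3 is the commonly cited size of its training set and is an assumption here, since the excerpt only shows the result of the division.

        # Tokens-per-parameter ratios from the excerpt (all figures in billions).
        gpt3_ratio = 300 / 175   # ~1.71 tokens per parameter (300B tokens / 175B parameters, assumed)
        gpt2_ratio = 10 / 1.5    # ~6.7 tokens per parameter (10B tokens / 1.5B parameters)
        print(f"GPT-3: {gpt3_ratio:.2f}, GPT-2: {gpt2_ratio:.2f}")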

  • GitHub - WooooDyy/LLM-Agent-Paper-List: The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.

  • We want to get more out of Large Language Models (LLMs)! We tried "LangChain" - CCCMK Holdings TECH Lab Tech Blog

    Hello, this is Miura from CCCMK Holdings TECH LAB. There is an English study method called "shadowing" that I have been trying recently: you listen to English audio and repeat it as you follow along, which is said to improve listening and speaking. When I try to pronounce English my mouth does not move the way I want, apparently in part because the muscles around the mouth needed for speaking English are not yet trained; I have started pronunciation practice while watching videos and hope it improves. Lately, new information about Large Language Models (LLMs) turns up online almost every day, and it really feels like a hot topic. On this blog, too, we recently covered Prompt Engineering techniques for giving better instructions to LLMs, drawing on recently published papers and

  • GitHub - databrickslabs/dolly: Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform

  • Stealing Part of a Production Language Model

    We introduce the first model-stealing attack that extracts precise, nontrivial information from black-box production language models like OpenAI's ChatGPT or Google's PaLM-2. Specifically, our attack recovers the embedding projection layer (up to symmetries) of a transformer model, given typical API access. For under $20 USD, our attack extracts the entire projection matrix of OpenAI's Ada and Ba

  • How the RWKV language model works

    In this post, I will explain the details of how RWKV generates text. For a high level overview of what RWKV is and what is so special about it, check out the other post about RWKV. To explain exactly how RWKV works, I think it is easiest to look at a simple implementation of it. The following ~100 line code (based on RWKV in 150 lines) is a minimal implementation of a relatively small (430m parame

  • The RWKV language model: An RNN with the advantages of a transformer

    For a while, I’ve been following and contributing to the RWKV language model, an open source large language model with great potential. As ChatGPT and large language models in general have gotten a lot of attention recently, I think it’s a good time to write about RWKV. In this post, I will try to explain what is so special about RWKV compared to most language models (transformers). The other RWKV

  • Mixture-of-Agents Enhances Large Language Model Capabilities

    Recent advances in large language models (LLMs) demonstrate substantial capabilities in natural language understanding and generation tasks. With the growing number of LLMs, how to harness the collective expertise of multiple LLMs is an exciting open direction. Toward this goal, we propose a new approach that leverages the collective strengths of multiple LLMs through a Mixture-of-Agents (MoA) met

  • Mapping the Mind of a Large Language Model

    Today we report a significant advance in understanding the inner workings of AI models. We have identified how millions of concepts are represented inside Claude Sonnet, one of our deployed large language models. This is the first ever detailed look inside a modern, production-grade large language model. This interpretability discovery could, in future, help us make AI models safer. We mostly trea

  • GitHub - salesforce/ctrl: Conditional Transformer Language Model for Controllable Generation

  • Jamba: A Hybrid Transformer-Mamba Language Model

    We present Jamba, a new base large language model based on a novel hybrid Transformer-Mamba mixture-of-experts (MoE) architecture. Specifically, Jamba interleaves blocks of Transformer and Mamba layers, enjoying the benefits of both model families. MoE is added in some of these layers to increase model capacity while keeping active parameter usage manageable. This flexible architecture allows reso

  • Turing-NLG: A 17-billion-parameter language model by Microsoft - Microsoft Research

    This figure was adapted from a similar image published in DistilBERT. Turing Natural Language Generation (T-NLG) is a 17 billion parameter language model by Microsoft that outperforms the state of the art on many downstream NLP tasks. We present a demo of the model, including its freeform generation, question answering, and summarization capabilities, to academics for feedback a