Search results for "language-model": 1 - 40 of 47

  • An Overview of LMQL (Language Model Query Language) | mah_lab / 西見 公宏

    Trying queries in the LMQL Playground: LMQL ships with a Playground where you can easily check its behavior, and the Playground can also be run locally. Start with the following query from Getting Started: argmax "Hello[WHO]" from "openai/text-ada-001" where len(WHO) < 10. Clicking the "Run" button prompts for an OpenAI API key; enter it, and the result appears in the Model Response pane. The basic structure of LMQL: notationally, LMQL resembles SQL and is built from parts such as the decoder clause, which specifies the decoding algorithm used for text generation; LMQL lets you choose among a variety of decoding algorithms

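    Below is a minimal sketch of running the same query from Python. Only the query itself comes from the article; the wrapper assumes the lmql package, its @lmql.query decorator, and an OpenAI API key in the environment, so treat it as illustrative.

        import lmql

        # Hedged sketch: wraps the Getting Started query quoted above in the
        # lmql Python package's @lmql.query decorator (an assumption; the
        # article only shows the query run in the Playground).
        @lmql.query
        def hello():
            '''lmql
            argmax
                "Hello[WHO]"
            from
                "openai/text-ada-001"
            where
                len(WHO) < 10
            '''

        # Prints the text generated for the [WHO] placeholder.
        print(hello())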
  • Introducing Code Llama, a state-of-the-art large language model for coding

    Today, we are releasing Code Llama, a large language model (LLM) that can use text prompts to generate code. Code Llama is state-of-the-art for publicly available LLMs on code tasks, and has the potential to make workflows faster and more efficient for current developers and lower the barrier to entry for people who are learning to code. Code Llama has the potential to be used as a productivity an

  • What Is a Large Language Model (LLM)?

    What is a large language model (LLM)? AI & Machine Learning Glossary, series index. Definition: a large language model (LLM) is a natural language processing model trained on a massive amount of text data. In general, by fine-tuning a large language model it can be adapted to a wide range of natural language processing (NLP) tasks such as text classification, sentiment analysis, information extraction, summarization, text generation, and question answering (Figure 1). Representative examples of large language models include BERT, announced by Google in 2018, and GPT-3, announced by OpenAI in 2020. ChatGPT, announced in December 2022, is based on the "GPT-3.5 series" trained in early 2022

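    The glossary entry above notes that a fine-tuned LLM can be adapted to tasks such as sentiment analysis or text classification. As a rough illustration (not from the article), the Hugging Face transformers pipeline API wraps such a fine-tuned model in a single call; the default checkpoint is chosen by the library.

        from transformers import pipeline

        # Loads a default sentiment-analysis checkpoint chosen by the library.
        classifier = pipeline("sentiment-analysis")
        print(classifier("Large language models adapt well to many NLP tasks."))
        # e.g. [{'label': 'POSITIVE', 'score': 0.99...}]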
  • GitHub - yandex/YaLM-100B: Pretrained language model with 100B parameters

    YaLM 100B is a GPT-like neural network for generating and processing text. It can be used freely by developers and researchers from all over the world. The model leverages 100 billion parameters. It took 65 days to train the model on a cluster of 800 A100 graphics cards and 1.7 TB of online texts, books, and countless other sources in both English and Russian. Training details and best practices o

  • GitHub - jart/emacs-copilot: Large language model code completion for Emacs

  • GitHub - BlinkDL/ChatRWKV: ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.

  • OWASP Top 10 for Large Language Model Applications | OWASP Foundation

    The OWASP Top 10 for Large Language Model Applications project aims to educate developers, designers, architects, managers, and organizations about the potential security risks when deploying and managing Large Language Models (LLMs). The project provides a list of the top 10 most c

  • Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrou

  • How to get meaning from text with language model BERT | AI Explained

    In this video, we give a step-by-step walkthrough of self-attention, the mechanism powering the deep learning model BERT, and other state-of-the-art transformer models for natural language processing (NLP). More on attention and BERT: https://bit.ly/38vpOyW How to solve a text classification problem with BERT with this tutorial: https://bit.ly/2Ij6tGa 0:00 Introduction of NLP 0:39 Text tokenizati

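    The video centers on self-attention; as a generic illustration (not the video's own code), here is scaled dot-product attention in a few lines of NumPy.

        import numpy as np

        def scaled_dot_product_attention(Q, K, V):
            """softmax(Q K^T / sqrt(d_k)) V, the core of transformer attention."""
            d_k = K.shape[-1]
            scores = Q @ K.T / np.sqrt(d_k)                    # query-key similarities
            weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
            weights /= weights.sum(axis=-1, keepdims=True)     # softmax over keys
            return weights @ V                                 # weighted sum of values

        # Toy example: 3 tokens with 4-dimensional embeddings (random stand-ins).
        rng = np.random.default_rng(0)
        Q = K = V = rng.normal(size=(3, 4))
        print(scaled_dot_product_attention(Q, K, V).shape)     # (3, 4)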
  • How to train a new language model from scratch using Transformers and Tokenizers

    How to train a new language model from scratch using Transformers and Tokenizers Over the past few months, we made several improvements to our transformers and tokenizers libraries, with the goal of making it easier than ever to train a new language model from scratch. In this post we’ll demo how to train a “small” model (84 M parameters = 6 layers, 768 hidden size, 12 attention heads) – that’s th

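    A hedged sketch of configuring a model of roughly the size quoted above with the transformers library, using a RoBERTa-style config (the vocabulary size of 52,000 is an assumption, not stated in the excerpt):

        from transformers import RobertaConfig, RobertaForMaskedLM

        # 6 layers, hidden size 768, 12 attention heads, as quoted above.
        config = RobertaConfig(
            vocab_size=52_000,            # assumed; depends on the tokenizer you train
            num_hidden_layers=6,
            hidden_size=768,
            num_attention_heads=12,
            max_position_embeddings=514,
        )
        model = RobertaForMaskedLM(config)
        print(f"{model.num_parameters():,} parameters")  # on the order of 84M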
  • Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

    We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone. The innovation lies entirely in our dataset

  • How accuracy jumped when we tried LayoutLM (Layout Language Model) | Cinnamon Inc. (Cinnamon AI)

    Tech: How accuracy jumped when we tried LayoutLM (Layout Language Model). 2021.01.18. Hello from the Cinnamon AI PR team. Cinnamon AI provides Aurora Clipper, a product built on natural language processing that is used for purposes such as extracting dates that carry a specific context (event dates, contract dates, and so on) and person names (parties to a contract), pulling key points out of long documents, and classifying text. In this post, 藤井, who leads development of Aurora Clipper, presents the results of experiments with an algorithm called LayoutLM as a base model for Aurora Clipper. What is LayoutLM, which uses the position of text as a feature? LayoutLM (Layout Language Model) is a model from Microsoft Re

  • GitHub - Hannibal046/Awesome-LLM: Awesome-LLM: a curated list of Large Language Model

    If you're interested in the field of LLM, you may find the above list of milestone papers helpful to explore its history and state-of-the-art. However, each direction of LLM offers a unique set of insights and contributions, which are essential to understanding the field as a whole. For a detailed list of papers in various subfields, please refer to the following link: Awesome-LLM-hallucination -

  • BloombergGPT: A Large Language Model for Finance

    The use of NLP in the realm of financial technology is broad and complex, with applications ranging from sentiment analysis and named entity recognition to question answering. Large Language Models (LLMs) have been shown to be effective on a variety of tasks; however, no LLM specialized for the financial domain has been reported in literature. In this work, we present BloombergGPT, a 50 billion pa

  • GitHub - tanreinama/GPTSAN: General-purpose Swich transformer based Japanese language model

  • GitHub - SJTU-IPADS/PowerInfer: High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

  • Slimming down large language models with TextPruner / Large language model pruning using TextPruner

    Slides from a lightning talk given at NLP Hacks on 2022/05/13.

  • Nikkei announces development of the "NIKKEI Language Model", a large language model specialized for economic information and honed on 40 years of article data | Ledge.ai

  • The Rise and Potential of Large Language Model Based Agents: A Survey

    For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing the human level, with AI agents considered a promising vehicle for this pursuit. AI agents are artificial entities that sense their environment, make decisions, and take actions. Many efforts have been made to develop intelligent agents, but they mainly focus on advancement in algorithms or training stra

  • Cramming: Training a Language Model on a Single GPU in One Day

    Recent trends in language modeling have focused on increasing performance through scaling, and have resulted in an environment where training language models is out of reach for most researchers and practitioners. While most in the community are asking how to push the limits of extreme computation, we ask the opposite question: How far can we get with a single GPU in just one day? We investigate t

  • RAFT: Adapting Language Model to Domain Specific RAG

    Pretraining Large Language Models (LLMs) on large corpora of textual data is now a standard paradigm. When using these LLMs for many downstream applications, it is common to additionally bake in new knowledge (e.g., time-critical news, or private domain knowledge) into the pretrained model either through RAG-based-prompting, or fine-tuning. However, the optimal methodology for the model to gain su

  • PaLM-E: An Embodied Multimodal Language Model

    Danny Driess, Fei Xia, Mehdi S. M. Sajjadi, Corey Lynch, Aakanksha Chowdhery, Brian Ichter, Ayzaan Wahid, Jonathan Tompson, Quan Vuong, Tianhe Yu, Wenlong Huang, Yevgen Chebotar, Pierre Sermanet, Daniel Duckworth, Sergey Levine, Vincent Vanhoucke, Karol Hausman, Marc Toussaint, Klaus Greff, Andy Zeng, Igor Mordatch, Pete Florence. Abstract: Large language models have been demonstrated to pe

  • GitHub - pfnet-research/japanese-lm-fin-harness: Japanese Language Model Financial Evaluation Harness

  • OpenAI's GPT-3 Language Model: A Technical Overview

    Notice: GPT-2 1.5B is trained with 40GB of Internet text, which is roughly 10 billion tokens (conservatively assuming the average token size is 4 characters). So GPT-3 175B has a lower data-compression ratio, 300 / 175 = 1.71, compared with GPT-2 1.5B's 10 / 1.5 = 6.66. This raises the question of whether, with this many parameters, the model functions by memorizing the data in the training and p
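    The ratios quoted above are simply training tokens divided by parameter count; a quick check of the arithmetic (token counts are the figures given in the excerpt):

        # Training tokens divided by parameters, per the figures quoted above.
        gpt2_ratio = 10e9 / 1.5e9     # GPT-2: ~10B tokens, 1.5B parameters
        gpt3_ratio = 300e9 / 175e9    # GPT-3: ~300B tokens, 175B parameters
        print(f"GPT-2: {gpt2_ratio:.2f} tokens/parameter")   # ~6.67
        print(f"GPT-3: {gpt3_ratio:.2f} tokens/parameter")   # ~1.71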

  • GitHub - WooooDyy/LLM-Agent-Paper-List: The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.

    For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing human level, with AI agents considered as a promising vehicle of this pursuit. AI agents are artificial entities that sense their environment, make decisions, and take actions. Due to the versatile and remarkable capabilities they demonstrate, large language models (LLMs) are regarded as potential sparks

  • GitHub - XiongjieDai/GPU-Benchmarks-on-LLM-Inference: Multiple NVIDIA GPUs or Apple Silicon for Large Language Model Inference?

  • I want to get more out of Large Language Models (LLMs)! Trying "LangChain" - CCCMK Holdings TECH Lab Tech Blog

    Hello, this is Miura from the CCCMK Holdings TECH LAB. I have recently been trying an English study method called "shadowing": you listen to English audio and pronounce it right behind the speaker, which is said to improve listening and speaking. When I try to pronounce English, my mouth does not move as smoothly as I would like; apparently one factor is that the muscles used for speaking English are not yet trained. I have started practicing along with videos and hope to see improvement. Lately, new information about Large Language Models (LLMs) appears on the internet almost every day; it really is a hot topic. On this blog, too, we have recently looked at Prompt Engineering techniques for giving LLMs better instructions, drawing on recently published papers and

  • GitHub - databrickslabs/dolly: Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform

    Databricks’ Dolly is an instruction-following large language model trained on the Databricks machine learning platform that is licensed for commercial use. Based on pythia-12b, Dolly is trained on ~15k instruction/response fine tuning records databricks-dolly-15k generated by Databricks employees in capability domains from the InstructGPT paper, including brainstorming, classification, closed QA,

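    A hedged sketch of loading an instruction-following Dolly checkpoint with the transformers pipeline API. The databricks/dolly-v2-12b checkpoint name, dtype, and device settings are assumptions drawn from the model's Hugging Face card, not from the excerpt above, and a large GPU is required.

        import torch
        from transformers import pipeline

        # Dolly ships custom generation code, hence trust_remote_code=True.
        generate_text = pipeline(
            model="databricks/dolly-v2-12b",   # assumed checkpoint name
            torch_dtype=torch.bfloat16,
            trust_remote_code=True,
            device_map="auto",
        )
        print(generate_text("Explain what an instruction-following language model is."))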
  • Stealing Part of a Production Language Model

    We introduce the first model-stealing attack that extracts precise, nontrivial information from black-box production language models like OpenAI's ChatGPT or Google's PaLM-2. Specifically, our attack recovers the embedding projection layer (up to symmetries) of a transformer model, given typical API access. For under $20 USD, our attack extracts the entire projection matrix of OpenAI's Ada and Ba

  • How the RWKV language model works

    In this post, I will explain the details of how RWKV generates text. For a high level overview of what RWKV is and what is so special about it, check out the other post about RWKV. To explain exactly how RWKV works, I think it is easiest to look at a simple implementation of it. The following ~100 line code (based on RWKV in 150 lines) is a minimal implementation of a relatively small (430m parame

  • The RWKV language model: An RNN with the advantages of a transformer

    For a while, I’ve been following and contributing to the RWKV language model, an open source large language model with great potential. As ChatGPT and large language models in general have gotten a lot of attention recently, I think it’s a good time to write about RWKV. In this post, I will try to explain what is so special about RWKV compared to most language models (transformers). The other RWKV

  • Mapping the Mind of a Large Language Model

    Today we report a significant advance in understanding the inner workings of AI models. We have identified how millions of concepts are represented inside Claude Sonnet, one of our deployed large language models. This is the first ever detailed look inside a modern, production-grade large language model. This interpretability discovery could, in future, help us make AI models safer. We mostly trea

  • GitHub - salesforce/ctrl: Conditional Transformer Language Model for Controllable Generation

  • Introducing LLaMA: A foundational, 65-billion-parameter language model

    Introducing LLaMA: A foundational, 65-billion-parameter large language model UPDATE: We just launched Llama 2 - for more information on the latest see our blog post on Llama 2. As part of Meta’s commitment to open science, today we are publicly releasing LLaMA (Large Language Model Meta AI), a state-of-the-art foundational large language model designed to help researchers advance their work in thi

  • Turing-NLG: A 17-billion-parameter language model by Microsoft - Microsoft Research

    Turing Natural Language Generation (T-NLG) is a 17 billion parameter language model by Microsoft that outperforms the state of the art on many downstream NLP tasks. We present a demo of the model, including its freeform generation, question answering, and summarization capabilities, to academics for feedback a

  • GitHub - Lightning-AI/lit-llama: Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.

  • ScreenAI: A visual language model for UI and visually-situated language understanding

  • The New Language Model Stack

    ChatGPT unleashed a tidal wave of innovation with large language models (LLMs). More companies than ever before are bringing the power of natural language interaction to their products. The adoption of language model APIs is creating a new stack in its wake. To better understand the applications people are building and the stacks they are using to do so, we spoke with 33 companies across the Sequo

  • GitHub - neuml/txtai: 💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

    All-in-one embeddings database: txtai is an all-in-one embeddings database for semantic search, LLM orchestration and language model workflows. Embeddings databases are a union of vector indexes (sparse and dense), graph networks and relational databases. This enables vector search with SQL, topic modeling, retrieval augmented generation and more. Embeddings databases can stand on their own and/or

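    A hedged sketch of the semantic-search workflow described above, using txtai's Embeddings API; the model path and sample texts are assumptions based on txtai's documented examples, not on the README excerpt itself.

        from txtai.embeddings import Embeddings

        # Build an in-memory embeddings index (model path is assumed).
        embeddings = Embeddings({"path": "sentence-transformers/all-MiniLM-L6-v2"})

        data = [
            "txtai builds an embeddings index over text",
            "large language models generate text from prompts",
            "vector search finds semantically similar documents",
        ]

        # Index (id, text, tags) tuples, then run a semantic query.
        embeddings.index([(uid, text, None) for uid, text in enumerate(data)])
        print(embeddings.search("semantic similarity lookup", 1))  # [(id, score)]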
  • DarkBERT: A Language Model for the Dark Side of the Internet

    Recent research has suggested that there are clear differences in the language used in the Dark Web compared to that of the Surface Web. As studies on the Dark Web commonly require textual analysis of the domain, language models specific to the Dark Web may provide valuable insights to researchers. In this work, we introduce DarkBERT, a language model pretrained on Dark Web data. We describe the s