[B! NLP] hagino_3000's bookmarks

awesome-japanese-nlp-resources/docs/README.ja.md at main · taishi-i/awesome-japanese-nlp-resources

hagino_3000 2025/01/17

NLP

Link

Tips for Fine-Tuning BERT Learned on Kaggle ⑤: Making Use of Unlabeled Data | AI Shift Inc.

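Hello! This is Toda from the AI team. In this article I share tips for fine-tuning Transformer-based pretrained models that I picked up by taking part in Kaggle competitions. I have written several earlier articles on the same theme: Tips for Fine-Tuning BERT Learned on Kaggle ①: Training Efficiency; Tips for Fine-Tuning BERT Learned on Kaggle ②: Accuracy Improvement; Tips for Fine-Tuning BERT Learned on Kaggle ③: Overfitting Suppression; Tips for Fine-Tuning BERT Learned on Kaggle ④: Adversarial Training. This time I write about making use of unlabeled data. Many attempts have been made to solve a wide range of real-world problems through supervised learning on large amounts of accumulated data...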

hagino_3000 2024/10/09

NLP
BERT

Link

How to Fine-tune BERT Model for NER on a Custom Dataset

Fig 1: A Transformers Pipeline (Image from Hugging Face NLP course). Introduction: In the world of Natural Language Processing (NLP), Named Entity Recognition (NER) is an important technique for identifying and extracting key entities or fields from a given text. For example, one common use case is extracting a candidate's name, education, skills, and previous employers from a resume/

hagino_3000 2024/08/29

NLP

Link

Comparing Sentence Embedding Models in the Medical Domain - Qiita

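Preface: Hallucination is one of the challenges facing large language models (LLMs) such as ChatGPT. It is an especially important problem in fields such as medicine, where the accuracy of the content matters. One countermeasure is Retrieval-Augmented Generation (RAG), in which the LLM is given accurate information from an external database before it generates text. RAG has to retrieve the relevant information accurately, so a sentence embedding model that produces embeddings reflecting the precise meaning of a text is important. I therefore compare and evaluate sentence embedding models on Japanese medical text using the Semantic Textual Similarity (STS) task. 1. Method 1.1. Models evaluated: For now I use the following five models that caught my eye. Everything other than OpenAI/text-embedding-ada-002 is Huggi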