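“An explanation of a technique called 'abliteration' that can uncensor an LLM without retraining. It effectively removes the refusal mechanism built into the model so that it responds to all types of prompts.”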

Natural language processing

Bookmarked by misshiki, 2024/06/14 15:13



Uncensor any LLM with abliteration

huggingface.co, 2024/06/13

The third generation of Llama models provided fine-tuned (Instruct) versions that excel in understanding and following instructions. However, these models are heavily censored, designed to refuse r...

2 users bookmarked this, 1 comment
