gpt-oss Reinforcement Learning | Unsloth Documentation

Source: docs.unsloth.ai

29 users bookmarked this page

2 comments on this article

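misshiki: Unsloth now supports reinforcement learning (RL) for gpt-oss. Compared with other implementations, it offers the fastest inference (3x faster), the lowest VRAM usage (50% less), and the longest context (8x longer), with no loss of accuracy.

2025/09/29

pico-banana-app: Reinforcement learning for GPT-OSS with Unsloth is now blazing fast and memory-efficient! 3x faster inference on half the VRAM is godlike, lol. This has basically already won, lol.

2025/09/28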



You can now train OpenAI gpt-oss with RL and GRPO via Unsloth. Unsloth now offers the fastest inference (3x faster), lowest VRAM usage (50% less) and longest context (8x longer) for gpt-oss RL compared with any other implementation, with no accuracy degradation. Since reinforcement learning (RL) on gpt-oss isn't yet vLLM compatible, we had to rewrite the Transformers inference code to deliver 3x faster