GitHub - openai/evals: Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
You can now configure and run Evals directly in the OpenAI Dashboard. Get started → Evals provide a framework for evaluating large language models (LLMs) or systems built using LLMs. We offer an existing registry of evals to test different dimensions of OpenAI models and the ability to write your own custom evals for use cases you care about. You can also use your data to build private evals which …
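As a rough illustration of what such a custom or private eval could look like, the sketch below writes a tiny dataset in the JSONL layout used by the repo's basic match-style evals, where each line holds chat-style "input" messages and an "ideal" answer. The file name, sample contents, and the eval name mentioned in the closing comment are assumptions for illustration only; registering the eval in the repo's YAML registry and running it (for example with the `oaieval` command-line tool) is only indicated in comments, not performed here.

```python
import json

# Illustrative samples in the "input"/"ideal" JSONL format expected by
# openai/evals match-style evals. Contents are made up for this sketch.
samples = [
    {
        "input": [
            {"role": "system", "content": "Answer with a single word."},
            {"role": "user", "content": "What is the capital of France?"},
        ],
        "ideal": "Paris",
    },
    {
        "input": [
            {"role": "system", "content": "Answer with a single word."},
            {"role": "user", "content": "What is 2 + 2?"},
        ],
        "ideal": "4",
    },
]

# Write one JSON object per line; the file name is an assumption.
with open("my_private_eval.jsonl", "w", encoding="utf-8") as f:
    for sample in samples:
        f.write(json.dumps(sample, ensure_ascii=False) + "\n")

# After adding a matching entry to the registry YAML, the eval could be
# run from the command line, e.g.:
#   oaieval gpt-3.5-turbo my-private-eval
```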