Today, we’re releasing LLaMA-2-7B-32K, a 32K-context model built using Position Interpolation together with Together AI’s data recipe and system optimizations, including FlashAttention-2. You can fine-tune the model for targeted long-context tasks, such as multi-document understanding, summarization, and QA, and run inference and fine-tuning at 32K context with up to 3x speedup.
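As a rough illustration of the core idea behind Position Interpolation: instead of extrapolating rotary position embeddings (RoPE) past the trained context length, positions in the extended window are linearly scaled down so they fall back inside the original range. The sketch below is a minimal NumPy illustration under assumed parameters (original context 4096, extended context 32768, head dimension 128); it is not the production implementation.

```python
import numpy as np

def rope_angles(positions, dim=128, base=10000.0, scale=1.0):
    # Standard RoPE angle computation: one inverse frequency per
    # pair of dimensions, multiplied by the (optionally scaled) position.
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    pos = np.asarray(positions, dtype=np.float64) * scale
    return np.outer(pos, inv_freq)  # shape: (len(positions), dim // 2)

# Position Interpolation: squeeze the 32K window into the trained 4K range.
scale = 4096 / 32768  # = 0.125 (assumed original/extended lengths)

# The last position of the extended window (32767), interpolated...
angles_interp = rope_angles([32767], scale=scale)
# ...yields the same angles as an in-range position of the original model,
# so the model never sees rotation angles beyond what it was trained on.
angles_orig = rope_angles([32767 * scale])
```

Because the interpolated angles stay within the distribution seen during pre-training, a comparatively short fine-tuning run suffices to adapt the model to the longer window.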
*Preparing for the era of 32K context: Early learnings and explorations*