GPT-Neo is the code name for a series of transformer-based language models, loosely styled around the GPT architecture, that we plan to train and open-source. Our primary goal is to replicate a GPT-3-sized model and release it to the public for free. Along the way we will be running experiments with alternative architectures and attention types, releasing any intermediate models, and writing up