To address these limitations, we are releasing ZeRO++, a system of communication optimization strategies built on top of ZeRO to offer unmatched efficiency for large model training, regardless of batch size limitations or cross-device bandwidth constraints. ZeRO++ leverages quantization, in combination with data and communication remapping, to reduce total communication volume by 4x compared with