NVIDIA NCCL

The NVIDIA Collective Communication Library (NCCL) implements multi-GPU and multi-node communication primitives optimized for NVIDIA GPUs and networking. NCCL provides routines such as all-gather, all-reduce, broadcast, reduce, and reduce-scatter, as well as point-to-point send and receive, that are optimized to achieve high bandwidth and low latency over PCIe and NVLink high-speed interconnects.
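To make the collective routines named above concrete, here is a minimal sketch of what each one computes, written as a pure-Python simulation. It does not call NCCL and says nothing about how NCCL moves data over PCIe or NVLink. The function names, the list-of-lists representation (one list per simulated rank), and the choice of sum as the reduction operator are all assumptions made for illustration.

```python
from typing import List, Optional

# One inner list per simulated rank (hypothetical representation, not NCCL's).
Buffers = List[List[int]]


def _elementwise_sum(bufs: Buffers) -> List[int]:
    """Sum the buffers of all ranks position by position."""
    return [sum(vals) for vals in zip(*bufs)]


def all_reduce(bufs: Buffers) -> Buffers:
    """Every rank ends up with the element-wise sum of all inputs."""
    total = _elementwise_sum(bufs)
    return [list(total) for _ in bufs]


def reduce(bufs: Buffers, root: int = 0) -> List[Optional[List[int]]]:
    """Only the root rank receives the element-wise sum."""
    total = _elementwise_sum(bufs)
    return [total if rank == root else None for rank in range(len(bufs))]


def broadcast(bufs: Buffers, root: int = 0) -> Buffers:
    """Every rank ends up with a copy of the root's buffer."""
    return [list(bufs[root]) for _ in bufs]


def all_gather(bufs: Buffers) -> Buffers:
    """Every rank ends up with all inputs concatenated in rank order."""
    gathered = [x for buf in bufs for x in buf]
    return [list(gathered) for _ in bufs]


def reduce_scatter(bufs: Buffers) -> Buffers:
    """Sum element-wise, then give rank r the r-th equal-sized chunk."""
    total = _elementwise_sum(bufs)
    chunk = len(total) // len(bufs)  # assumes the length divides evenly
    return [total[r * chunk:(r + 1) * chunk] for r in range(len(bufs))]


if __name__ == "__main__":
    ranks = [[1, 2], [3, 4]]  # two simulated ranks, two elements each
    print("all_reduce    ", all_reduce(ranks))
    print("reduce        ", reduce(ranks, root=0))
    print("broadcast     ", broadcast(ranks, root=1))
    print("all_gather    ", all_gather(ranks))
    print("reduce_scatter", reduce_scatter(ranks))
```

With two simulated ranks holding `[1, 2]` and `[3, 4]`, all-reduce leaves `[4, 6]` on both ranks, while reduce-scatter leaves `[4]` on rank 0 and `[6]` on rank 1. In that sense an all-reduce gives the same result as a reduce-scatter followed by an all-gather.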