NVIDIA® TensorRT™ is an ecosystem of APIs for high-performance deep learning inference. TensorRT includes an inference runtime and model optimizations that deliver low latency and high throughput for production applications. The TensorRT ecosystem includes TensorRT, TensorRT-LLM, TensorRT Model Optimizer, and TensorRT Cloud. NVIDIA TensorRT-based applications perform up to 36X faster than CPU-only…
![NVIDIA TensorRT](https://cdn-ak-scissors.b.st-hatena.com/image/square/c6020c470f1d92435e26820b2446324459e041e1/height=288;version=1;width=512/https%3A%2F%2Fd29g4g2dyqv443.cloudfront.net%2Fsites%2Fdefault%2Ffiles%2Fakamai%2Ftensorrt-getting-started-og-1200x630.jpg)