In Lecture 15, guest lecturer Song Han discusses algorithms and specialized hardware for accelerating the training and inference of deep learning workloads. We discuss pruning, weight sharing, quantization, and other techniques for accelerating inference, as well as parallelization, mixed precision, and other techniques for accelerating training. We discuss specialized hardware for deep
