Photo by Circe Denyer on PublicDomainPictures.net

Usually, when I see BatchNorm and Dropout layers in a neural network, I don't pay them much attention. I tend to think of them as simple means to speed up training and improve generalization, with no side effects once the network is in inference mode. In this post, I will show why this notion is not always correct, and may cause the neural network to