Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Paper • arXiv:1502.03167 • Published
Note: Merely adding Batch Normalization to a state-of-the-art image classification model yields a substantial speedup in training. By further increasing the learning rates, removing Dropout, and applying other modifications afforded by Batch Normalization, we reach the previous state of the art with only a small fraction of the training steps and then beat the state of the art in single-network image classification. Furthermore, by combining multiple models trained with Batch Normalization...
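The transform the note refers to can be sketched as follows. This is a minimal NumPy illustration of the training-time Batch Normalization transform described in the paper (per-feature mini-batch mean/variance normalization, followed by a learned scale `gamma` and shift `beta`); the function name and signature are illustrative, not from the paper, and inference-time running statistics are omitted for brevity.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Batch Normalization transform (training mode) for a mini-batch.

    x: (N, D) mini-batch of N examples with D features.
    gamma, beta: (D,) learned scale and shift parameters.
    """
    mu = x.mean(axis=0)                     # per-feature mini-batch mean
    var = x.var(axis=0)                     # per-feature mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)   # normalize to ~zero mean, unit variance
    return gamma * x_hat + beta             # scale and shift (restores representational power)
```

With `gamma = 1` and `beta = 0`, each feature of the output has approximately zero mean and unit variance across the batch, which is what stabilizes the input distribution of the following layer.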