***** New March 11th, 2020: Smaller BERT Models *****

This is a release of 24 smaller BERT models (English only, uncased, trained with WordPiece masking) referenced in "Well-Read Students Learn Better: On the Importance of Pre-training Compact Models".

We have shown that the standard BERT recipe (including model architecture and training objective) is effective on a wide range of model sizes, beyond