Machine Learning, Transformers, Model Compression

A Survey of Methods for Model Compression in NLP

A foray into numeric precision reduction, operation fusion, pruning, knowledge distillation, and module replacement.

by Madison May