Article Preview
Published by 机器学习算法与自然语言处理 (ML_NLP)

Author: Don.hub (original column author) | Affiliation: Algorithm Engineer, JD.com | Education: Imperial College London

Contents

- Overview
- Pruning
  - Why Pruning
  - Weight Pruning
  - Neuron Pruning
  - Bert Pruning
- Knowledge Distillation
  - Theory
  - DistilBert
    - Knowledge distillation
    - Architecture choice and initialization
    - Performance and Ablation
  - Bert-PKD
    - Patient Knowledge Distillation
    - Architecture choice
  - TinyBert
    - General distillation
    - Task-specific distillation
    - Ablation Studies
  - Summary
- Parameter Quantization
- Architecture Design
  - Matrix Factorization
  - Albert
    - Matrix Factorization
    - Cross-layer Parameter Sharing
    - Sentence Order Prediction (SOP)
    - Ablation Studies
    - Factors affecting model performance
- Dynamic Computation
- References
- Appendix
  - Depthwise Separable Convolution
  - Key BERT models covered in this article