Deutsch-Chinesische Enzyklopädie, 德汉百科
       
Knowledge Distillation (知识蒸馏) / Model Distillation
Knowledge distillation is a model-training technique in the field of artificial intelligence. In a teacher-student arrangement, a smaller, structurally simpler model learns the knowledge held by a large, complex model that has already been trained extensively. Because the small, simple model can thereby learn quickly and efficiently what the large, complex model could only acquire through lengthy training, the technique improves model efficiency and reduces computational cost; it is therefore also called model distillation.

In machine learning, knowledge distillation or model distillation is the process of transferring knowledge from a large model to a smaller one. While large models (such as very deep neural networks or ensembles of many models) have more knowledge capacity than small models, this capacity may not be fully utilized, and evaluating a model is roughly as computationally expensive whether it uses all of that capacity or only a small part of it. Knowledge distillation transfers the knowledge of a large model to a smaller one without loss of validity. Because smaller models are less expensive to evaluate, they can be deployed on less powerful hardware, such as a mobile device.
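A widely used recipe for this transfer, popularized by Hinton and colleagues, trains the student on a weighted sum of the usual hard-label loss and a soft-target loss that matches the student's temperature-softened output distribution to the teacher's. The PyTorch sketch below illustrates the idea; the layer sizes, temperature T, weighting factor alpha, and the random batch are illustrative assumptions, not details taken from this entry.

# Minimal knowledge-distillation sketch (hypothetical teacher/student networks).
import torch
import torch.nn as nn
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Hard-label term: ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    # Soft-target term: KL divergence between temperature-softened
    # student and teacher distributions; scaling by T*T keeps its
    # gradient magnitude comparable to the hard-label term.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    return alpha * hard + (1.0 - alpha) * soft

# Illustrative models: a large, already-trained teacher and a small student.
teacher = nn.Sequential(nn.Linear(784, 1200), nn.ReLU(), nn.Linear(1200, 10))
student = nn.Sequential(nn.Linear(784, 30), nn.ReLU(), nn.Linear(30, 10))
teacher.eval()  # the teacher is frozen during distillation

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

def train_step(x, y):
    with torch.no_grad():            # no gradients flow through the teacher
        teacher_logits = teacher(x)
    student_logits = student(x)
    loss = distillation_loss(student_logits, teacher_logits, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Random data standing in for a real training batch.
x = torch.randn(32, 784)
y = torch.randint(0, 10, (32,))
print(train_step(x, y))

Setting alpha = 1 recovers ordinary supervised training of the student, while alpha = 0 trains it purely against the teacher's soft targets; intermediate values are typical in practice.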

Model distillation is not to be confused with model compression, which describes methods to decrease the size of a large model itself, without training a new model. Model compression generally preserves the architecture and the nominal parameter count of the model, while decreasing the bits-per-parameter.
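To make the distinction concrete, the short NumPy sketch below applies one simple compression method, 8-bit affine quantization, to a single weight matrix: the architecture and parameter count stay the same, and only the bits per parameter shrink. The matrix shape and quantization scheme are illustrative assumptions, not details from this entry; distillation, by contrast, would train a new, smaller network to reproduce the large one's behavior.

# Contrast sketch: model compression via 8-bit quantization of existing weights.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(1200, 784)).astype(np.float32)  # a "large model" layer

# Model compression (quantization): same architecture, same parameter count,
# fewer bits per parameter (float32 -> uint8).
scale = (weights.max() - weights.min()) / 255.0
zero_point = weights.min()
q_weights = np.round((weights - zero_point) / scale).astype(np.uint8)
dequantized = q_weights.astype(np.float32) * scale + zero_point

print(weights.nbytes, q_weights.nbytes)     # 4x fewer bytes, identical shape
print(np.abs(weights - dequantized).max())  # small rounding error per weight

# Knowledge distillation would instead train a new, smaller layer
# (e.g. shape (30, 784)), changing the architecture and parameter count
# rather than the bits per parameter.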

Knowledge distillation has been successfully used in several applications of machine learning, such as object detection, acoustic models, and natural language processing. Recently, it has also been introduced to graph neural networks applicable to non-grid data.
