Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (17): 174-180. DOI: 10.3778/j.issn.1002-8331.2101-0089

• Pattern Recognition and Artificial Intelligence •


Research on YOLOv3 Based on Knowledge Distillation

LI Jiangnan, WU Xing, LIU Jingsheng, WANG Honggang   

  1. College of Computer Science, Chongqing University, Chongqing 400000, China
  • Online: 2022-09-01 Published: 2022-09-01

Abstract: As a model compression method, knowledge distillation transfers the knowledge learned by a large network (the teacher network) to a small network (the student network), allowing the small network to approach the accuracy of the large network. Knowledge distillation has achieved good results in image classification, but it has been studied far less for object detection and still has room for improvement. Current distillation methods for object detection mainly distill the feature extraction layer, which raises two problems. First, the importance of the knowledge transferred by the teacher network is not measured. Second, because only the output of the feature extraction layer is distilled, the teacher network's knowledge is not fully passed to the student network. For the first problem, an information map is introduced as the supervision signal for distillation, strengthening the student network's learning of the teacher network's key knowledge. For the second problem, the outputs of the feature extraction layer and the feature fusion layer are distilled simultaneously, so the student network learns the transferred knowledge more fully. Experimental results show that, with YOLOv3 as the detection model, mean average precision (mAP) improves by 9.3 percentage points without changing the structure of the student network.
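To make the abstract's two ideas concrete, the following is a minimal PyTorch sketch of information-map-weighted feature distillation applied at two stages (backbone and neck). This is not the authors' released code: the exact form of the information map (channel-wise mean of absolute teacher activations, min-max normalized per sample), the weighted L2 loss, and all names and tensor shapes below are illustrative assumptions; the paper's actual formulation may differ.

import torch

def information_map(teacher_feat: torch.Tensor) -> torch.Tensor:
    # Assumed form of the information map: channel-wise mean of absolute
    # teacher activations, min-max normalized per sample so that spatially
    # salient regions receive larger distillation weights.
    attn = teacher_feat.abs().mean(dim=1, keepdim=True)        # (N, 1, H, W)
    flat = attn.flatten(1)                                     # (N, H*W)
    lo = flat.min(dim=1, keepdim=True).values
    hi = flat.max(dim=1, keepdim=True).values
    return ((flat - lo) / (hi - lo + 1e-6)).view_as(attn)

def distill_loss(student_feat: torch.Tensor, teacher_feat: torch.Tensor) -> torch.Tensor:
    # Information-map-weighted L2 loss between student and teacher features.
    # Assumes matching shapes; a 1x1 conv adapter would be needed when the
    # student's channel count differs from the teacher's (omitted here).
    teacher_feat = teacher_feat.detach()                       # no gradient to the teacher
    w = information_map(teacher_feat)
    return (w * (student_feat - teacher_feat).pow(2)).mean()

# Usage sketch: distill both the feature-extraction (backbone) output and the
# feature-fusion (neck) output, then add the result to the YOLOv3 detection loss.
t_backbone = torch.randn(2, 256, 13, 13)
s_backbone = torch.randn(2, 256, 13, 13, requires_grad=True)
t_neck = torch.randn(2, 128, 26, 26)
s_neck = torch.randn(2, 128, 26, 26, requires_grad=True)

kd_loss = distill_loss(s_backbone, t_backbone) + distill_loss(s_neck, t_neck)
kd_loss.backward()                                             # gradients flow only to the student

Distilling at both stages, rather than only after feature extraction, is what lets the student also imitate the teacher's multi-scale fused features; in training, kd_loss would be summed with the standard YOLOv3 detection loss.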

Key words: knowledge distillation, model compression, object detection, YOLOv3