Computer Engineering and Applications, 2024, Vol. 60, Issue (11): 251-257. DOI: 10.3778/j.issn.1002-8331.2304-0352

• Graphics and Image Processing •

Intelligent Classroom Face Detection Algorithm with Improved YOLOv5

ZHONG Yuan, YUAN Jiazheng, LI Hongtian, LIU Hongzhe, XU Cheng   

  1. Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101, China
  2. Institute for Brain & Cognitive Sciences, Beijing Union University, Beijing 100101, China
  3. School of Science and Technology, Beijing Open University, Beijing 100081, China
  • Online: 2024-06-01 Published: 2024-05-31

Abstract: The intelligent classroom is a popular application scenario in the field of artificial intelligence. In classroom footage, the camera is usually far from the students and mounted at an oblique angle, so faces are often small or occluded, which leads to missed and false detections. To address these problems, this paper proposes YOLOv5-SASA, an intelligent classroom face detection algorithm based on an improved YOLOv5. The algorithm consists of three parts. First, the backbone retains the CSPDarknet53 network, and a BasicRFB module is used in the final spatial pooling layer to strengthen the network’s feature extraction ability. Second, the NWD loss function is adopted to make the model more robust when detecting small targets, and the stand-alone self-attention module SASA is introduced in the head layer to handle face occlusion while reducing the model’s parameter count. Third, the improved YOLOv5 network is optimized by reducing the number of channel neurons in the intermediate layers and adjusting the learning rate, in order to avoid overfitting. Experimental results show that the proposed method outperforms the original network on the easy, medium, and hard subsets of the WiderFace validation set, reaching accuracies of 97.5%, 96.3%, and 86.5%, respectively, and therefore improves the accuracy of face detection in classroom scenarios.

Key words: smart classroom, face detection, YOLOv5, stand-alone self-attention
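
The abstract names the NWD loss as the component that improves robustness on small faces but does not give its formula. The following is a minimal sketch of the commonly cited normalized Gaussian Wasserstein distance, not the authors' implementation. It assumes boxes in (cx, cy, w, h) form, each modeled as a 2D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4). The function names `nwd` and `nwd_loss` are hypothetical, and the normalizing constant `c` is dataset-dependent; the default below is only a placeholder.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Similarity in (0, 1] between two boxes given as (cx, cy, w, h).

    c is a dataset-dependent normalizing constant; the default value is
    a placeholder assumption, not a value taken from the paper.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two box Gaussians:
    # squared center offset plus squared difference of the half extents.
    w2_squared = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    # Exponential normalization maps the distance to a similarity score.
    return math.exp(-math.sqrt(w2_squared) / c)


def nwd_loss(pred_box, target_box, c=12.8):
    """Regression loss: 0 for identical boxes, approaching 1 as they diverge."""
    return 1.0 - nwd(pred_box, target_box, c)


if __name__ == "__main__":
    # Two 4x4 boxes shifted by 5 pixels do not overlap, so their IoU is 0,
    # yet the NWD similarity stays strictly between 0 and 1.
    small_a = (10.0, 10.0, 4.0, 4.0)
    small_b = (15.0, 10.0, 4.0, 4.0)
    print(nwd(small_a, small_b), nwd_loss(small_a, small_b))
```

Because the score decays smoothly with center offset and size difference, a prediction that misses a tiny face by a few pixels still receives a graded, nonzero similarity, whereas an IoU-based loss would treat every non-overlapping prediction identically.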