Robust Face Detection Using YOLOv3 Fusion Super Resolution Reconstruction

doi:10.3778/j.issn.1002-8331.2103-0052

Abstract

Face detection in complex scenes is affected by image quality, face scale, illumination, and other factors, so accurately locating small faces while avoiding missed and false detections is a very challenging task. This paper proposes a two-stage face detection model, SR-YOLOv3, which is based on YOLOv3 and integrates image super-resolution reconstruction. To address missed detections of small-scale faces, the K-means++ algorithm is used to cluster the anchor boxes, and smaller anchor boxes are set to capture small-face information. To address false detections of blurry small-scale faces, Darknet53 is used as the backbone network and an SRGAN image super-resolution reconstruction module is integrated to enhance low-resolution faces, forming a detection network with improved performance on low-resolution small faces. The SR-YOLOv3 model is trained and tested on the WIDERFACE dataset. Compared with the MTCNN, CMS-RCNN, HR, and S3FD algorithms, the proposed model achieves higher detection precision, with the most pronounced improvement on the hard subset. SR-YOLOv3 effectively uses face information to accurately detect hard-to-detect faces in complex scenes and shows good robustness.

Keywords: face detection, YOLOv3, super-resolution, convolutional neural network
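
The anchor-box clustering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract only states that K-means++ is applied to the anchor boxes; the 1 - IoU distance on (width, height) pairs, the mean-based center update, and the names `iou_wh` and `kmeanspp_anchors` are assumptions made here for illustration, following common practice for YOLO-style detectors, and are not taken from the paper.

```python
import numpy as np


def iou_wh(boxes, centers):
    """IoU between (w, h) pairs, treating all boxes as sharing one corner."""
    inter = (np.minimum(boxes[:, None, 0], centers[None, :, 0])
             * np.minimum(boxes[:, None, 1], centers[None, :, 1]))
    area_b = boxes[:, 0] * boxes[:, 1]
    area_c = centers[:, 0] * centers[:, 1]
    return inter / (area_b[:, None] + area_c[None, :] - inter)


def kmeanspp_anchors(boxes, k, iters=100, seed=0):
    """Cluster (w, h) box sizes into k anchors (assumed distance: 1 - IoU)."""
    rng = np.random.default_rng(seed)
    boxes = np.asarray(boxes, dtype=float)

    # K-means++ seeding: the first center is picked uniformly at random;
    # each later center is sampled with probability proportional to the
    # squared distance to the nearest center chosen so far.
    centers = [boxes[rng.integers(len(boxes))]]
    while len(centers) < k:
        dist = 1.0 - iou_wh(boxes, np.array(centers)).max(axis=1)
        weights = dist ** 2
        total = weights.sum()
        if total > 0:
            probs = weights / total
        else:
            # Degenerate case: every box coincides with an existing center.
            probs = np.full(len(boxes), 1.0 / len(boxes))
        centers.append(boxes[rng.choice(len(boxes), p=probs)])
    centers = np.array(centers)

    # Standard Lloyd iterations: assign each box to the center with the
    # highest IoU, then move each center to the mean of its members.
    for _ in range(iters):
        assign = iou_wh(boxes, centers).argmax(axis=1)
        new_centers = np.array([
            boxes[assign == j].mean(axis=0) if np.any(assign == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    # Return anchors sorted from smallest to largest area.
    return centers[np.argsort(centers[:, 0] * centers[:, 1])]
```

In use, `boxes` would hold the widths and heights of the ground-truth face boxes in the training set, rescaled to the network input size. A dataset dominated by small faces should then yield smaller anchors than generic object-detection defaults, which is the effect the abstract describes.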


ZHAO Junyan, JIANG Ailian, QIANG Yan. Robust Face Detection Using YOLOv3 Fusion Super Resolution Reconstruction[J]. Computer Engineering and Applications, 2022, 58(19): 250-256.

