Review of Research on Facial Landmark Detection

doi:10.3778/j.issn.1002-8331.2311-0387

Abstract

Abstract: With the rapid development of computer vision and related technologies, fields such as human-computer interaction, medical assistance, and security monitoring have grown quickly. Facial landmark detection, which locates facial landmarks in images or videos, is an important task in these fields and has high practical value. This paper reviews the current state of research on facial landmark detection and divides the methods into traditional methods and deep learning-based methods. It compares the principles, advantages, and disadvantages of each class of methods, introduces commonly used datasets and evaluation metrics, and comprehensively evaluates the performance of key methods on different datasets. Finally, it summarizes the application fields of facial landmark detection and discusses future development directions.

Key words: facial landmark detection, deep learning, traditional facial landmark detection

ZHANG Xiaohang, TIAN Qichuan, LIAN Lu, TAN Run. Review of Research on Facial Landmark Detection[J]. Computer Engineering and Applications, 2024, 60(12): 48-60.
