Algorithm for Portrait Segmentation Combined with MobileNetv2 and Attention Mechanism

doi:10.3778/j.issn.1002-8331.2106-0334

Abstract

Abstract: To address the low accuracy and efficiency of portrait segmentation, a lightweight portrait segmentation algorithm combining MobileNetv2 and an attention mechanism is proposed for segmenting half-length portraits. While retaining the encoder-decoder structure of the U-shaped network, the algorithm uses MobileNetv2 as the backbone and streamlines the upsampling process, which reduces the number of network parameters and eases transfer and training. The attention mechanism lets the network learn portrait features more effectively, and a mixed loss function helps classify the difficult pixels along portrait edges. The model takes a half-length portrait as input and outputs the corresponding image mask. The proposed algorithm is tested on the public Human_Matting and EG1800 datasets. The results show that its accuracy reaches 98.3% (Matting) and 97.8% (EG1800), higher than that of PortraitNet (96.3% (Matting) and 95.8% (EG1800)) and DeepLabv3+ (96.8% (Matting) and 96.4% (EG1800)), and that it clearly separates the target person from the background. Its IOU reaches 98.6% (Matting) and 98.2% (EG1800), and segmenting a test-set image takes about 0.015 s on average on the experimental platform. The algorithm can be used in lightweight applications and offers a new research idea for portrait segmentation.

Key words: portrait segmentation, MobileNetv2, encoder-decoder, attention mechanism, mixed loss
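
The accuracy and IOU figures quoted in the abstract are per-pixel mask metrics. The sketch below shows one common way such metrics are computed for a binary portrait mask, in plain Python. It is an illustration only: the abstract does not give the paper's exact metric definitions, so the function names (`confusion_counts`, `pixel_accuracy`, `foreground_iou`) and the choice of foreground-only IOU are assumptions rather than the authors' evaluation code.

```python
def confusion_counts(pred, target):
    """Count TP, FP, FN, TN over two equally sized binary masks.

    Masks are nested lists of 0/1, where 1 marks a portrait (foreground) pixel.
    This representation is an assumption made for the example.
    """
    tp = fp = fn = tn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif not p and t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label matches the ground truth."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def foreground_iou(pred, target):
    """Intersection over union of the predicted and true foreground regions."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    union = tp + fp + fn
    # Two empty masks agree perfectly, so treat an empty union as IOU = 1.
    return tp / union if union else 1.0


if __name__ == "__main__":
    predicted = [[0, 1, 1],
                 [0, 1, 0]]
    ground_truth = [[0, 1, 1],
                    [0, 0, 0]]
    print(pixel_accuracy(predicted, ground_truth))
    print(foreground_iou(predicted, ground_truth))
```

In this toy case one background pixel is wrongly labelled as portrait, so five of six pixels are correct and the foreground overlap is two pixels out of a three-pixel union. Other conventions, such as averaging IOU over the foreground and background classes, give different values for the same masks.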

摘要： 针对人像分割精度不高、效率不佳的问题，提出一种融合MobileNetv2和注意力机制的轻量级人像分割算法，以实现对人像半身图进行分割。在编码器-解码器的U型网络结构的基础上，通过将MobileNetv2作为骨干网络，精简上采样过程，有效地减少了网络的参数量，有助于网络的迁移和训练。融合注意力机制的网络结构可更有效地学习人像特征，同时引进混合损失函数，有利于人像边缘像素点分类。该网络结构可选用人像半身图作为输入，并输出对应的图像掩膜。在Human_Matting和EG1800公开数据集上进行了实验，结果表明该算法精度分别达98.3%（Matting）、97.8%（EG1800），相较于PortraitNet预测96.3%（Matting）、95.8%（EG1800）的准确度和DeepLabv3+网络的96.8%（Matting）、96.4%（EG1800）准确度有明显提升，可以清晰地将目标人物和背景分离开。算法IOU指标可达98.6%（Matting）、98.2%（EG1800），在实验平台上分割测试集每张图片平均时间约0.015 s，可应用于轻量化场景中，为场景人像分割提供新的理论基础和研究思路。

关键词: 人像分割, MobileNetv2, 编码器-解码器, 注意力机制, 混合损失函数

WANG Xin, WANG Meili, BIAN Dangwei. Algorithm for Portrait Segmentation Combined with MobileNetv2 and Attention Mechanism[J]. Computer Engineering and Applications, 2022, 58(7): 220-228.

王欣, 王美丽, 边党伟. 融合MobileNetv2和注意力机制的轻量级人像分割算法[J]. 计算机工程与应用, 2022, 58(7): 220-228.
