Adaptive Security Check Prohibited Items Detection Method with Fused Spatial Attention

doi:10.3778/j.issn.1002-8331.2302-0022

Abstract

To address the low detection accuracy, false detections, and missed detections of prohibited items in X-ray security screening, this paper proposes XPIC R-CNN, an adaptive prohibited items detection method with fused spatial attention, built on Cascade R-CNN. First, deformable convolution is introduced into ResNet50 to form the backbone network, which adaptively learns features of prohibited items of different sizes. Second, a spatially adaptive attention module is proposed that combines the spatially sparse sampling of deformable convolution with the strong inter-element relationship modeling of the self-attention mechanism, effectively suppressing noise interference from complex backgrounds. Third, a multi-scale adaptive region proposal network is proposed, which uses semantic features to guide anchor generation, improving the quality of candidate boxes and thereby the recall rate of the network. Finally, an online hard example mining training strategy is introduced into the cascade detection head to address the imbalance between positive and negative samples and the difficulty of training on small samples. Experimental results show that XPIC R-CNN achieves an average detection accuracy of 94.5% and a recall rate of 77.4% on the SIXray_PI dataset, improvements of 3.2 and 8.2 percentage points, respectively, over the original algorithm, and that the highest missed detection rate is only 10%.

Key words: prohibited items detection, Cascade R-CNN, spatially adaptive attention, deformable convolution, online hard example mining
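
The online hard example mining strategy named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `select_hard_examples` and `ohem_loss`, the `keep` parameter, and the toy loss values are assumptions made for illustration. The sketch shows only the core idea, which is to rank candidate regions by their current loss and backpropagate through the hardest ones. The original OHEM formulation also de-duplicates heavily overlapping regions before selection, a step omitted here.

```python
from typing import List, Sequence


def select_hard_examples(per_roi_loss: Sequence[float], keep: int) -> List[int]:
    """Return indices of the `keep` regions with the highest loss, hardest first.

    `per_roi_loss` holds one scalar loss per candidate region (hypothetical input).
    """
    # Sort region indices by loss in descending order; ties keep their original order.
    ranked = sorted(range(len(per_roi_loss)),
                    key=lambda i: per_roi_loss[i],
                    reverse=True)
    return ranked[:max(keep, 0)]


def ohem_loss(per_roi_loss: Sequence[float], keep: int) -> float:
    """Average the loss over the selected hard examples only."""
    hard = select_hard_examples(per_roi_loss, keep)
    if not hard:
        return 0.0
    return sum(per_roi_loss[i] for i in hard) / len(hard)


if __name__ == "__main__":
    # Toy per-region losses: regions 1 and 3 are the hardest.
    losses = [0.1, 2.0, 0.05, 1.2, 0.3]
    print(select_hard_examples(losses, keep=2))  # [1, 3]
    print(ohem_loss(losses, keep=2))
```

Because easy regions (here the many low-loss background candidates) are left out of the average, the gradient is dominated by the examples the detector currently gets wrong, which is how this kind of sampling counters positive/negative imbalance.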

YOU Xi, HOU Jin, REN Dongsheng, YANG Pengxi, DU Maosheng. Adaptive Security Check Prohibited Items Detection Method with Fused Spatial Attention[J]. Computer Engineering and Applications, 2023, 59(21): 176-186.
