Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (21): 176-186. DOI: 10.3778/j.issn.1002-8331.2302-0022

• Graphics and Image Processing •

Adaptive Security Check Prohibited Items Detection Method with Fused Spatial Attention

YOU Xi, HOU Jin, REN Dongsheng, YANG Pengxi, DU Maosheng   

  1. Laboratory of Intelligent Perception and Smart Operation & Maintenance, School of Information Science and Technology, Southwest Jiaotong University, Chengdu 611756, China
  2. National Engineering Laboratory of Integrated Transportation Big Data Application Technology, Southwest Jiaotong University, Chengdu 611756, China
  3. Tangshan Institute, Southwest Jiaotong University, Tangshan, Hebei 063000, China
  • Online: 2023-11-01 Published: 2023-11-01

Abstract: To address the low accuracy, false detections, and missed detections of prohibited-item detection in X-ray security screening, this paper proposes XPIC R-CNN, an adaptive prohibited-item detection method with fused spatial attention, built on Cascade R-CNN. Firstly, deformable convolution is introduced into ResNet50, which serves as the backbone network, to adaptively learn features of prohibited items of different sizes. Secondly, a spatially adaptive attention module is proposed that combines the spatially sparse sampling of deformable convolution with the strong inter-element relationship modeling of the self-attention mechanism, effectively suppressing noise from complex backgrounds. Then, a multi-scale adaptive region proposal network is proposed, which uses semantic features to guide anchor generation and thereby improves proposal quality and the recall of the network. Finally, an online hard example mining training strategy is introduced in the cascade detector to address the imbalance between positive and negative samples and the difficulty of training with few samples. Experimental results show that XPIC R-CNN achieves an average detection accuracy of 94.5% and a recall of 77.4% on the SIXray_PI dataset, improvements of 3.2 and 8.2 percentage points, respectively, over the original algorithm, and the highest missed detection rate is only 10%.

Key words: prohibited items detection, Cascade R-CNN, spatially adaptive attention, deformable convolution, online hard example mining
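
The online hard example mining strategy named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `select_hard_examples` and `ohem_loss`, the per-candidate loss values, and the choice of `keep` are all hypothetical. It shows only the general idea of keeping the highest-loss candidates in a batch and averaging the training loss over them.

```python
from typing import List, Sequence


def select_hard_examples(losses: Sequence[float], keep: int) -> List[int]:
    """Return indices of the `keep` highest-loss samples, hardest first.

    Hypothetical helper: `losses` holds one loss value per candidate box.
    """
    if keep <= 0:
        return []
    # Rank candidate indices by loss, largest (hardest) first.
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return order[:keep]


def ohem_loss(losses: Sequence[float], keep: int) -> float:
    """Average the loss over the selected hard examples only."""
    hard = select_hard_examples(losses, keep)
    if not hard:
        return 0.0
    return sum(losses[i] for i in hard) / len(hard)


# Made-up per-candidate losses for one image (illustrative values only).
roi_losses = [0.05, 2.30, 0.10, 1.20, 0.02, 0.75]
hard_indices = select_hard_examples(roi_losses, keep=3)
batch_loss = ohem_loss(roi_losses, keep=3)
```

In a real detector the selection would run on the losses of sampled proposals at each training step, so that easy background candidates contribute nothing to the gradient.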