Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (12): 234-242. DOI: 10.3778/j.issn.1002-8331.2011-0136

• Graphics and Image Processing •

Reversible Pyramid and Balanced Attention Network for Industrial Crack Segmentation

DONG Hongyue, ZHANG Xingzhong, ZHAO Jielun   

  1. College of Software, Taiyuan University of Technology, Taiyuan 030024, China
  • Online: 2022-06-15 Published: 2022-06-15

Abstract: Current industrial crack segmentation algorithms tend to lose fine cracks and struggle to eliminate isolated noise points. To address these problems, this paper proposes the reversible pyramid and balanced attention network (RPBAN) for industrial crack segmentation. First, a reversible pyramid module introduces a feature pyramid and an improved inverted feature pyramid between the encoder and the decoder, deepening the fusion of global and detail features and thereby improving the detection of fine cracks. Second, a balanced attention module is introduced in the decoding stage, using the balanced feature as guidance information to effectively eliminate isolated noise points. Finally, Focal Loss is adopted as the loss function in the learning stage to control the weights of positive and negative samples during training, so that the model focuses more on crack samples. RPBAN is validated and tested on InsulatorCrack, a self-built dataset of cracks in porcelain insulators on power transmission and distribution lines, and on three challenging public crack datasets: CFD, CrackTree200 and AEL. Experiments show that, compared with other baseline methods, RPBAN detects fine cracks better, eliminates isolated noise points effectively, and achieves higher-precision semantic segmentation. The IoU on the four datasets reaches 61.42%, 58.36%, 64.45% and 53.44%, respectively, demonstrating the effectiveness and generality of RPBAN.

Key words: reversible pyramid and balanced attention network (RPBAN), reversible pyramid, balanced attention, semantic segmentation, industrial crack
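
The abstract names Focal Loss as the training loss and IoU as the evaluation metric without giving formulas or hyperparameters. As a rough illustration only, the sketch below uses the commonly cited binary focal loss form, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), and the usual intersection-over-union for binary masks. The function names, the flat-list input format, and the defaults `alpha=0.25` and `gamma=2.0` are assumptions for this example, not settings reported for RPBAN.

```python
import math


def binary_focal_loss(probs, targets, alpha=0.25, gamma=2.0, eps=1e-7):
    """Mean binary focal loss over flat lists of crack probabilities and 0/1 labels.

    alpha and gamma are illustrative defaults, not values taken from the paper.
    """
    probs, targets = list(probs), list(targets)
    if not probs or len(probs) != len(targets):
        raise ValueError("probs and targets must be non-empty and of equal length")
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)            # keep log() finite
        p_t = p if t == 1 else 1.0 - p             # probability of the true class
        alpha_t = alpha if t == 1 else 1.0 - alpha  # weight for positive vs negative pixels
        # (1 - p_t)^gamma shrinks the contribution of well-classified pixels
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


def binary_iou(pred_mask, true_mask):
    """Intersection over union of two flat 0/1 masks (crack = 1)."""
    pairs = list(zip(pred_mask, true_mask))
    intersection = sum(1 for p, t in pairs if p and t)
    union = sum(1 for p, t in pairs if p or t)
    # Convention chosen here: two empty masks count as a perfect match
    return intersection / union if union else 1.0


# A confidently correct crack pixel contributes less loss than a poorly predicted one
easy = binary_focal_loss([0.95], [1])
hard = binary_focal_loss([0.30], [1])
print(easy < hard)

# One shared crack pixel out of two predicted or labeled crack pixels
print(binary_iou([1, 1, 0, 0], [1, 0, 0, 0]))
```

In this form, `alpha` balances positive (crack) against negative (background) pixels, while `gamma` reduces the influence of pixels the model already classifies well, which matches the abstract's description of steering training toward crack samples.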