Computer Engineering and Applications, 2023, Vol. 59, Issue (7): 152-162. DOI: 10.3778/j.issn.1002-8331.2111-0459

• Pattern Recognition and Artificial Intelligence •

Unsupervised Segmentation Algorithm Based on Multi-Scale Feature Fusion and Novel Discriminator

HAN Zonghuan, LIU Mingguo, LI Shen, CHEN Lijia, TIAN Min, LAN Tianxiang, LIANG Qian

  1. School of Physics and Electronics, Henan University, Kaifeng, Henan 475004, China
  2. Kaifeng Pingmei New Carbon Materials Technology Co., Ltd., Kaifeng, Henan 475004, China
  • Online: 2023-04-01 Published: 2023-04-01

Abstract: Many application scenarios in the intelligent upgrading of factories require semantic segmentation. Fully supervised semantic segmentation methods, however, demand substantial manual labor for sample annotation, which motivates the study of unsupervised methods. To segment the characters stamped on graphite electrodes at a local carbon plant, an unsupervised semantic segmentation method, CycleGAN-Seg, is proposed. Drawing on cross-layer connections and atrous spatial pyramid pooling (ASPP), a novel multi-scale feature fusion generator is constructed, and an improved attention module is added to improve network performance. In addition, a new U-shaped discriminator is proposed to discriminate the reconstructed images. In semantic segmentation experiments on a dataset of characters imprinted on graphite electrode surfaces, the method reaches an MIoU of 70.81%. The segmentation quality largely meets the needs of character recognition, and the method is expected to replace fully supervised learning in this industrial scenario, saving manual annotation cost and enabling rapid training and deployment.

Key words: multi-scale feature fusion, attention module, unsupervised segmentation, surface-imprinted characters
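
The abstract reports its result as MIoU (mean intersection over union) without defining it. The sketch below shows the usual definition: for each class, the pixels where prediction and ground truth agree are divided by the pixels where either one carries that class, and the ratios are averaged over the classes that occur. The function name `mean_iou` and the toy masks are hypothetical and only for illustration; the evaluation protocol actually used in the paper (for example, which classes are counted) is not stated in the abstract.

```python
from typing import Sequence


def mean_iou(pred: Sequence[Sequence[int]],
             target: Sequence[Sequence[int]],
             num_classes: int) -> float:
    """Mean intersection-over-union over classes for two label masks.

    Both masks are 2-D grids of integer class labels with the same shape.
    Classes absent from both masks are skipped (an assumption of this sketch).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel agrees: it counts toward both intersection and union.
                intersection[p] += 1
                union[p] += 1
            else:
                # Pixel disagrees: it enlarges the union of both classes.
                union[p] += 1
                union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        raise ValueError("no labeled pixels to evaluate")
    return sum(ious) / len(ious)


# Toy 2x3 masks: 0 = background, 1 = character (illustrative values only).
toy_pred = [[0, 0, 1],
            [0, 1, 1]]
toy_target = [[0, 1, 1],
              [0, 1, 1]]
toy_score = mean_iou(toy_pred, toy_target, num_classes=2)
```

In this toy case the background IoU is 2/3 and the character IoU is 3/4, so `toy_score` is their mean, 17/24.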