Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (12): 333-343. DOI: 10.3778/j.issn.1002-8331.2407-0368

• Engineering and Applications •

Generative Zero-Shot Steel Surface Defect Detection Integrating Diffusion Model

JI Ruirui, YANG Sifan, HUA Yuyao, GENG Yi, BAI Chenxi   

  1. School of Automation and Information Engineering, Xi’an University of Technology, Xi’an 710048, China
  • Published: 2025-06-15 Online: 2025-06-13

Abstract: Generative zero-shot object detection models struggle with steel surface defect detection in complex scenes, where they suffer from semantic confusion and low robustness. To address this, a generative zero-shot steel surface defect detection model that integrates a diffusion model is proposed. A multi-modal defect feature alignment module is designed that uses supervised contrastive learning, defect feature alignment, and semantic consistency reconstruction to align the defect features produced by the generator with the original semantic information, improving the robustness of the generative model. A defect feature denoising diffusion module is introduced that generates diverse feature representations by gradually adding and then removing noise, and then selects representative generated defect features. The generated defect features are used to update the classifier of the defect detection network, enabling zero-shot steel surface defect detection. Experiments on the NEU and GC10 steel surface defect datasets show that, under the zero-shot detection setting, detection accuracy improves over the baseline model by 11.5 and 17.4 percentage points, respectively; under the generalized zero-shot detection setting, the harmonic mean improves by 3.0 and 9.8 percentage points, respectively. These gains indicate a stronger ability to detect steel surface defects in complex scenes. Visualization results show that the model generates clearly separated features for unseen defects, which alleviates semantic confusion. Compared with current advanced zero-shot object detection models, the model also achieves higher accuracy and robustness in steel surface defect detection.

Key words: defect detection, zero-shot learning, generative model, semantic alignment, diffusion model
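
Illustrative sketch: the abstract refers to two standard quantities that a short example may help make concrete. The first is the harmonic mean reported under the generalized zero-shot setting, which is conventionally computed from the seen-class and unseen-class scores. The second is the "gradually adding noise" half of a denoising diffusion process applied to a feature vector. The Python below is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the linear noise schedule and its default values, and the example scores are all hypothetical, and the abstract does not specify which schedule or feature dimension the paper uses.

```python
from __future__ import annotations

import math
import random


def harmonic_mean(seen: float, unseen: float) -> float:
    """Harmonic mean of a seen-class score and an unseen-class score."""
    if seen + unseen == 0:
        return 0.0
    return 2.0 * seen * unseen / (seen + unseen)


def alpha_bar_schedule(num_steps: int, beta_start: float = 1e-4,
                       beta_end: float = 0.02) -> list[float]:
    """Cumulative products of (1 - beta_t) for a linear beta schedule.

    The linear schedule and its defaults are assumptions for illustration.
    """
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(feature: list[float], alpha_bar: float,
              rng: random.Random) -> list[float]:
    """Forward noising: x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps."""
    signal = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [signal * x + noise * rng.gauss(0.0, 1.0) for x in feature]


if __name__ == "__main__":
    # Hypothetical scores, not results from the paper.
    print(harmonic_mean(60.0, 30.0))

    bars = alpha_bar_schedule(1000)
    rng = random.Random(0)
    feature = [1.0, -0.5, 2.0]  # stand-in for a defect feature vector
    print(add_noise(feature, bars[-1], rng))
```

The harmonic mean is pulled toward the weaker of the two scores, so a model that does well only on seen classes cannot score highly on it. The denoising half of the process, which would require a trained network to predict the noise, is deliberately left out of this sketch.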