Computer Engineering and Applications ›› 2026, Vol. 62 ›› Issue (8): 105-115. DOI: 10.3778/j.issn.1002-8331.2505-0255

• Special Topic on Object Detection •

RGE-YOLO:下采样与注意力机制融合的缺陷检测方法

汪兴1,韩志科2+,黄晓辉1,刘鹏1   

  1. 华东交通大学 信息与软件工程学院,南昌 330013
  2. 浙大城市学院 计算机与计算科学学院,杭州 310015
    + 通信作者 E-mail:hanzk@hzcu.edu.cn
  • 收稿日期:2025-05-21 修回日期:2025-09-10 在线发布日期:2026-04-15 出版日期:2026-04-15
  • Funding:
    Key Project of the Natural Science Foundation of Jiangxi Province (20242BAB26023).

RGE-YOLO: Downsampling-Attention Fusion Approach for Defect Detection

WANG Xing1, HAN Zhike2+, HUANG Xiaohui1, LIU Peng1   

  1. School of Information and Software Engineering, East China Jiaotong University, Nanchang 330013, China
  2. School of Computer and Computing Science, Hangzhou City University, Hangzhou 310015, China
    + Corresponding author E-mail: hanzk@hzcu.edu.cn
  • Received: 2025-05-21 Revised: 2025-09-10 Online: 2026-04-15 Published: 2026-04-15

摘要: 针对工业缺陷检测任务中精度与效率难以兼顾的问题,构建了一种基于YOLOv8n的改进缺陷检测算法RGE-YOLO。为减少传统卷积下采样导致的小目标特征丢失,该算法引入鲁棒特征下采样(robust feature downsampling,RFD)模块,以多尺度特征融合机制增强缺陷的表征能力。同时,为提升模型对缺陷区域的关注度,设计了轻量化注意力模块(ghost-simple attention module,GhostSAM),其融合了Ghost卷积的高效特征提取能力与无参数注意力的空间激活特性,并结合门控策略实现通道语义的自适应调节。此外,采用高效交并比(efficient intersection over union,EIoU)损失函数优化边界框回归过程。在NEU-DET钢材缺陷数据集上的实验结果表明,与基线算法YOLOv8n相比,RGE-YOLO算法的精确度、召回率和mAP@0.5分别提升了8.3、2.7和3.8个百分点,另外改进后网络的检测速度为131帧/s。实验结果表明,该算法实现了检测精度与效率的有效平衡。

关键词: 缺陷检测, YOLOv8, 注意力机制, 多尺度特征融合

Abstract: To address the trade-off between accuracy and efficiency in industrial defect detection, an improved algorithm, RGE-YOLO, is proposed based on YOLOv8n. The algorithm introduces a robust feature downsampling (RFD) module to mitigate the small-target feature loss caused by conventional convolutional downsampling, enhancing feature representation through a multi-scale fusion mechanism. Concurrently, a lightweight attention module, the ghost-simple attention module (GhostSAM), is designed to improve the model's focus on defect regions. It integrates the efficient feature extraction of Ghost convolution with the spatial activation of a parameter-free attention mechanism, and incorporates a gating strategy for adaptive channel-wise semantic adjustment. Furthermore, the bounding box regression process is optimized using the efficient intersection over union (EIoU) loss function. Experimental results on the NEU-DET steel defect dataset show that, compared to the baseline YOLOv8n, RGE-YOLO increases precision, recall, and mAP@0.5 by 8.3, 2.7, and 3.8 percentage points, respectively, while achieving a detection speed of 131 frames per second. These results confirm that the algorithm effectively balances detection accuracy and efficiency.

Key words: defect detection, YOLOv8, attention mechanism, multi-scale feature fusion
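
Illustrative note: the EIoU loss named in the abstract can be sketched for a single pair of boxes as below. This is a minimal sketch of the commonly published EIoU formulation (1 - IoU plus center-distance, width, and height penalties normalized by the smallest enclosing box), not the authors' implementation. The function name `eiou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` stabilizer are assumptions made for illustration, and the abstract does not state whether a focal or otherwise modified variant is used.

```python
def eiou_loss(pred, target, eps=1e-9):
    """EIoU loss for one box pair; boxes are (x1, y1, x2, y2) with x1 < x2, y1 < y2.

    Hypothetical helper for illustration only (not from the paper).
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Widths and heights of the predicted and target boxes.
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between box centers, normalized by the squared
    # diagonal of the enclosing box.
    dx = (px1 + px2) / 2.0 - (tx1 + tx2) / 2.0
    dy = (py1 + py2) / 2.0 - (ty1 + ty2) / 2.0
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, each normalized by the matching
    # side of the enclosing box.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term


if __name__ == "__main__":
    # A perfectly matched pair versus a shifted, differently sized prediction.
    print(eiou_loss((0, 0, 2, 2), (0, 0, 2, 2)))
    print(eiou_loss((0, 0, 2, 3), (1, 1, 3, 3)))
```

Compared with a plain IoU loss, the separate width and height terms still give a nonzero penalty when the boxes do not overlap or when one box contains the other, which is the usual motivation for using EIoU in bounding box regression.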