Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (7): 153-164. DOI: 10.3778/j.issn.1002-8331.2410-0482

• Special Topic on YOLO Improvements and Applications •

Precise Localization Algorithm for Inkjet Small Characters Based on Improved YOLOv10 (基于改进YOLOv10的喷码微小字符精确定位算法)

CAO Zhen (操振), YU Chaogang (余朝刚), JIN Shengjie (靳胜洁), WANG Shuaipeng (王帅鹏), ZHU Wenliang (朱文良)

  1. School of Urban Railway Transportation, Shanghai University of Engineering Science, Shanghai 201620, China
  • Online: 2025-04-01 Published: 2025-04-01

Abstract: Inkjet-printed characters on product packaging are small and have blurred features, which leads to poor detection and localization accuracy in automated production. To address this, a precise localization algorithm for small targets with blurred features, YOLO-DLW, is proposed with YOLOv10n as the baseline network. First, all strided convolution modules in the baseline are replaced with detail information extraction convolution (DIEConv) modules, which avoids the loss of detail features caused by strided convolution and improves the network's ability to extract small-target features. Second, a low-level feature fusion detection layer is introduced to reduce the small-target feature loss incurred during downsampling in the baseline. Third, the neck adopts a weighted hybrid fusion pyramid network (WHFPN) structure combined with a content-guided attention (CGA) mechanism, which improves the efficiency of information fusion across feature layers and the network's ability to extract key information. On a woven-bag small-target character localization dataset, YOLO-DLW improves precision, recall, mAP@0.5, and mAP@0.5:0.95 over the baseline by 7.3, 8.2, 3.9, and 3.3 percentage points, respectively, and effectively resolves the baseline's false detections and missed detections in character regions.

Key words: small target detection, precise localization, YOLOv10n, detail information extraction convolution, weighted hybrid fusion pyramid, content-guided attention
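
The abstract's motivation for replacing strided convolutions can be illustrated with a toy example. The sketch below is not the DIEConv module, whose internals this page does not describe. It only contrasts stride-2 sampling (the extreme case of a 1x1 kernel with stride 2) with a generic space-to-depth rearrangement, used here purely as an assumed stand-in for a downsampling step that discards no pixels. The function names and the 4x4 test image are hypothetical.

```python
def strided_sample(img, stride=2):
    """Keep every `stride`-th pixel, as a stride-2 1x1 convolution would see it."""
    return [row[::stride] for row in img[::stride]]


def space_to_depth(img, block=2):
    """Move each block x block patch into block*block half-resolution channels."""
    h, w = len(img), len(img[0])
    assert h % block == 0 and w % block == 0, "image size must divide by block"
    channels = []
    for dy in range(block):
        for dx in range(block):
            # Channel (dy, dx) holds the pixels at that offset inside every patch.
            channels.append([row[dx::block] for row in img[dy::block]])
    return channels


# A 4x4 "image" with one bright pixel standing in for a tiny character stroke.
image = [[0] * 4 for _ in range(4)]
image[1][1] = 9

sampled = strided_sample(image)   # 2x2 map, 12 of 16 pixels discarded
stacked = space_to_depth(image)   # four 2x2 maps, all 16 pixels retained

kept_by_stride = sum(v for row in sampled for v in row)
kept_by_s2d = sum(v for ch in stacked for row in ch for v in row)
print(kept_by_stride, kept_by_s2d)  # prints "0 9"
```

Both outputs have half the spatial resolution, but only the rearranged version still contains the bright pixel. A real strided 3x3 convolution overlaps its receptive fields and so loses less than this 1x1 case, which is why the example should be read as an intuition aid rather than a measurement.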