Computer Engineering and Applications, 2025, Vol. 61, Issue (23): 248-263. DOI: 10.3778/j.issn.1002-8331.2506-0362

• Graphics and Image Processing •

LGM-YOLOv11: Underwater Object Detection Model Fusing Multi-Scale Attention Mechanism

CHEN Hui, YU Yongjie

  1. School of Computer Science and Engineering, Anhui University of Science & Technology, Huainan, Anhui 232001, China
  • Online: 2025-12-01 Published: 2025-12-01

Abstract: Underwater images play a crucial role in applications such as marine ecological monitoring and underwater resource development. However, they are often degraded by light scattering, suspended particles, and color attenuation, which cause low contrast, blurred edges, and noise and in turn reduce the accuracy and efficiency of underwater object detection. To address these challenges, an underwater object detection model that fuses a multi-scale attention mechanism is proposed. First, a Laplacian-of-Gaussian stem (LoGStem) replaces the first two convolutional layers of the YOLOv11 backbone, strengthening the extraction of edge and texture details from underwater images. Second, a gated activation convolution module (GSConv) is proposed and embedded in the feature pyramid network; its gating mechanism enables dynamic features for each spatial position and channel, improving the model's ability to capture details. Third, a multi-scale enhanced parallel attention module (MSEPA) is proposed and integrated into C3k2; multi-scale feature fusion and multiple attention mechanisms work together to enlarge the receptive field and strengthen the feature representation. Finally, the Shape-NWD loss function is used to improve the accuracy and stability of small-object localization. Experiments on the UTDAC, DUO, RUOD, and underwater garbage datasets show that the proposed method achieves the best detection accuracy among the compared models.

Key words: underwater object detection, multi-scale attention, YOLOv11, Shape-NWD
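
The abstract names a Shape-NWD loss for small-object localization but does not give its formula. As a rough illustration of the underlying idea only, the sketch below computes the plain normalized Gaussian Wasserstein distance (NWD) between two boxes and contrasts it with IoU. It is not the paper's Shape-NWD variant: the function names `nwd`, `nwd_loss` and `iou`, the `(cx, cy, w, h)` box format, and the scale constant `c = 12.8` are all assumptions made for this example.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Plain NWD between two boxes given as (cx, cy, w, h).

    Each box is modeled as a 2D Gaussian with mean (cx, cy) and covariance
    diag(w^2 / 4, h^2 / 4). For such Gaussians the squared 2-Wasserstein
    distance reduces to a squared Euclidean distance between the vectors
    (cx, cy, w / 2, h / 2). `c` is a dataset-dependent scale constant;
    12.8 is only an illustrative placeholder (assumption).
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    w2_squared = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    return math.exp(-math.sqrt(w2_squared) / c)


def nwd_loss(pred, target, c=12.8):
    """Loss form: 0 for identical boxes, approaching 1 as boxes diverge."""
    return 1.0 - nwd(pred, target, c)


def iou(box_a, box_b):
    """Standard IoU for (cx, cy, w, h) boxes, included only for comparison."""
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


# Two 4x4 boxes whose centers are 6 pixels apart do not overlap at all.
small_pred = (10.0, 10.0, 4.0, 4.0)
small_gt = (16.0, 10.0, 4.0, 4.0)
print(iou(small_pred, small_gt), nwd(small_pred, small_gt))
```

For the two small boxes in the usage lines, IoU is exactly zero because they do not overlap, so an IoU-based loss cannot tell how far apart they are. NWD still returns a value strictly between 0 and 1 that shrinks smoothly as the boxes move apart, which is the usual motivation for NWD-style losses on small targets.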