Lightweight Underwater Biological Detection Model Based on Improved YOLOv8

doi:10.3778/j.issn.1002-8331.2408-0411

Abstract

Abstract: Efficient detection of underwater biological resources in complex natural environments is of great significance to China's fisheries. To address the weak detection ability and limited generalization of YOLO-series models in complex underwater environments, this paper proposes SGDC-YOLOv8, an underwater biological target detection method based on an improved YOLOv8n. First, the idea of deep supervision is integrated into the detection head: shared receptive-field attention convolution improves detection accuracy while optimizing the receptive field, and an additional supervised loss function is introduced to obtain an efficient, parameter-sharing detection head. Second, to reduce computational cost and parameter count, a lightweight gated regularization unit partial convolution module is designed. Third, because the features of underwater biological targets are easily blurred or lost, a shallow mixed-pooling downsampling module and a deep max-pooling downsampling module are proposed to optimize multi-scale feature fusion and preserve the accuracy and completeness of key information. Finally, a convolution and attention fusion module (CAFM) is added to the network to strengthen global and local feature modeling. Experiments on the public DUO dataset show that, compared with the baseline YOLOv8n, SGDC-YOLOv8 improves mAP@50 by 2.5 percentage points and mAP@50-95 by 1.8 percentage points, reduces parameter count and computation by 14.62% and 15.85% respectively, and raises FPS to 146.2. It also performs best among the mainstream object detection models compared.
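The abstract names a shallow mixed-pooling downsampling module but does not describe its internals. As a rough illustration of the general idea of mixed pooling only, the Python sketch below blends max pooling and average pooling over 2×2 windows. The function name `mixed_pool_2x2`, the mixing weight `alpha`, and the fixed 2×2 window are assumptions made here for illustration; they are not taken from the paper, whose module may combine the two branches differently.

```python
from typing import List

Matrix = List[List[float]]


def mixed_pool_2x2(feature: Matrix, alpha: float = 0.5) -> Matrix:
    """Halve a 2D feature map by blending max and average pooling.

    `alpha` is a hypothetical mixing weight (not from the paper):
    1.0 gives pure max pooling, 0.0 gives pure average pooling.
    """
    rows, cols = len(feature), len(feature[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must both be even")

    pooled: Matrix = []
    for r in range(0, rows, 2):
        pooled_row = []
        for c in range(0, cols, 2):
            # Collect the four values of the current 2x2 window.
            window = [
                feature[r][c], feature[r][c + 1],
                feature[r + 1][c], feature[r + 1][c + 1],
            ]
            strongest = max(window)      # keeps the sharpest response
            average = sum(window) / 4    # keeps the surrounding context
            pooled_row.append(alpha * strongest + (1 - alpha) * average)
        pooled.append(pooled_row)
    return pooled


feature_map = [
    [1.0, 3.0, 0.0, 0.0],
    [5.0, 7.0, 0.0, 8.0],
    [2.0, 2.0, 4.0, 4.0],
    [2.0, 2.0, 4.0, 4.0],
]

mixed = mixed_pool_2x2(feature_map)          # equal blend of both branches
max_only = mixed_pool_2x2(feature_map, 1.0)  # reduces to max pooling
avg_only = mixed_pool_2x2(feature_map, 0.0)  # reduces to average pooling
```

In the top-right window, a single strong activation (8.0) surrounded by zeros is kept in full by max pooling, diluted to 2.0 by average pooling, and lands at 5.0 in the blend. That trade-off between preserving peaks and preserving context is the usual reason for mixing the two.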

Key words: underwater target detection, YOLOv8, lightweight, deep supervision
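The mAP@50 and mAP@50-95 figures reported in the abstract rest on an intersection-over-union (IoU) test between predicted and ground-truth boxes: mAP@50 counts a prediction as a match when IoU is at least 0.5, and mAP@50-95 averages over thresholds from 0.50 to 0.95 in steps of 0.05. The minimal Python sketch below shows only that overlap test. The names `iou` and `is_true_positive` and the example boxes are hypothetical, and a full mAP computation additionally needs class matching, confidence ranking, and one-to-one assignment, none of which is shown here.

```python
from typing import Tuple

# A box is (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Width and height of the overlap, clamped at zero when disjoint.
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = overlap_w * overlap_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def is_true_positive(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """Overlap test only: does the prediction reach the IoU threshold?"""
    return iou(pred, truth) >= threshold


# Hypothetical boxes: the prediction is shifted 2 px to the right.
ground_truth: Box = (0.0, 0.0, 10.0, 10.0)
prediction: Box = (2.0, 0.0, 12.0, 10.0)

overlap = iou(prediction, ground_truth)  # 80 / 120, about 0.667
loose_match = is_true_positive(prediction, ground_truth, 0.50)
strict_match = is_true_positive(prediction, ground_truth, 0.75)
```

Here the same prediction passes at the 0.50 threshold and fails at 0.75, which is why mAP@50-95 is the stricter of the two metrics and typically reports lower values than mAP@50.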

MIN Feng, ZHANG Yuwei, LIU Yuhui, LIU Biao. Improving Lightweight Underwater Biological Detection Model of YOLOv8[J]. Computer Engineering and Applications, 2025, 61(6): 96-105.

