Lightweight YOLO-v7 for Digital Instrumentation Detection and Reading

doi:10.3778/j.issn.1002-8331.2304-0401

Abstract

Generic detection and recognition models have large parameter counts and high computational complexity, which makes them difficult to deploy directly on mobile devices. To address this difficulty, a computer-vision method for instrument detection and reading on mobile devices is investigated. To meet the detection and recognition needs of real industrial production environments, a lightweight meter detection network and a character detection and recognition network are redesigned based on YOLO-v7. Depthwise separable convolution is used to further reduce computational complexity and compress the model size. A K-means++ clustering algorithm combined with a genetic algorithm is then used to generate the initial anchor boxes automatically. Finally, channel pruning is used to compress the model once more. The experimental results demonstrate that the dedicated network design, depthwise separable convolution, and channel pruning significantly reduce both the parameter count and the computational requirements of the model. The parameter counts of both networks are reduced by 99.67% compared to the original YOLO-v7 model, and the computational cost of each is reduced to 0.3 GFLOPs, a decrease of 99.71%. The average image detection time in the experiments is 10.7 ms. The mean average precision (mAP@0.5) of the two networks reaches 99.63% and 99.53%, respectively. The overall system reading accuracy reaches 98.44%.
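As a rough illustration of why depthwise separable convolution shrinks a model, the sketch below compares the weight count of a standard k x k convolution with that of a depthwise convolution followed by a 1 x 1 pointwise convolution. The layer sizes are hypothetical values chosen for the example and are not taken from the paper; the function names are likewise illustrative.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, for illustration only (not from the paper).
c_in, c_out, k = 64, 128, 3

std = standard_conv_params(c_in, c_out, k)
dws = depthwise_separable_params(c_in, c_out, k)
ratio = dws / std  # equals 1/c_out + 1/k**2

print(std, dws, round(ratio, 4))  # 73728 8768 0.1189
```

For this single hypothetical layer the separable form keeps roughly 12% of the weights. The much larger overall reductions reported above also depend on the redesigned network and on channel pruning, which this sketch does not model.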

Key words: digital instrumentation, YOLO-v7, depthwise separable convolution, model compression, channel pruning

摘要： 由于较大的参数体量和较高的计算复杂度，通用检测及识别模型直接在移动端部署的难度较高。为解决这个困难，研究了移动设备上使用计算机视觉的仪表检测及读数方法。针对实际工业生产环境下检测及识别的需求，基于YOLO-v7重新设计了轻量化的仪表检测网络以及字符检测及识别网络。利用深度可分离卷积进一步降低计算复杂度，压缩模型大小。采用K-means++聚类算法加遗传算法自动产生初始锚框。使用通道剪枝，再一次压缩模型。实验结果证明，专用网络模型设计、深度可分离卷积以及通道剪枝对减少模型参数体量和降低算力需求具有显著效果。参数数量相较于原始YOLO-v7模型均下降了99.67%，模型算力需求均降至0.3 GFLOPs，下降了99.71%。实验中平均图片检测时间为10.7 ms。各网络的平均精准度（mAP0.5）达到了99.63%和99.53%。系统整体读数精确度达98.44%。

关键词: 数显仪表, YOLO-v7, 深度可分离卷积, 模型压缩, 通道剪枝

ZHANG Ruining, YAN Kun, YE Jin. Lightweight YOLO-v7 for Digital Instrumentation Detection and Reading[J]. Computer Engineering and Applications, 2024, 60(8): 192-201.

章芮宁, 闫坤, 叶进. 轻量化YOLO-v7的数显仪表检测及读数[J]. 计算机工程与应用, 2024, 60(8): 192-201.
