Computer Engineering and Applications, 2024, Vol. 60, Issue (16): 206-216. DOI: 10.3778/j.issn.1002-8331.2305-0318

• Graphics and Image Processing •

MRA-YOLOX for Pest Detection in Farmland Beyond Single Perception

WANG Zhongtian, ZOU Yingbo, WU Changlin, LI Xin

  1. School of Information Science and Engineering, Guilin University of Technology, Guilin, Guangxi 541006, China
  2. Guangxi Key Laboratory of Embedded Technology and Intelligent System, Guilin, Guangxi 541006, China
  • Online: 2024-08-15 Published: 2024-08-15

Abstract: Object detection is increasingly applied in agriculture, but farmland pest detection still suffers from slow detection speed and low accuracy, and predicting only the species and location of pests is not enough to meet complex engineering needs. This paper proposes MRA-YOLOX (masked autoencoders and rapid aim detection-exceeding YOLO), a high-speed, high-precision farmland pest detection model that fuses MAE with the YOLOX algorithm and additionally predicts pest state information. A dataset named TDBFP (target detection dataset be used for farmland pests) is constructed, containing nearly 40,000 images and more than 50,000 annotations; it labels the growth state, species category and location of 10 kinds of pests, so that pest information can be grasped more fully and countermeasures formulated more accurately. Firstly, the decoupled head and loss of the YOLOX model are modified to output the growth state in addition, so that the model predicts more information. Secondly, the ECA (efficient channel attention) and SA (shuffle attention) attention mechanisms are fused and inserted into the connection between the backbone and the FPN (feature pyramid networks) and into the channel stacking process of the FPN, which strengthens the ability to capture global information and rich context and yields better results than a single attention mechanism. Finally, the self-supervised decoder part of MAE is inserted into the data augmentation part of YOLOX to expand the receptive field, refine recognition granularity, and obtain a data augmentation effect beyond mixup and mosaic. Experimental results show that, when the state, class and position of the target must be perceived at the same time, MRA-YOLOX raises mAP@0.5 on the TDBFP dataset from 60.1% to 88.2% compared with the original YOLOX model, improves the average detection accuracy by 18.8 percentage points, and reaches a detection frame rate of 145 FPS, so it can be used in more complex engineering practice.

Key words: masked autoencoders (MAE), attention mechanism, YOLOX, pest detection, state detection
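
For readers unfamiliar with ECA, one of the two attention mechanisms the abstract says are fused, the following is a minimal pure-Python sketch of plain efficient channel attention: global average pooling per channel, a small 1D convolution across neighbouring channels, a sigmoid, and channel-wise rescaling. It is not the paper's fused ECA/SA module, whose design the abstract does not specify. The function names `eca_weights` and `eca`, the nested-list feature layout, and the uniform kernel taps (standing in for learned convolution weights) are all assumptions made for illustration.

```python
import math


def eca_weights(channel_means, kernel_size=3):
    """Turn per-channel means into attention weights in (0, 1).

    A 1D convolution slides across the channel axis (zero padded),
    followed by a sigmoid. The taps are uniform here; in a trained
    network they would be learned (hypothetical stand-in).
    """
    assert kernel_size % 2 == 1, "kernel_size must be odd"
    taps = [1.0 / kernel_size] * kernel_size
    pad = kernel_size // 2
    padded = [0.0] * pad + list(channel_means) + [0.0] * pad
    weights = []
    for c in range(len(channel_means)):
        z = sum(t * padded[c + i] for i, t in enumerate(taps))
        weights.append(1.0 / (1.0 + math.exp(-z)))
    return weights


def eca(feature_map, kernel_size=3):
    """Apply channel attention to a feature map.

    feature_map: list of C channels, each a list of H rows of W floats.
    Returns a new feature map of the same shape, with every channel
    multiplied by its attention weight.
    """
    # Global average pooling: one scalar per channel.
    means = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    weights = eca_weights(means, kernel_size)
    return [
        [[value * w for value in row] for row in channel]
        for channel, w in zip(feature_map, weights)
    ]


if __name__ == "__main__":
    toy = [[[1.0, 1.0]], [[2.0, 2.0]]]  # 2 channels, 1x2 each
    print(eca(toy, kernel_size=3))
```

The output keeps the input shape, so a module of this kind can in principle be placed between a backbone and an FPN without changing tensor dimensions, which is consistent with the insertion points the abstract describes.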