Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (8): 110-120. DOI: 10.3778/j.issn.1002-8331.2211-0468

• Pattern Recognition and Artificial Intelligence •


Improved Deeplabv3+ Crop Classification Method Based on Double Attention Fusion

GUO Jin, SONG Tingqiang, SUN Yuanyuan, GONG Chuanjiang, LIU Yalin, MA Xinglu, FAN Haisheng   

  1. College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, Shandong 266061, China
  2. School of Big Data, Qingdao University of Science and Technology, Qingdao, Shandong 266061, China
  3. Spatio-temporal Big Data Research Office, Lingnan Big Data Research Institute, Zhuhai, Guangdong 519000, China
  • Online:2024-04-15 Published:2024-04-15



Abstract: In recent years, convolutional neural networks (CNNs) have made steady progress in crop classification research, but they show limitations in modeling long-range dependencies and fall short in capturing the global characteristics of crops. To address these problems, the Transformer is introduced into the Deeplab v3+ model, and a parallel-branch structure for crop classification in drone imagery, the DeepTrans (Deeplab v3+ with Transformer) model, is proposed. DeepTrans combines the Transformer and a CNN in parallel, which facilitates the effective capture of both global and local features. The Transformer branch strengthens long-range dependencies among pixels in the image, improving the extraction of global crop information; channel and spatial attention mechanisms are added to enhance the Transformer's sensitivity to channel information and the ability of ASPP (atrous spatial pyramid pooling) to capture crop spatial information. Experimental results show that the DeepTrans model reaches an MIoU of 0.812, 3.9% higher than the Deeplab v3+ model, and its accuracy improves for all five crop classes. For sugarcane, corn and banana, three crops prone to misclassification, IoU increases by 2.9%, 4.7% and 13%, respectively. DeepTrans thus yields better segmentation in the interior filling and global prediction of crop classification maps, helping to monitor the planting structure and scale of farmland crops more accurately.
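The channel and spatial attention described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the gating functions, the stubbed branch features, and the additive fusion are all simplifying assumptions made for illustration. A channel gate reweights each feature channel from its global average (SE-style), a spatial gate reweights each location from a channel-pooled map, and the two gated branch outputs are fused by summation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x):
    # x: (C, H, W) feature map. Global average pooling per channel,
    # then a sigmoid gate reweights each channel (SE-style; illustrative only,
    # the paper's exact gating layers are not reproduced here).
    gap = x.mean(axis=(1, 2))            # (C,)
    return x * sigmoid(gap)[:, None, None]

def spatial_attention(x):
    # Pool across channels to an (H, W) map, then sigmoid-gate each location.
    pooled = x.mean(axis=0)              # (H, W)
    return x * sigmoid(pooled)[None, :, :]

def fuse(trans_feat, cnn_feat):
    # Parallel-branch fusion sketch: channel attention on the Transformer
    # branch, spatial attention on the CNN/ASPP branch, summed elementwise.
    # Additive fusion is an assumption, not the paper's stated operator.
    return channel_attention(trans_feat) + spatial_attention(cnn_feat)

# Toy usage: identical (4, 8, 8) feature maps from both (stubbed) branches.
feat = np.ones((4, 8, 8))
out = fuse(feat, feat)
print(out.shape)                         # (4, 8, 8), spatial size preserved
```

Both gates leave the feature-map shape unchanged, so the fused output can feed directly into a segmentation head at the same resolution.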

Key words: crop classification, drone image, Deeplab v3+, Transformer, attention mechanism