Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (22): 145-153. DOI: 10.3778/j.issn.1002-8331.2307-0270

• Pattern Recognition and Artificial Intelligence •

Dual-Branch Feature Fusion Remote Sensing Building Detection Model

CHENG Jiawei, GUO Rongzuo, WU Jiancheng, ZHANG Hao   

  1. College of Computer Science, Sichuan Normal University, Chengdu 610101, China
  • Online: 2024-11-15 Published: 2024-11-14

双分支特征融合的遥感建筑物检测模型

成嘉伟,郭荣佐,吴建成,张浩   

  1. 四川师范大学 计算机科学学院,成都 610101

Abstract: To address the low detection accuracy caused by varying building sizes and blurred edges in remote sensing building images, a dual-branch parallel network model with a fused attention mechanism, TC-UNet++, is proposed. First, because convolutional neural networks are good at extracting local features but struggle to capture global information, a Transformer structure is introduced to compensate for the loss of global information. Second, to resolve the mismatch in feature dimensions and channel numbers between the two structures, a TC (Transformer to CNN) module is designed to interactively fuse local and global features at different resolutions. Finally, a coordinate attention mechanism is introduced to locate and identify buildings according to the position information of pixels in the image. Experimental results show that the intersection over union (IoU), accuracy, and overall accuracy of TC-UNet++ on the WHU dataset reach 93.1%, 95.9%, and 98.8%, respectively, demonstrating the effectiveness of the model without a significant increase in parameters.

Key words: TC-UNet++, remote sensing building images, dual-branch, coordinate attention mechanism, feature fusion

摘要: 针对遥感建筑物图像中建筑物大小不一、边缘模糊导致精度不高的问题,提出一种双分支并行融合注意力机制的网络模型TC-UNet++。针对卷积神经网络擅长提取局部特征,难以捕获全局信息的特点,引入Transformer结构以解决全局信息丢失的问题。对于两种结构的特征维度和通道数不匹配的问题,设计一种TC(Transformer to CNN)模块以交互的方式融合不同分辨率下局部与全局特征。引入坐标注意力机制,根据像素在图像中的位置信息,定位和识别建筑物。实验结果表明,TC-UNet++在WHU数据集上交互比、准确率、总精度分别达到了93.1%、95.9%、98.8%,在不显著增加参数的情况下,展现出良好的有效性。

关键词: TC-UNet++, 遥感建筑物图像, 双分支, 坐标注意力机制, 特征融合
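
The abstract reports three pixel-level scores on the WHU dataset without defining them. The sketch below uses the standard confusion-matrix definitions of intersection over union, precision, and overall accuracy for a binary building mask. It assumes that the metrics in the paper follow these common definitions, and in particular that the reported "accuracy" corresponds to precision; the abstract does not confirm either point. The function names `confusion_counts` and `building_metrics` are hypothetical and chosen only for illustration.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN over two equally sized binary masks.

    Masks are nested lists of 0/1, where 1 marks a building pixel.
    """
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # building predicted, building present
            elif p and not t:
                fp += 1          # building predicted, background present
            elif not p and t:
                fn += 1          # building missed
            else:
                tn += 1          # background correctly rejected
    return tp, fp, fn, tn


def building_metrics(pred, truth):
    """Return IoU, precision, and overall accuracy (standard definitions).

    Assumption: these match the metrics named in the abstract.
    """
    tp, fp, fn, tn = confusion_counts(pred, truth)
    total = tp + fp + fn + tn
    union = tp + fp + fn
    predicted = tp + fp
    return {
        "iou": tp / union if union else 0.0,
        "precision": tp / predicted if predicted else 0.0,
        "overall_accuracy": (tp + tn) / total if total else 0.0,
    }


# Toy 2x3 masks, made up purely for illustration (not data from the paper).
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
scores = building_metrics(pred, truth)
```

On the toy masks, two pixels are true positives, one is a false positive, one is a false negative, and two are true negatives, so IoU is 2/4 and overall accuracy is 4/6. Overall accuracy counts background pixels as well, which is one reason it is typically higher than IoU on building datasets where background dominates.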