Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (2): 244-253. DOI: 10.3778/j.issn.1002-8331.2208-0312

• Graphics and Image Processing •

UAV Visual Tracking with Lightweight Transformer

SHEN Haiyun, WANG Haichuan, HUANG Zhongyi, YU Honghao

  1. School of Electrical Information, Southwest Petroleum University, Chengdu 610500, China
  • Online: 2024-01-15 Published: 2024-01-15

Abstract: As UAVs are widely used in military and civilian fields, demand for high-precision, low-power intelligent UAV tracking systems is growing. To address the difficulty that object tracking algorithms have in balancing tracking accuracy and tracking speed in UAV tracking scenarios, a Siamese-network UAV object tracking algorithm that introduces a lightweight Transformer, named SiamLT, is proposed. The AlexNet network is improved with a Transformer so that global feature information is captured at minimal additional computational cost. For matching the target template against the search region, a binary correlation module is proposed that combines a Transformer with the depth-wise cross-correlation operation and simultaneously captures the local correlation and the global dependencies between the target template and the search region. The distance intersection over union (Distance-IoU) is introduced into the classification and regression network, and a multi-supervision strategy is used to train the network in order to obtain more accurate target locations. Experimental results on the UAV123 and UAV20L tracking benchmarks show that SiamLT outperforms mainstream object tracking algorithms and balances tracking accuracy and tracking speed more effectively.

Key words: unmanned aerial vehicle (UAV), object tracking, Transformer, Siamese network, multi-head attention
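
The abstract mentions introducing the distance intersection over union into the classification and regression network but does not state how. As a minimal sketch of the standard Distance-IoU quantity only, DIoU = IoU − ρ²/c², where ρ is the distance between the two box centers and c is the diagonal of the smallest box enclosing both, the following may help. The names `Box`, `diou`, and `diou_loss`, the corner-format box layout, and the `1 − DIoU` loss form are assumptions made for illustration; they are not taken from SiamLT.

```python
from typing import Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def diou(pred: Box, target: Box) -> float:
    """Distance-IoU: IoU minus the normalized squared distance between centers."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Overlap area (zero when the boxes are disjoint).
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h

    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centers (rho^2).
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    center_dist_sq = dx * dx + dy * dy

    # Squared diagonal of the smallest box enclosing both boxes (c^2).
    enclose_w = max(px2, tx2) - min(px1, tx1)
    enclose_h = max(py2, ty2) - min(py1, ty1)
    diag_sq = enclose_w ** 2 + enclose_h ** 2

    if diag_sq == 0:
        return iou
    return iou - center_dist_sq / diag_sq


def diou_loss(pred: Box, target: Box) -> float:
    """Common loss form (an assumption here): 1 - DIoU, which is 0 for identical boxes."""
    return 1.0 - diou(pred, target)
```

Unlike plain IoU, this quantity still changes when the predicted and target boxes do not overlap, because the center-distance term remains informative. That is one plausible reason to use it for box regression.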