Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (5): 172-182. DOI: 10.3778/j.issn.1002-8331.2211-0344

• Graphics and Image Processing •

Research on Pedestrian Multi-Object Tracking Algorithm Under OMC Framework

HE Yuting (贺愉婷), CHE Jin (车进), WU Jinman (吴金蔓), MA Pengsen (马鹏森)

  1. School of Physics and Electronic-Engineering, Ningxia University, Yinchuan 750021, China
  2. Ningxia Key Laboratory of Intelligent Sensing for Desert Information, Yinchuan 750021, China
  • Online: 2024-03-01 Published: 2024-03-01

Abstract: Multi-object tracking is a widely studied direction in computer vision, but in practical applications fast target motion, illumination changes, and occlusion degrade tracking performance. This work therefore takes the multi-object tracking model OMC as its base framework and aims to improve tracking performance further. First, to address the uneven quality of target features during multi-object tracking, the feature extractor is optimized: the GAM attention mechanism is integrated into the backbone network, and the upsampling method in the Neck network is replaced. Second, to address the "competition problem" between the detection and re-identification tasks in existing methods, a recursive cross-correlation network is constructed so that the model can learn both the task-specific characteristics and the commonalities of the two tasks. The two sub-tasks are then optimized separately: a new channel attention module, HS-CAM, is designed to optimize the re-identification network, and the bounding-box regression loss of the detection branch is replaced with the EIoU loss function. Experiments on the MOT16 dataset show that MOTA reaches 73.5% and IDF1 reaches 70.4%, while MLgt is 11.7%, a reduction of 1.5 percentage points compared with the OMC algorithm.

Key words: computer vision, multi-object tracking, GAM attention mechanism, transposed convolution, EIoU loss function
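
The abstract names the EIoU loss as the replacement bounding-box regression loss but does not spell it out. As a rough illustration only, the sketch below follows the commonly published EIoU formulation (one minus IoU, plus a normalized center-distance penalty, plus separate width and height penalties measured against the smallest enclosing box). The function name `eiou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are assumptions made for this example; it is not the authors' implementation.

```python
def eiou_loss(pred, target, eps=1e-9):
    """Illustrative EIoU loss for two axis-aligned boxes (x1, y1, x2, y2).

    Hypothetical helper: the name, box format, and eps are assumptions
    for this sketch, not taken from the paper.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Widths and heights of the predicted and target boxes.
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between box centers, normalized by the squared
    # diagonal of the enclosing box.
    dx = (px1 + px2) / 2.0 - (tx1 + tx2) / 2.0
    dy = (py1 + py2) / 2.0 - (ty1 + ty2) / 2.0
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, each normalized by the matching
    # side of the enclosing box.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term
```

For identical boxes the loss is approximately zero. For boxes that overlap but differ in position or shape, each penalty term contributes separately, which lets width and height errors be penalized directly instead of only through an aspect-ratio term.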