Computer Engineering and Applications (计算机工程与应用), 2025, Vol. 61, Issue (8): 215-225. DOI: 10.3778/j.issn.1002-8331.2312-0103

• Graphics and Image Processing •

UBA-OWDT: Novel Network of Open World Object Detection

XIE Binhong (谢斌红), TANG Biao (唐彪), ZHANG Rui (张睿)

  1. College of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China
  • Online: 2025-04-15 Published: 2025-04-15

Abstract: Open world object detection (OWOD) requires a model to detect objects of known classes while identifying unknown objects, and to incrementally learn new known classes in each subsequent training phase. OW-DETR (open-world detection transformer) suffers from low recall on unknown classes and from missed detections of dense and small objects. To address these problems, this paper proposes UBA-OWDT (UCSO, BiStrip and AFDF of open-world detection transformer), an open world object detection network. To raise unknown-class recall, unknown class scoring optimization (UCSO) fuses the generated shallow class activation maps with the aggregated class activation maps to capture fine-grained feature information, which increases the object scores of unknown classes and in turn their recall. To reduce missed small objects, double strip attention (spatial attention based on strip pooling and strip convolution, BiStrip) is applied to the input feature maps; it captures long-range dependencies, preserves precise object position information, and strengthens the representation of objects of interest. To reduce missed dense objects, adaptive feature dynamic fusion (AFDF) obtains different receptive fields according to object size and shape, dynamically allocates attention weights to focus on the important parts of each object, and fuses features from different levels. Experiments on the OWOD dataset show that unknown-class recall improves by 0.7 to 1.5 percentage points and mean average precision (mAP) improves by 0.6 to 1.2 percentage points. Compared with existing open world object detection methods, UBA-OWDT shows certain advantages on low unknown-class recall and on missed dense and small objects.

Key words: open world object detection (OWOD), adaptive feature dynamic fusion (AFDF), unknown class scoring optimization (UCSO), attention mechanism
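
The abstract names strip pooling as one ingredient of BiStrip but does not describe the module's construction. As a rough illustration of the general strip pooling idea only, the sketch below averages a single-channel feature map along full-width row strips and full-height column strips, combines the two descriptors into a sigmoid gate, and reweights the input. The function name `strip_pool_attention`, the additive combination, the sigmoid gate, and the toy input are all assumptions made for this example; the strip convolution part is omitted, and none of this is taken from the paper's implementation.

```python
import math


def strip_pool_attention(feature):
    """Reweight a 2D feature map with row and column strip averages.

    feature: list of H rows, each a list of W floats (one channel).
    Returns a new H x W map; the input is left unchanged.
    Illustrative sketch only, not the paper's BiStrip module.
    """
    height, width = len(feature), len(feature[0])

    # Horizontal strips: average each row over its full width (H values).
    row_mean = [sum(row) / width for row in feature]
    # Vertical strips: average each column over its full height (W values).
    col_mean = [sum(feature[y][x] for y in range(height)) / height
                for x in range(width)]

    out = []
    for y in range(height):
        out_row = []
        for x in range(width):
            # Broadcast both strip descriptors back to position (y, x) and
            # squash their sum into a (0, 1) gate with a sigmoid.
            gate = 1.0 / (1.0 + math.exp(-(row_mean[y] + col_mean[x])))
            out_row.append(feature[y][x] * gate)
        out.append(out_row)
    return out


# Toy 3 x 4 map with a single bright pixel standing in for a small object.
toy = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 4.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
weighted = strip_pool_attention(toy)
```

Because each strip spans an entire row or column, the gate at any position depends on responses far away along both axes, which is one plausible reading of how strip-based attention can capture long-range dependencies while keeping row and column position information.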