Target-IoU Loss: Foreground-Aware Regression Loss with Asymmetric Strategy

doi:10.3778/j.issn.1002-8331.2202-0005

Abstract

The regression loss function is an important component of object detection networks. Existing regression losses, whether L-norm losses or IoU-based losses, treat their two input bounding boxes symmetrically. As a result, they make insufficient use of foreground information, which degrades regression quality. To address this, this paper proposes an asymmetric strategy that strengthens the role of foreground information in the regression loss. Guided by this strategy, a TIoU (Target-IoU) loss is designed so that the network makes full use of the features inside the ground-truth box, bringing the regressed bounding boxes closer to the ground truth. Experiments on the PASCAL VOC dataset show that the TIoU loss improves accuracy by 0.2 percentage points under Faster R-CNN and by 0.5 percentage points under RetinaNet.

Key words: object detection, regression loss, foreground information, deep learning
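
To make the "symmetric" property mentioned in the abstract concrete, the following minimal Python sketch shows that a standard IoU loss returns the same value when the predicted box and the ground-truth box are swapped. For contrast, it also includes a hypothetical asymmetric term, `target_coverage_loss`, which normalizes the overlap by the ground-truth area only. That term is an illustrative assumption and is not the TIoU definition from the paper, which this page does not give. The box format `(x1, y1, x2, y2)` and all function names are likewise assumptions.

```python
from typing import Tuple

# Axis-aligned box as (x1, y1, x2, y2); this layout is an assumption.
Box = Tuple[float, float, float, float]


def area(box: Box) -> float:
    """Area of a box; degenerate boxes have area 0."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def intersection(a: Box, b: Box) -> float:
    """Area of the overlap between two boxes."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)


def iou_loss(pred: Box, target: Box) -> float:
    """Standard IoU loss, 1 - IoU. Swapping the arguments leaves it unchanged."""
    inter = intersection(pred, target)
    union = area(pred) + area(target) - inter
    return 1.0 - inter / union if union > 0 else 1.0


def target_coverage_loss(pred: Box, target: Box) -> float:
    """Hypothetical asymmetric term (NOT the paper's TIoU):
    1 - |pred ∩ target| / |target|, normalized by the target area only."""
    t = area(target)
    return 1.0 - intersection(pred, target) / t if t > 0 else 1.0


pred = (0.0, 0.0, 4.0, 4.0)
target = (2.0, 2.0, 4.0, 4.0)  # target lies entirely inside pred

print(iou_loss(pred, target), iou_loss(target, pred))  # 0.75 0.75
print(target_coverage_loss(pred, target))              # 0.0
print(target_coverage_loss(target, pred))              # 0.75
```

In this toy case the IoU loss cannot tell which box is the ground truth, whereas the illustrative asymmetric term changes when the roles are swapped. This is only meant to show what treating the two inputs asymmetrically can look like.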

SHAO Rong, CHEN Dongfang, WANG Xiaofeng. Target-IoU Loss: Foreground-Aware Regression Loss with Asymmetric Strategy[J]. Computer Engineering and Applications, 2023, 59(11): 112-118.

邵容, 陈东方, 王晓峰. 非对称策略下基于前景信息的TIoU回归损失计算[J]. 计算机工程与应用, 2023, 59(11): 112-118.
