Computer Engineering and Applications (计算机工程与应用) ›› 2020, Vol. 56 ›› Issue (7): 221-227. DOI: 10.3778/j.issn.1002-8331.1901-0087

• Graphics and Image Processing •

Wild Animal Video Object Detection Method Combining Multi-feature Map

CHEN Jiancu, WANG Yue, ZHU Xiaofei, LI Zhangyu, LIN Zhihang

  1. School of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, China
  • Online: 2020-04-01  Published: 2020-03-28

Abstract:

Aiming at a shortcoming of YOLOv3 in wild animal video object detection, namely that the relationship between the same region in consecutive video frames is difficult to describe, a Context-aware YOLO model is proposed. The model uses mutual information entropy to quantify the image similarity of adjacent frames, fits a correlation factor for frame fusion from the quantified result, and uses this factor to linearly and iteratively fuse the feature maps of consecutive frames. A histogram-equalization-based similarity measure is introduced to detect "shot switching" and thereby determine the critical condition for feature-map fusion. Experimental results show that, compared with YOLOv3, the Context-aware YOLO model improves the F1 score by 2.4% and the mean average precision (mAP) by 4.71%.
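The two core operations described in the abstract, quantifying adjacent-frame similarity with mutual information entropy and linearly fusing consecutive feature maps with a correlation factor, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the joint-histogram MI estimator, the bin count, and the simple weighted-average fusion (with a hypothetical weight `alpha` standing in for the fitted correlation factor, whose fitting procedure the abstract does not specify) are all assumptions.

```python
import numpy as np

def mutual_information(img_a, img_b, bins=32):
    """Estimate the mutual information (in bits) between two 8-bit
    grayscale frames from their joint gray-level histogram."""
    hist_2d, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(),
                                   bins=bins, range=[[0, 256], [0, 256]])
    pxy = hist_2d / hist_2d.sum()          # joint distribution p(x, y)
    px = pxy.sum(axis=1, keepdims=True)    # marginal p(x)
    py = pxy.sum(axis=0, keepdims=True)    # marginal p(y)
    nz = pxy > 0                           # avoid log(0) on empty cells
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

def fuse_feature_maps(prev_fmap, cur_fmap, alpha):
    """Linearly fuse the previous frame's feature map into the current
    one; alpha plays the role of the correlation factor derived from
    the frame-similarity measure."""
    return alpha * prev_fmap + (1.0 - alpha) * cur_fmap
```

Applied frame by frame, the fused map can be carried forward so that fusion is iterative: the output for frame t becomes `prev_fmap` for frame t+1. A higher mutual information between adjacent frames would justify a larger `alpha`; the exact MI-to-alpha mapping is the fitted relationship the paper describes and is not reproduced here.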

Key words: YOLOv3 model, video object detection, mutual information entropy, linear iteration, histogram equalization
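The abstract's "shot switching" criterion, used as the cutoff condition for feature-map fusion, can be sketched with a histogram-similarity test between consecutive frames. This sketch substitutes histogram intersection for the paper's histogram-equalization-based similarity (the abstract does not detail that computation), and the threshold value is illustrative, not taken from the paper.

```python
import numpy as np

def gray_histogram(img, bins=64):
    """Normalized gray-level histogram of an 8-bit grayscale frame."""
    hist, _ = np.histogram(img.ravel(), bins=bins, range=(0, 256))
    return hist / max(hist.sum(), 1)

def hist_similarity(img_a, img_b, bins=64):
    """Histogram-intersection similarity in [0, 1]; 1.0 means the two
    frames have identical gray-level distributions."""
    return float(np.minimum(gray_histogram(img_a, bins),
                            gray_histogram(img_b, bins)).sum())

def is_shot_change(prev_frame, cur_frame, threshold=0.6):
    """Flag a shot switch when similarity drops below the threshold;
    when True, the feature maps of the two frames would NOT be fused."""
    return hist_similarity(prev_frame, cur_frame) < threshold
```

In a detection loop, `is_shot_change` would gate the fusion step: across a detected cut, the previous frame's feature map is discarded rather than propagated, which is the critical condition the abstract refers to.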