Computer Engineering and Applications ›› 2020, Vol. 56 ›› Issue (7): 221-227. DOI: 10.3778/j.issn.1002-8331.1901-0087

Wild Animal Video Object Detection Method Combining Multi-feature Map

CHEN Jiancu, WANG Yue, ZHU Xiaofei, LI Zhangyu, LIN Zhihang   

  1. Chongqing University of Technology, School of Computer Science and Engineering, Chongqing 400054, China
  • Online: 2020-04-01 Published: 2020-03-28





YOLOv3 has difficulty describing the relationship between the same region in consecutive video frames, which limits its performance in wildlife video object detection. To address this, the Context-aware YOLO model is proposed. The model uses mutual information entropy to quantify the image similarity of adjacent frames, fits a correlation factor for frame fusion from the quantified similarity, and uses that factor to fuse the feature maps of preceding and following frames by linear iteration. A histogram-equalization-based similarity measure is introduced to detect shot switching, which determines the critical condition for feature map fusion. Experimental results show that, compared with the YOLOv3 model, the Context-aware YOLO model improves the F1 score by 2.4% and the mean average precision (mAP) by 4.71%.

Key words: YOLOv3 model, video object detection, mutual information entropy, linear iteration, histogram equalization
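
The fusion idea summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the fitted mapping from mutual information to the correlation factor, so the normalized form in `fusion_factor`, the `max_alpha` and `shot_threshold` values, and all function names here are assumptions made for illustration. A plain histogram intersection also stands in for the paper's histogram-equalization-based similarity.

```python
import numpy as np

BINS = 32  # assumed histogram resolution for 8-bit grayscale frames


def entropy(frame):
    """Shannon entropy (in nats) of a frame's intensity histogram."""
    hist, _ = np.histogram(frame.ravel(), bins=BINS, range=(0, 256))
    p = hist[hist > 0] / hist.sum()
    return float(-np.sum(p * np.log(p)))


def mutual_information(frame_a, frame_b):
    """Mutual information (in nats) between two equally sized grayscale frames."""
    joint, _, _ = np.histogram2d(
        frame_a.ravel(), frame_b.ravel(),
        bins=BINS, range=[[0, 256], [0, 256]],
    )
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of frame_a, shape (BINS, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of frame_b, shape (1, BINS)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


def fusion_factor(frame_a, frame_b, max_alpha=0.5):
    """Hypothetical mapping: normalized mutual information scaled to [0, max_alpha]."""
    denom = min(entropy(frame_a), entropy(frame_b))
    if denom == 0.0:                      # a constant frame carries no information
        return 0.0
    ratio = mutual_information(frame_a, frame_b) / denom
    return max_alpha * float(np.clip(ratio, 0.0, 1.0))


def histogram_similarity(frame_a, frame_b):
    """Histogram intersection in [0, 1]; a stand-in for the paper's measure."""
    ha, _ = np.histogram(frame_a.ravel(), bins=BINS, range=(0, 256))
    hb, _ = np.histogram(frame_b.ravel(), bins=BINS, range=(0, 256))
    return float(np.minimum(ha / ha.sum(), hb / hb.sum()).sum())


def fuse_feature_maps(prev_fused, curr_feat, prev_frame, curr_frame,
                      shot_threshold=0.5, max_alpha=0.5):
    """Linearly blend the previous fused feature map into the current one.

    Fusion is skipped when there is no history or when the frames look like
    they belong to different shots (similarity below the assumed threshold).
    """
    if prev_fused is None:
        return curr_feat
    if histogram_similarity(prev_frame, curr_frame) < shot_threshold:
        return curr_feat
    alpha = fusion_factor(prev_frame, curr_frame, max_alpha)
    return alpha * prev_fused + (1.0 - alpha) * curr_feat
```

Calling `fuse_feature_maps` once per frame and feeding its result back in as `prev_fused` gives the iterative behavior: each fused map carries a decaying contribution from earlier frames until a shot switch resets it.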

