Computer Engineering and Applications ›› 2019, Vol. 55 ›› Issue (19): 214-222. DOI: 10.3778/j.issn.1002-8331.1812-0352


Multimodal Pedestrian Detection Algorithm Based on Fusion Feature Pyramids

TONG Jingran, MAO Li, SUN Jun   

  1. Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi, Jiangsu 214122, China
  • Online: 2019-10-01    Published: 2019-09-30

Abstract: To address the poor performance of single-modality pedestrian detection under poor lighting, partial target occlusion, and large scale variation, this paper proposes a multimodal pedestrian detection algorithm based on the fusion of visible and infrared feature pyramids. Deep convolutional neural networks replace traditional hand-crafted features and automatically extract features from the visible and infrared images. A feature pyramid network is built on the stage-wise feature maps of ResNet (Residual Network) to generate a feature pyramid for each modality, and the two pyramids are fused layer by layer into a fused feature pyramid. Faster R-CNN is adopted as the subsequent target localization and classification algorithm to solve the multispectral pedestrian detection task. In addition, because concatenation fusion and max fusion tend to ignore weak features and cannot effectively integrate complementary features, a feature-sharpening pyramid fusion method is proposed: it highlights strong features and superimposes complementary weak features according to a threshold, making effective use of the features of each modality and further improving detection performance. Experiments show that the proposed algorithm effectively solves the multimodal pedestrian detection problem and outperforms state-of-the-art multimodal pedestrian detectors on the KAIST dataset benchmark.
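The dual-stream design described above can be illustrated with a minimal sketch. The PyTorch code below is not the authors' released implementation: the module names, the threshold value, the exact fusion rule, and the use of torchvision's resnet_fpn_backbone helper (torchvision >= 0.13 assumed) are all assumptions for illustration. It builds two ResNet-50 + FPN streams for the visible and infrared inputs and fuses the two pyramids level by level, keeping strong responses and summing weak, complementary ones.

    import torch
    import torch.nn as nn
    # Assumes torchvision >= 0.13, where resnet_fpn_backbone takes a "weights" keyword.
    from torchvision.models.detection.backbone_utils import resnet_fpn_backbone


    class ThresholdFusion(nn.Module):
        """Fuse two same-shaped feature maps: where the element-wise maximum exceeds
        the threshold, keep the stronger (sharpened) response; elsewhere, sum the two
        maps so weak but complementary features are not discarded."""

        def __init__(self, threshold: float = 0.5):
            super().__init__()
            self.threshold = threshold  # illustrative value, not taken from the paper

        def forward(self, feat_vis: torch.Tensor, feat_ir: torch.Tensor) -> torch.Tensor:
            strong = torch.maximum(feat_vis, feat_ir)   # emphasised strong features
            complementary = feat_vis + feat_ir          # superposed weak features
            mask = (strong > self.threshold).float()
            return mask * strong + (1.0 - mask) * complementary


    class DualStreamFPN(nn.Module):
        """Two ResNet-50 + FPN streams whose feature pyramids are fused level by level."""

        def __init__(self):
            super().__init__()
            self.backbone_vis = resnet_fpn_backbone(backbone_name="resnet50", weights=None)
            self.backbone_ir = resnet_fpn_backbone(backbone_name="resnet50", weights=None)
            self.fusion = ThresholdFusion(threshold=0.5)

        def forward(self, img_vis: torch.Tensor, img_ir: torch.Tensor):
            pyr_vis = self.backbone_vis(img_vis)  # dict of pyramid levels ('0'..'3', 'pool')
            pyr_ir = self.backbone_ir(img_ir)
            # Fuse the two pyramids level by level into a single fused pyramid.
            return {level: self.fusion(pyr_vis[level], pyr_ir[level]) for level in pyr_vis}


    if __name__ == "__main__":
        model = DualStreamFPN()
        vis = torch.randn(1, 3, 512, 640)  # visible image
        ir = torch.randn(1, 3, 512, 640)   # infrared image replicated to 3 channels
        fused = model(vis, ir)
        print({k: tuple(v.shape) for k, v in fused.items()})

The fused pyramid produced this way could then be passed to a standard Faster R-CNN head for localization and classification, as the abstract describes; that stage is omitted here for brevity.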

Key words: pedestrian detection, multimodal, feature pyramid, feature fusion
