Optimizing Human Abnormal Behavior Detection Method of YOLO Network

doi:10.3778/j.issn.1002-8331.2208-0061

Abstract

Abstract: In public surveillance video, background clutter interferes strongly with detection and abnormal human behavior targets vary widely in scale, which makes it difficult to further improve the precision of abnormal human behavior detection. To address these problems, this paper proposes an abnormal behavior detection method based on an improved YOLOv5 network. A masked convolutional attention module is added to the original YOLOv5 backbone. The module begins with a masked convolutional layer in which the central region of the receptive field is hidden; the network predicts the masked information, and the error associated with that prediction serves as the anomaly score. In addition, a Swin-CA module is embedded in the detection network. By learning features from adjacent layers, the model captures global information more effectively, which reduces the influence of background information on the detection results. By extracting scale features of abnormal human behavior across different backgrounds, the method also lowers the computational complexity of the whole model and improves the precision with which it localizes abnormal behavior targets. Experimental results on the UCSD-PED1, KTH and Shanghai Tech datasets show that the proposed method reaches a precision of 98.2%, 96.4% and 95.8%, respectively.

Key words: abnormal human behavior, YOLOv5, masked convolution, attention mechanism, Swin-CA module
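
The central-masking idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It rests on several assumptions: a fixed averaging kernel stands in for the weights a real network would learn, the input is a single-channel feature map, only the single central element of the receptive field is hidden, and the function names (`center_masked_kernel`, `masked_prediction`, `anomaly_score_map`) are hypothetical. Each position is predicted from its neighbours alone, and the squared prediction error is used as the anomaly score.

```python
import numpy as np


def center_masked_kernel(size: int = 3) -> np.ndarray:
    """Averaging kernel with a zero central weight (stand-in for learned weights)."""
    if size % 2 == 0 or size < 3:
        raise ValueError("size must be an odd integer >= 3")
    kernel = np.ones((size, size), dtype=np.float64)
    kernel[size // 2, size // 2] = 0.0  # hide the centre of the receptive field
    return kernel / kernel.sum()


def masked_prediction(feature_map: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Predict every position from its neighbours only (same-size output)."""
    pad = kernel.shape[0] // 2
    padded = np.pad(feature_map, pad, mode="edge")
    # windows has shape (H, W, k, k): one k-by-k neighbourhood per position
    windows = np.lib.stride_tricks.sliding_window_view(padded, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)


def anomaly_score_map(feature_map: np.ndarray, size: int = 3) -> np.ndarray:
    """Squared error between the masked prediction and the hidden value."""
    prediction = masked_prediction(feature_map, center_masked_kernel(size))
    return (prediction - feature_map) ** 2


if __name__ == "__main__":
    frame = np.zeros((7, 7))
    frame[3, 3] = 1.0  # an isolated value the surrounding context cannot explain
    scores = anomaly_score_map(frame)
    print(np.unravel_index(scores.argmax(), scores.shape))
```

In this toy setting a uniform region scores zero everywhere, while a value that its surroundings cannot predict receives the highest score. A trained module would replace the averaging kernel with learned weights and operate on multi-channel backbone features.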


ZHANG Hongmin, ZHUANG Xu, ZHENG Jingtian, FANG Xiaobing. Optimizing Human Abnormal Behavior Detection Method of YOLO Network[J]. Computer Engineering and Applications, 2023, 59(7): 242-249.

