LIU Zhao, YANG Fan, SI Yazhong. Research on Temporal None Padding Network Video Action Recognition Algorithm[J]. Computer Engineering and Applications, 2023, 59(1): 162-168.