Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (3): 127-134.DOI: 10.3778/j.issn.1002-8331.2108-0016
• Pattern Recognition and Artificial Intelligence •
LI Zhilei, LI Jun, SHI Zhiping, JIANG Na, ZHANG Yongkang
Online: 2023-02-01
Published: 2023-02-01
LI Zhilei, LI Jun, SHI Zhiping, JIANG Na, ZHANG Yongkang. Efficient 2D Temporal Modeling Network for Video Action Recognition[J]. Computer Engineering and Applications, 2023, 59(3): 127-134.
[1] LIU Zhao, YANG Fan, SI Yazhong. Research on Temporal None Padding Network Video Action Recognition Algorithm[J]. Computer Engineering and Applications, 2023, 59(1): 162-168.
[2] ZHOU Hang, LIU Yuxi, GONG Yue, KOU Fuwei, XU Guoliang. Action Recognition Algorithm Based on Dense Trajectories and Optical Flow Binarization Image[J]. Computer Engineering and Applications, 2022, 58(20): 174-180.
[3] ZHANG Fukai, HE Tiancheng. Action Recognition Combined with Lightweight OpenPose and Attention-Guided Graph Convolution[J]. Computer Engineering and Applications, 2022, 58(18): 180-187.
[4] WANG Ziru, LI Zhenmin. Transferable Dictionary Learning Fused Data Augmentation[J]. Computer Engineering and Applications, 2021, 57(23): 193-199.
[5] CHEN Yanjie, SHU Dawei, YANG Jijiang, WANG Huan, WANG Qing, LEI Yi. Review of AI Diagnosis System of Developmental Coordination Disorder[J]. Computer Engineering and Applications, 2021, 57(2): 28-36.
[6] ZHOU Xiaojing, CHEN Junhong, YANG Zhenguo, LIU Wenyin. Manipulation Action Recognition Based on Gesture Feature Fusion[J]. Computer Engineering and Applications, 2021, 57(14): 169-175.
[7] LIU Jing, YANG Xu, LIU Dongjingdian, NIU Qiang. Multi-person Smoking Action Recognition Algorithm Based on Human Joint Points[J]. Computer Engineering and Applications, 2021, 57(1): 234-241.
[8] LI Yuanxiang, XIE Linbo. Human Action Recognition Based on Depth Motion Map and Dense Trajectory[J]. Computer Engineering and Applications, 2020, 56(3): 194-200.
[9] SANG Haifeng, TIAN Qiuyang. Rapid Action Recognition System for Human-Computer Interaction[J]. Computer Engineering and Applications, 2019, 55(6): 101-107.
[10] GE Yun, JING Guodong. Human Action Recognition Based on Convolution Neural Network Combined with Multi-Scale Method[J]. Computer Engineering and Applications, 2019, 55(2): 100-103.
[11] YANG Shiqiang, LUO Xiaoyu, LI Xiaoli, YANG Jiangtao, LI Dexin. Human Action Recognition Based on DBN-HMM[J]. Computer Engineering and Applications, 2019, 55(15): 169-176.
[12] ZHAO Xiaoli, TIAN Lihua, LI Chen. Action Recognition Method Based on Sparse Coding Local Spatio-Temporal Descriptors[J]. Computer Engineering and Applications, 2018, 54(7): 29-35.
[13] GAO Dapeng, ZHU Jiangang. Atom Action Recognition by Multi-Dimensional Adaptive 3D Convolutional Neural Networks[J]. Computer Engineering and Applications, 2018, 54(4): 174-178.
[14] ZHU Dayong, GUO Xing, WU Jianguo. Action Recognition Method Using Kinect 3D Skeleton Data[J]. Computer Engineering and Applications, 2018, 54(20): 152-158.
[15] LU Tianran, YU Fengqin, YANG Huizhong, CHEN Ying. Human Action Recognition Based on Dense Trajectories with Saliency Detection[J]. Computer Engineering and Applications, 2018, 54(14): 163-167.