Computer Engineering and Applications, 2023, Vol. 59, Issue (23): 1-14. DOI: 10.3778/j.issn.1002-8331.2303-0254
JIN Yelei, Gulanbaier Tuerhong, Mairidan Wushouer
Online: 2023-12-01
Published: 2023-12-01
Abstract: Sentiment analysis based on multi-sensor data fusion is an active research direction in human-computer interaction. With the development of deep learning, research on sentiment analysis has shifted from traditional approaches built on single-sensor data to approaches based on multi-sensor data fusion. Starting from the definitions and development history of multi-sensor data fusion and of sentiment analysis, this survey reviews the current state of, and open challenges in, sentiment analysis under multi-sensor data fusion, and introduces the classic models and traditional methods of multi-sensor data fusion. It then summarizes the main research directions and results of sentiment analysis in China and abroad, including work based on speech, visual, textual, and physiological data. It further describes multimodal sentiment analysis methods built on multi-sensor data fusion and compares, through experiments, the sentiment classification performance of multimodal and unimodal approaches. Finally, it discusses the prospects and possible future directions of sentiment analysis under multi-sensor data fusion, including cross-lingual sentiment analysis and the further application and development of multimodal sentiment analysis.
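To make the contrast between the fusion strategies the abstract mentions concrete, the sketch below illustrates feature-level (early) fusion, which concatenates per-modality features before a single classifier, against decision-level (late) fusion, which classifies each modality separately and combines the scores. This is a minimal PyTorch illustration, not the surveyed systems' code; the three modalities (audio, vision, text), the feature dimensions, and the seven-class output are illustrative assumptions.

```python
# Minimal sketch of the two classic multimodal fusion strategies.
# All dimensions, modality names, and class counts are assumptions for illustration.
import torch
import torch.nn as nn

class FeatureLevelFusion(nn.Module):
    """Early fusion: concatenate per-modality features, then classify once."""
    def __init__(self, dims=(128, 256, 768), n_classes=7):
        super().__init__()
        self.classifier = nn.Linear(sum(dims), n_classes)

    def forward(self, audio, vision, text):
        # The joint representation exposes cross-modal interactions to one head.
        fused = torch.cat([audio, vision, text], dim=-1)
        return self.classifier(fused)

class DecisionLevelFusion(nn.Module):
    """Late fusion: one classifier per modality, weighted average of the scores."""
    def __init__(self, dims=(128, 256, 768), n_classes=7):
        super().__init__()
        self.heads = nn.ModuleList([nn.Linear(d, n_classes) for d in dims])
        self.weights = nn.Parameter(torch.ones(len(dims)))  # learnable modality weights

    def forward(self, audio, vision, text):
        logits = [h(x) for h, x in zip(self.heads, (audio, vision, text))]
        w = torch.softmax(self.weights, dim=0)
        return sum(wi * li for wi, li in zip(w, logits))

if __name__ == "__main__":
    a, v, t = torch.randn(4, 128), torch.randn(4, 256), torch.randn(4, 768)
    print(FeatureLevelFusion()(a, v, t).shape)   # torch.Size([4, 7])
    print(DecisionLevelFusion()(a, v, t).shape)  # torch.Size([4, 7])
```

Roughly, decision-level fusion degrades gracefully when one sensor stream is missing, while feature-level fusion lets a single classifier model cross-modal interactions; the tensor- and attention-based methods the survey reviews aim to combine both advantages.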
JIN Yelei, Gulanbaier Tuerhong, Mairidan Wushouer. Review of Multimodal Sensor Data Fusion in Sentiment Analysis[J]. Computer Engineering and Applications, 2023, 59(23): 1-14.