Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (15): 177-186. DOI: 10.3778/j.issn.1002-8331.2204-0464

• Pattern Recognition and Artificial Intelligence •

Multi-Granularity Chinese Text Sentiment Analysis Driven by Knowledge and Data

LIU Zhongbao, WANG Yufei

  1. School of Information Engineering, Shandong Vocational and Technical University of International Studies, Rizhao, Shandong 276826, China
  2. School of Software, North University of China, Taiyuan 030051, China
  • Online: 2023-08-01 Published: 2023-08-01

Abstract: In recent years, research on Chinese text sentiment analysis has made great progress, but few studies have examined the differences between languages, the effectiveness of domain knowledge, or the requirements of downstream tasks. In view of this, and considering the particularity of Chinese text and the practical needs of sentiment analysis, this paper proposes a Chinese text sentiment analysis method, jointly driven by knowledge and data, that integrates multi-granularity semantic features such as characters, words, radicals, and parts of speech. The method deeply fuses the knowledge vectors obtained from sentiment knowledge triples through the TransE model with the feature vectors obtained by models such as the bidirectional gated recurrent unit (BiGRU) and the attention mechanism, and it adds radical features and sentiment part-of-speech features on top of character and word features. Comparative experiments on the Douban movie review dataset and the NLPECC dataset show that the proposed method effectively uses sentiment knowledge and multi-granularity features to improve Chinese sentiment recognition performance, reaching F1 scores of 89.23% and 84.84%, respectively, and thus handles the Chinese text sentiment analysis task well.

Keywords: Chinese text, knowledge graph, multi-granularity semantic features, sentiment analysis
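
The abstract names TransE as the model that turns sentiment knowledge triples into knowledge vectors. As a rough illustration of the standard TransE idea only (not the authors' implementation), the sketch below scores a triple (head, relation, tail) by the distance between head + relation and tail, and computes a margin-based ranking loss against a corrupted triple. The embedding values, the variable names, the relation label, and the margin of 1.0 are all hypothetical and chosen by hand.

```python
import math


def transe_distance(head, relation, tail):
    """L2 distance ||h + r - t||; a smaller value means a more plausible triple."""
    return math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    )


def margin_ranking_loss(pos_distance, neg_distance, margin=1.0):
    """Hinge loss: zero once the true triple is at least `margin` closer
    than the corrupted one."""
    return max(0.0, margin + pos_distance - neg_distance)


# Hypothetical 3-dimensional embeddings (assumed values, illustration only).
word = [0.2, 0.1, 0.0]           # head entity, e.g. a sentiment word
has_polarity = [0.3, 0.4, 0.5]   # relation (hypothetical label)
positive = [0.5, 0.5, 0.5]       # true tail: equals word + has_polarity
negative = [-0.5, -0.5, -0.5]    # corrupted tail

d_pos = transe_distance(word, has_polarity, positive)
d_neg = transe_distance(word, has_polarity, negative)
loss = margin_ranking_loss(d_pos, d_neg)

print(f"true triple distance:      {d_pos:.3f}")
print(f"corrupted triple distance: {d_neg:.3f}")
print(f"margin ranking loss:       {loss:.3f}")
```

In a trained model the embeddings would be learned by minimizing this loss over many true and corrupted triples. How the resulting knowledge vectors are combined with the BiGRU and attention features is not specified in the abstract, so that step is not sketched here.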