Computer Engineering and Applications, 2025, Vol. 61, Issue (13): 227-234. DOI: 10.3778/j.issn.1002-8331.2404-0236

• Pattern Recognition and Artificial Intelligence •

Hybrid Neural Network Sentiment Analysis Model Incorporating Character and Word Features

LI Jiaqi, YANG Huan, GAO Hui   

  1. School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
  2. Kash Institute of Electronics and Information Industry, Kash, Xinjiang 844000, China
  • Published: 2025-07-01  Online: 2025-06-30

Abstract: Chinese sentences have no explicit word delimiters, so encoding models built on word segmentation can lose semantic information inside a sentence, which complicates tasks such as sentiment analysis. To address this, a fusion model is proposed that combines character-level and word-level features. Each sentence is encoded at both levels: a Bi-GRU with an attention mechanism extracts context-aware features from the character sequence, while a CNN extracts local features between words, using convolution kernels of different sizes to capture local features at different distances. The two sets of features are then fused to obtain global feature information. On three public datasets, Weibo, CIN, and Chnsenticorp, the model reaches accuracies of 81.32%, 76.03%, and 96.28%, respectively, which is 1.02, 0.13, and 1.05 percentage points higher than the character-based MCNN-IFGS model. The results indicate that, for Chinese sentiment analysis, the fusion model outperforms models that use only character-level or only word-level features, improving performance and robustness and extracting the semantic information of text more effectively.

Key words: sentiment analysis, hybrid neural networks, character features, word features, bidirectional gated recurrent units
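
The two-branch structure described in the abstract can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: it assumes the characters and words have already been mapped to embedding vectors, uses untrained random weights, omits bias terms, and picks fusion by concatenation, a single-query attention form, and the kernel sizes (2, 3, 4) purely for illustration. All names (`GRUCell`, `bi_gru`, `attention_pool`, `conv_max_pool`, `FusionModel`) are hypothetical, and the code uses only the Python standard library, whereas a practical model would be built in a deep learning framework.

```python
import math
import random

def rand_matrix(rows, cols, rng):
    """Small random weight matrix (illustrative initialisation only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    m = max(scores)                                  # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

class GRUCell:
    """GRU cell without biases; each gate acts on the concatenation [x; h]."""
    def __init__(self, in_dim, hid_dim, rng):
        self.Wz = rand_matrix(hid_dim, in_dim + hid_dim, rng)   # update gate
        self.Wr = rand_matrix(hid_dim, in_dim + hid_dim, rng)   # reset gate
        self.Wh = rand_matrix(hid_dim, in_dim + hid_dim, rng)   # candidate state

    def step(self, x, h):
        z = [sigmoid(a) for a in matvec(self.Wz, x + h)]
        r = [sigmoid(a) for a in matvec(self.Wr, x + h)]
        gated = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a) for a in matvec(self.Wh, x + gated)]
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def bi_gru(xs, fwd, bwd, hid_dim):
    """Run one GRU left-to-right and one right-to-left; concatenate per step."""
    h, forward = [0.0] * hid_dim, []
    for x in xs:
        h = fwd.step(x, h)
        forward.append(h)
    h, backward = [0.0] * hid_dim, []
    for x in reversed(xs):
        h = bwd.step(x, h)
        backward.append(h)
    backward.reverse()
    return [f + b for f, b in zip(forward, backward)]

def attention_pool(states, query):
    """Score each state against a query vector, then take the weighted sum."""
    scores = [sum(q * math.tanh(s) for q, s in zip(query, st)) for st in states]
    weights = softmax(scores)
    dim = len(states[0])
    return [sum(w * st[i] for w, st in zip(weights, states)) for i in range(dim)]

def conv_max_pool(xs, filters, k):
    """Width-k convolution over word vectors, ReLU, then max over time.
    Assumes the sentence has at least k words."""
    windows = [sum(xs[i:i + k], []) for i in range(len(xs) - k + 1)]
    return [max(max(0.0, sum(w * v for w, v in zip(f, win))) for win in windows)
            for f in filters]

class FusionModel:
    def __init__(self, char_dim, word_dim, hid_dim=4, n_filters=3,
                 kernel_sizes=(2, 3, 4), n_classes=2, seed=0):
        rng = random.Random(seed)
        self.hid_dim = hid_dim
        self.fwd = GRUCell(char_dim, hid_dim, rng)
        self.bwd = GRUCell(char_dim, hid_dim, rng)
        self.query = [rng.uniform(-0.1, 0.1) for _ in range(2 * hid_dim)]
        self.convs = {k: rand_matrix(n_filters, k * word_dim, rng)
                      for k in kernel_sizes}
        fused_dim = 2 * hid_dim + n_filters * len(kernel_sizes)
        self.out = rand_matrix(n_classes, fused_dim, rng)

    def forward(self, char_vecs, word_vecs):
        # Character branch: Bi-GRU states pooled by attention.
        states = bi_gru(char_vecs, self.fwd, self.bwd, self.hid_dim)
        char_feat = attention_pool(states, self.query)
        # Word branch: one max-pooled feature set per kernel size.
        word_feat = []
        for k, filters in self.convs.items():
            word_feat += conv_max_pool(word_vecs, filters, k)
        # Fusion (assumed here to be concatenation), then class probabilities.
        return softmax(matvec(self.out, char_feat + word_feat))

# Toy input: 8 characters with 5-dim embeddings, 5 words with 6-dim embeddings.
rng = random.Random(1)
chars = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(8)]
words = [[rng.uniform(-1, 1) for _ in range(6)] for _ in range(5)]
model = FusionModel(char_dim=5, word_dim=6)
probs = model.forward(chars, words)
print(probs)  # two class probabilities that sum to 1
```

Because the weights are random, the printed probabilities carry no meaning; the sketch only shows how character-level and word-level features from the same sentence can be computed separately and joined into one vector before classification.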