Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (23): 211-218. DOI: 10.3778/j.issn.1002-8331.2006-0324

• Pattern Recognition and Artificial Intelligence •

Sequential Three-Way Sentiment Classification Combined with Ensemble Learning

WANG Qin, LIU Dun

  1. School of Economics and Management, Southwest Jiaotong University, Chengdu 610031, China
  • Online: 2021-12-01 Published: 2021-12-02

Abstract:

Sentiment classification is a long-standing research focus in natural language processing and is widely applied in scenarios such as e-commerce reviews, discussion forums, and public opinion analysis. Improving the performance of sentiment classification models remains a key problem in sentiment analysis. Ensemble learning improves overall model performance by combining several classifiers. Drawing on granular computing and three-way decisions, and on the strengths of ensemble learning, this paper constructs a multi-granularity sequential three-way decision model combined with ensemble learning. First, an N-gram language model is used to build a multi-granularity representation of the text, which forms the basis for sequential three-way sentiment classification. Second, at each granularity level, three classification algorithms are combined into an ensemble to improve classification performance at that level. Finally, the proposed method is validated experimentally on four datasets. The results show that the method both improves overall classification performance and reduces classification cost.

Key words: sentiment classification, sequential three-way decisions, multi-granularity, ensemble learning
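
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the three classifiers, the way their outputs are combined, or the decision thresholds. The sketch therefore assumes simple averaging of the three members' scores (soft voting), hypothetical thresholds `alpha` and `beta`, and toy lexicon-based scorers standing in for trained classifiers. A sample is accepted as positive when the ensemble score is at least `alpha`, accepted as negative when it is at most `beta`, and otherwise deferred to the next, finer N-gram level; the last level forces a two-way decision.

```python
from typing import Callable, List, Sequence, Set, Tuple

# A scorer maps a list of n-gram features to an estimate of P(positive).
Scorer = Callable[[List[str]], float]


def ngrams(tokens: Sequence[str], n: int) -> List[str]:
    """Return all 1-grams up to n-grams of a token sequence."""
    return [
        " ".join(tokens[i:i + k])
        for k in range(1, n + 1)
        for i in range(len(tokens) - k + 1)
    ]


def lexicon_scorer(pos: Set[str], neg: Set[str]) -> Scorer:
    """Toy stand-in for a trained classifier (an assumption, not from the paper)."""
    def score(features: List[str]) -> float:
        p = sum(f in pos for f in features)
        q = sum(f in neg for f in features)
        return 0.5 if p + q == 0 else p / (p + q)
    return score


def ensemble_probability(scorers: Sequence[Scorer], features: List[str]) -> float:
    """Soft voting: average the members' scores (assumed combination rule)."""
    return sum(s(features) for s in scorers) / len(scorers)


def sequential_three_way(
    tokens: Sequence[str],
    levels: Sequence[Tuple[int, Sequence[Scorer]]],
    alpha: float = 0.7,  # hypothetical acceptance threshold
    beta: float = 0.3,   # hypothetical rejection threshold
) -> Tuple[str, int]:
    """Return (label, index of the granularity level that made the decision)."""
    if not levels:
        raise ValueError("levels must be non-empty")
    for idx, (n, scorers) in enumerate(levels):
        p = ensemble_probability(scorers, ngrams(tokens, n))
        if idx == len(levels) - 1:
            # Finest level: no further deferral, so decide two-way.
            return ("positive" if p >= 0.5 else "negative"), idx
        if p >= alpha:
            return "positive", idx
        if p <= beta:
            return "negative", idx
        # Otherwise the sample falls in the boundary region and is deferred.
    raise AssertionError("unreachable")


# Two granularity levels, each with an ensemble of three toy scorers.
LEVELS = [
    (1, [
        lexicon_scorer({"good", "great"}, {"bad", "not"}),
        lexicon_scorer({"good", "love"}, {"hate", "not"}),
        lexicon_scorer({"great", "good"}, {"bad", "awful"}),
    ]),
    (2, [
        lexicon_scorer({"good", "very good"}, {"bad", "not good"}),
        lexicon_scorer({"good"}, {"not good", "not"}),
        lexicon_scorer({"very good"}, {"not good"}),
    ]),
]

if __name__ == "__main__":
    for text in ("great movie", "not good", "bad"):
        print(text, "->", sequential_three_way(text.split(), LEVELS))
```

In this toy setup, "great movie" and "bad" are decided at the unigram level, while "not good" lands in the boundary region and is resolved only once bigram features are available. Under these assumptions, the sketch also suggests how the sequential scheme could lower cost: only the deferred samples pay for the finer, more expensive representation.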