Computer Engineering and Applications ›› 2020, Vol. 56 ›› Issue (24): 151-156. DOI: 10.3778/j.issn.1002-8331.1909-0273

Transformer-Capsule Integrated Model for Text Classification

TANG Zhuang, WANG Zhishu, ZHOU Ai, FENG Meishan, QU Wen, LU Mingyu   

  1. College of Information Science and Technology, Dalian Maritime University, Dalian, Liaoning 116026, China
  • Online: 2020-12-15 Published: 2020-12-15

面向文本分类的transformer-capsule集成模型

唐庄,王志舒,周爱,冯美姗,屈雯,鲁明羽   

  1. 大连海事大学 信息科学技术学院,辽宁 大连 116026

Abstract:

To address the problem that shallow single-model text classification algorithms cannot adequately extract the multi-level features of a text sequence, this paper proposes a transformer-capsule integrated model, which uses a capsule network and a transformer to extract the local phrase features and the global semantic features of the text, respectively. Integrating the two yields a more comprehensive multi-level feature representation of the text sequence. In addition, to counter the interference from noisy capsules in traditional dynamic routing, this paper proposes a dynamic routing algorithm based on the attention mechanism, which assigns smaller weights to noisy capsules and thereby reduces the interfering information passed to subsequent capsules. Experiments show that this mechanism effectively improves classification performance. Experiments are conducted on four single-label datasets and one multi-label dataset (Reuters-21578) drawn from commonly used text classification corpora, with good results: the F1 value on Reuters-21578 reaches 89.4%, which is 3.6% higher than that of the Capsule-B model.

Key words: text classification, transformer, capsule network, integrated model
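
The abstract does not give the formulation of the attention-based dynamic routing, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that each lower-level capsule has a scalar attention score (the `scores` argument is hypothetical), that the scores are normalized with a softmax, and that the resulting weights scale each capsule's contribution inside otherwise standard routing-by-agreement. The names `attention_routing`, `u_hat`, and `scores` are illustrative.

```python
import math

def squash(v):
    """Capsule squashing non-linearity: keeps direction, maps length into [0, 1)."""
    sq = sum(x * x for x in v)
    if sq == 0.0:
        return [0.0 for _ in v]
    scale = math.sqrt(sq) / (1.0 + sq)
    return [scale * x for x in v]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_routing(u_hat, scores, iterations=3):
    """Routing-by-agreement with per-capsule attention weights (assumed form).

    u_hat[i][j] is the prediction vector of lower capsule i for upper capsule j.
    scores[i] is a hypothetical attention score for lower capsule i.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    alpha = softmax(scores)                    # attention weights, sum to 1
    b = [[0.0] * n_out for _ in range(n_in)]   # routing logits
    v = []
    for _ in range(iterations):
        # Coupling coefficients: softmax over upper capsules for each lower capsule.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            s = [0.0] * dim
            for i in range(n_in):
                w = alpha[i] * c[i][j]         # low-attention capsules contribute less
                for k in range(dim):
                    s[k] += w * u_hat[i][j][k]
            v.append(squash(s))
        # Agreement update: raise logits where prediction and output align.
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(p * q for p, q in zip(u_hat[i][j], v[j]))
    return v, alpha

# Two consistent capsules and one "noisy" capsule with a low attention score.
u_hat = [
    [[1.0, 0.0], [0.0, 0.1]],
    [[0.9, 0.1], [0.0, 0.2]],
    [[-1.0, 2.0], [3.0, -3.0]],
]
outputs, alpha = attention_routing(u_hat, scores=[2.0, 2.0, -4.0])
```

In this sketch the third capsule receives a much smaller attention weight than the other two, so its predictions barely influence the upper-level capsules. How the scores themselves are computed is left open here, since the abstract does not specify it.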

摘要:

针对浅层的单模型文本分类算法不能很好地提取到文本序列多层次特征的问题,提出一种transformer-capsule集成模型,分别利用胶囊网络(capsule network)和transformer来提取文本的局部短语特征和全局语义特征,通过集成的形式更全面地得到文本序列的多层次特征表示。此外,针对传统胶囊网络动态路由时存在部分噪音胶囊干扰的问题,提出基于注意力机制的动态路由算法,赋给噪音胶囊较小的权重,减少传递给后续胶囊的干扰信息,实验证明该机制能有效提高分类性能。选取文本分类通用语料库中4个单标签数据集和1个多标签Reuters-21578数据集进行实验,取得了较好的实验结果,其中在Reuters-21578上F1值相比Capsule-B模型提升了3.6%,达到了89.4%。

关键词: 文本分类, transformer, 胶囊网络, 集成模型