Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (17): 107-115. DOI: 10.3778/j.issn.1002-8331.2211-0397

• Pattern Recognition and Artificial Intelligence •


Text Classification Method Based on Integration of Bert and Hypergraph Convolutional Network

LI Quanxin, PANG Jun, ZHU Fengran   

  1. School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430070, China
  2. Hubei Province Key Laboratory of Intelligent Information Processing and Real-Time Industrial System, Wuhan 430070, China
  • Online: 2023-09-01  Published: 2023-09-01



Abstract: Existing graph neural network approaches to text classification typically transform the text into a graph structure and then use a graph neural network to learn representations and produce the final classification. However, these methods have two shortcomings. First, the graph structure represents word relations with binary (pairwise) connections, so it cannot capture higher-order relations in the text. Second, graph neural networks struggle to capture the rich semantic associations within the text. To address these problems, a text classification model is proposed that integrates Bert with a hypergraph convolutional network through an attention mechanism. Local semantic information in the text is obtained with the Bert model; broader word-association information is obtained by constructing a text hypergraph, and global structural features of the text are obtained through the representation learned by the hypergraph convolutional network. The attention mechanism lets the two kinds of features interact, yielding a more comprehensive and adequate text representation. Extensive experiments on four public datasets show that the model outperforms the baseline models.
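To make the two core operations the abstract describes concrete, the following is a minimal NumPy sketch of (a) a hypergraph convolution layer over a word-level incidence matrix and (b) a simple attention-based interaction between Bert-side and hypergraph-side features. The HGNN-style normalization, the toy incidence matrix `H`, and the cross-attention fusion are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def hypergraph_conv(X, H, Theta):
    # One HGNN-style hypergraph convolution (unit hyperedge weights W = I):
    #   X' = ReLU(Dv^{-1/2} H De^{-1} H^T Dv^{-1/2} X Theta)
    De_inv = np.diag(1.0 / H.sum(axis=0))             # inverse hyperedge degrees
    Dv_isqrt = np.diag(1.0 / np.sqrt(H.sum(axis=1)))  # Dv^{-1/2}, node degrees
    A = Dv_isqrt @ H @ De_inv @ H.T @ Dv_isqrt        # normalized propagation matrix
    return np.maximum(A @ X @ Theta, 0.0)             # ReLU activation

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))     # numerically stable softmax
    return e / e.sum(axis=-1, keepdims=True)

def attention_fuse(queries, keys):
    # Scaled dot-product cross-attention: one feature stream attends over the other.
    d = queries.shape[-1]
    scores = softmax(queries @ keys.T / np.sqrt(d))
    return scores @ keys

# Toy setup: 4 word nodes, 2 hyperedges (e.g., a sliding-window edge and a document edge).
H = np.array([[1, 0],
              [1, 1],
              [1, 1],
              [0, 1]], dtype=float)   # incidence matrix: H[i, j] = 1 iff node i is in hyperedge j
X = np.eye(4)                          # one-hot node features
Theta = np.full((4, 8), 0.5)           # illustrative layer weights
graph_feats = hypergraph_conv(X, H, Theta)                 # (4, 8) global structural features

bert_feats = np.random.default_rng(0).normal(size=(4, 8))  # stand-in for Bert token features
fused = attention_fuse(bert_feats, graph_feats)            # (4, 8) attention-fused representation
print(fused.shape)
```

Unlike an ordinary adjacency matrix, the incidence matrix `H` lets one hyperedge connect an arbitrary number of words at once, which is what allows the model to encode higher-order relations beyond pairwise links.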

Key words: text classification, Bert, hypergraph, neural network