Label-Conditional Neural Topic Model for Semantic Analysis of Short Texts

doi:10.3778/j.issn.1002-8331.2206-0328

Abstract

Neural topic models, a family of unsupervised machine learning methods, are widely used to mine latent semantics from text automatically. Short texts, however, are limited in length and offer little information for inference, so a model struggles to resolve ambiguous words when context is insufficient. To address this, a label-conditional neural topic model for semantic analysis of short texts is proposed. The model adopts a variational auto-encoder architecture and injects the label information of each text into the topic distribution output by the encoder. The label acts as a semantic identifier at the topic-category level: it guides the model to filter out words that are semantically irrelevant to the current topic, condense the semantics, and identify the exact meaning of ambiguous words in the topic context, thereby steering the model toward discrete, consistent topics. Because the topic semantic distribution of short texts is significantly biased in practice, PolyLoss is introduced during training, and the imbalance in the short-text category distribution is modeled by adjusting the coefficients of its Taylor polynomial. Experimental results show that the model greatly improves the quality of short-text topic modeling, generates coherent and diverse topics, and effectively improves the performance of downstream tasks.

Keywords: neural topic models, short texts, PolyLoss
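
The PolyLoss idea mentioned in the abstract can be illustrated with its simplest variant, Poly-1. Cross-entropy expands as a Taylor series in (1 - p_t), where p_t is the predicted probability of the true class, and Poly-1 perturbs only the first coefficient, giving CE + epsilon * (1 - p_t). The sketch below is a minimal pure-Python illustration of that formula, not the authors' implementation: the function names, the example logits, and the value of `epsilon` are all assumptions, and the abstract does not say which variant or coefficient values the model uses.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def poly1_cross_entropy(logits, target, epsilon=1.0):
    """Poly-1 loss for one example: cross-entropy + epsilon * (1 - p_t).

    `epsilon` is the perturbation of the first Taylor coefficient;
    epsilon = 0 recovers plain cross-entropy. The default of 1.0 is an
    illustrative assumption, not a value taken from the paper.
    """
    p_t = softmax(logits)[target]       # probability of the true class
    ce = -math.log(p_t)                 # standard cross-entropy term
    return ce + epsilon * (1.0 - p_t)   # extra weight on the leading term

# Hypothetical logits for a three-class short-text example.
logits = [2.0, 0.5, -1.0]
loss_ce = poly1_cross_entropy(logits, target=0, epsilon=0.0)
loss_poly = poly1_cross_entropy(logits, target=0, epsilon=1.0)
```

With a positive `epsilon`, examples whose true class receives low probability are penalized more strongly than under plain cross-entropy, which is one way such a loss could be tuned toward an imbalanced category distribution.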

WANG Yuan, YAN Yanling, XU Maoling, HU Peng, ZHAO Tingting, YANG Jucheng. Label-Conditional Neural Topic Model for Semantic Analysis of Short Texts[J]. Computer Engineering and Applications, 2023, 59(11): 80-87.

王嫄, 鄢艳玲, 徐茂玲, 胡鹏, 赵婷婷, 杨巨成. 面向短文本语义分析的标签条件神经主题模型[J]. 计算机工程与应用, 2023, 59(11): 80-87.
