Computer Engineering and Applications (计算机工程与应用), 2022, Vol. 58, Issue 22: 132-141. DOI: 10.3778/j.issn.1002-8331.2104-0399

• Pattern Recognition and Artificial Intelligence •

基于在线Biterm主题模型的舆情新闻事件跟踪

马子娟,岳昆,段亮,赵天资   

  1. 云南大学 信息学院,昆明 650500
  • 出版日期:2022-11-15 发布日期:2022-11-15

Tracking Events of Public Opinion News Based on Online Biterm Topic Model

MA Zijuan, YUE Kun, DUAN Liang, ZHAO Tianzi   

  1. School of Information Science & Engineering, Yunnan University, Kunming 650500, China
  • Online: 2022-11-15 Published: 2022-11-15

摘要: 舆情新闻事件跟踪,是舆情监控、热点分析、政策制定等研究和应用的重要基础。针对舆情新闻的稀疏性、敏感性、易演化性、次生性等特点,基于在线Biterm主题模型(online Biterm topic model,OBTM),通过随机坍缩变分贝叶斯(stochastic collapsed variational Bayesian inference,SCVB0)算法更新参数,提出面向舆情新闻事件监控的主题模型MBTM(monitor Biterm topic model),利用该模型检测初期事件主题,跟踪后续新闻所属的主题。为了对存在关联关系的事件进行串联,进一步给出事件线索的概念,分别从主题层面和语义层面度量线索关联度,进而针对新闻事件主题生成事件线索。实验结果表明,MBTM模型在大多数指标上均优于OBTM等模型,验证了该方法的有效性和高效性。

关键词: 舆情新闻事件, 事件跟踪, 事件线索, 在线Biterm主题模型

Abstract: Tracking public opinion news events is an important foundation for research and applications such as public opinion monitoring, hot-topic analysis, and policy making. Public opinion news is sparse, sensitive, quick to evolve, and prone to spawning secondary events. To address these characteristics, this paper proposes the monitor Biterm topic model (MBTM) for monitoring public opinion news events. MBTM builds on the online Biterm topic model (OBTM) and updates its parameters with the stochastic collapsed variational Bayesian inference (SCVB0) algorithm. The model first detects the topics of initial events and then tracks the topics to which subsequent news belongs. To link related events, the paper further introduces the concept of event threading, measures thread correlation at both the topic level and the semantic level, and generates event threads from the topics of news events. Experimental results show that MBTM outperforms OBTM and other models on most metrics, which verifies the effectiveness and efficiency of the proposed method.

Key words: public opinion news events, event tracking, event threading, online Biterm topic model
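
The abstract does not define a biterm. In Biterm topic models generally, a biterm is an unordered pair of words that co-occur in the same short text, and the model learns topics from these pairs rather than from per-document word counts, which is what helps with sparse text. The following is a minimal sketch of biterm extraction only, assuming pre-tokenized documents short enough to treat each whole document as one context window. The function names and the sample documents are hypothetical, and the paper's own preprocessing and the SCVB0 parameter updates are not shown here.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) from one short document."""
    # Sort each pair so that (a, b) and (b, a) count as the same biterm.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def count_biterms(docs):
    """Count biterms over a collection of tokenized documents."""
    counts = Counter()
    for tokens in docs:
        counts.update(extract_biterms(tokens))
    return counts


# Hypothetical, already-tokenized news headlines.
docs = [
    ["flood", "rescue", "city"],
    ["flood", "city", "warning"],
]
biterm_counts = count_biterms(docs)
```

Pooling biterms across the whole corpus, as `count_biterms` does, reflects the general idea that word co-occurrence statistics are aggregated at corpus level instead of per document. An online variant would process such counts batch by batch as news arrives.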