Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (18): 189-197. DOI: 10.3778/j.issn.1002-8331.2306-0324

• Pattern Recognition and Artificial Intelligence •

Pre-Training Model of Public Opinion Event Vector

WANG Nan, TAN Shuru, XIE Xiaolan, LI Hairong   

  1. College of Management Science and Information Engineering, Jilin University of Finance and Economics, Changchun 130117, China
  2. School of Information Science and Engineering, Guilin University of Technology, Guilin, Guangxi 541006, China
  3. School of Information Engineering, Xinjiang Institute of Technology, Aksu, Xinjiang 843100, China
  • Online: 2024-09-15  Published: 2024-09-13

舆情事件向量预训练模型

王楠,谭舒孺,谢晓兰,李海荣   

  1. 吉林财经大学 管理科学与信息工程学院,长春 130117
  2. 桂林理工大学 信息科学与工程学院,广西 桂林 541006
  3. 新疆理工学院 信息工程学院,新疆 阿克苏 843100

Abstract: In current research on public opinion prediction, event representations tend to be subjective and static, and they do not fully capture the dynamic, evolving nature of events. Many features can only be obtained by analyzing the complete course of an event's development, so the resulting prediction models cannot give early warning before a public opinion phenomenon occurs. This paper constructs an event pre-training model that automatically generates event feature vectors from comment data; these vectors are then used to train downstream public opinion reversal prediction models. By combining the subjective comments and temporal information of events, and by constructing comment words, event word vectors, event words, and event sentences, the problem of generating abstract event feature vectors is transformed into a natural language preprocessing problem. Based on the Transformer structure, a new modeling method is proposed to achieve automatic generation of event feature vectors and prediction of public opinion reversal. When the proposed model is applied to the downstream task of public opinion reversal prediction, the prediction rate for reversal events in the test set reaches 100%, achieving the goal of predicting reversals before the reversal point. The prediction model can also generate the event sentence for the next day fairly accurately: in n-fold cross-validation on the test set, only 11% of the events showed prediction errors. The work provides a data and methodological basis for studying issues related to public opinion evolution.

Key words: public opinion reversal prediction, event feature pre-training, public opinion evolution, natural language processing, Transformer
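
The abstract names comment words, event words, and event sentences but does not specify how they are built. The following is only an illustrative sketch of one possible reading, not the authors' construction: each comment is assumed to carry a discrete label (a stand-in for a "comment word"), the dominant label on a given day is taken as that day's "event word", and the day-ordered sequence of event words forms the "event sentence". The record layout, the labels, and the function name `build_event_sentences` are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical comment record: (event_id, day_index, comment_label).
# The label stands in for a "comment word"; the paper's real encoding is not given.
Comment = Tuple[str, int, str]


def build_event_sentences(comments: List[Comment]) -> Dict[str, List[str]]:
    """Turn per-comment labels into one day-ordered token sequence per event."""
    # counts[event_id][day][label] -> number of comments with that label
    counts: Dict[str, Dict[int, Dict[str, int]]] = defaultdict(
        lambda: defaultdict(lambda: defaultdict(int))
    )
    for event_id, day, label in comments:
        counts[event_id][day][label] += 1

    sentences: Dict[str, List[str]] = {}
    for event_id, days in counts.items():
        words = []
        for day in sorted(days):
            day_counts = days[day]
            # Assumption: the "event word" for a day is its most frequent label;
            # ties are broken alphabetically so the result is deterministic.
            words.append(max(sorted(day_counts), key=lambda k: day_counts[k]))
        sentences[event_id] = words
    return sentences


if __name__ == "__main__":
    demo = [
        ("e1", 0, "neg"), ("e1", 0, "neg"), ("e1", 0, "pos"),
        ("e1", 1, "pos"),
        ("e2", 0, "neu"),
    ]
    print(build_event_sentences(demo))
```

Under this reading, a shift in the dominant label from one day to the next (for example from `neg` to `pos`) is the kind of pattern a sequence model could learn to anticipate; the token sequences could then be fed to any standard sequence model, such as a Transformer encoder.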

摘要: 目前舆情预测研究中,事件表示具有一定的主观性和静态性,没有充分表达出事件演化的动态性和演化性,很多特征需要通过分析事件发展的完整过程得到,导致构建的预测模型并不能实现舆情现象发生前的预警目的。构建了事件预训练模型,实现基于评论数据的事件特征向量自动生成,并用于训练下游舆情反转预测模型。结合事件的主观评论与时序信息,通过构造评论词、事件词向量、事件词、事件句,将抽象的事件特征向量生成问题转换为自然语言预处理问题,基于Transformer结构提出了一种新的建模方式,实现事件特征向量自动生成及舆情反转预测。提出的模型用于舆情反转预测下游任务时,在测试集中对反转事件的预测率达到100%,实现了反转点之前预测出反转现象的目的。同时,该预测模型还可以较为准确地预测生成第二天的事件句,在对测试集的n折交叉验证中仅有11%的事件出现了预测误差,为研究舆情演化相关问题提供数据和方法基础。

关键词: 舆情反转预测, 事件特征预训练, 舆情演化, 自然语言处理, Transformer