Computer Engineering and Applications, 2022, Vol. 58, Issue 15: 153-161. DOI: 10.3778/j.issn.1002-8331.2101-0001

• Pattern Recognition and Artificial Intelligence •

基于双分支网络联合训练的虚假新闻检测

郭铃霓,黄舰,吴兴财,杨振国,刘文印   

  1. 广东工业大学 计算机学院,广州 510006
  2. 鹏城实验室网络空间安全研究中心,广东 深圳 518000
  • Publication date: 2022-08-01    Release date: 2022-08-01

Fake News Detection Based on Joint Training Two-Branch Network

GUO Lingni, HUANG Jian, WU Xingcai, YANG Zhenguo, LIU Wenyin   

  1. School of Computer, Guangdong University of Technology, Guangzhou 510006, China
  2. Cyberspace Security Research Center, Peng Cheng Laboratory, Shenzhen, Guangdong 518000, China
  • Online: 2022-08-01    Published: 2022-08-01

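Abstract: The widespread dissemination of fake news on social media has had negative effects on society to varying degrees. To address the lack of sufficient social context information in the early detection of fake news, a fake news detection model based on the joint training of a two-branch network is proposed. The model consists of a max pooling branch (MPB) and a generalized mean pooling branch (GPB). MPB uses a convolutional neural network to extract textual features from news articles, while GPB introduces a trainable pooling layer to learn the latent semantic features of news articles. In addition, each branch measures the semantic relevance between the news title and the body text. Finally, the results of the two jointly trained branches are combined by decision fusion to judge the authenticity of the news. Experimental results show that the proposed model outperforms the baseline models in accuracy, recall, and F1-score, reaching an F1-score of 94.1%, which is 6.4 percentage points higher than the best baseline model.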

Keywords: early detection of fake news, joint training, two-branch network, semantic relevance measurement

Abstract: The widespread dissemination of fake news on social media has had a negative impact on society. The key problem in detecting fake news at an early stage is that newly published news lacks social context. To this end, a jointly trained two-branch network is proposed for detecting fake news. Specifically, the network consists of two branches, i.e., the max pooling network branch (MPB) and the generalized mean pooling network branch (GPB). MPB uses a convolutional neural network to extract textual features of news articles, and GPB introduces a trainable pooling layer to learn the latent semantic features of news articles. Meanwhile, the semantic relevance between the news title and the body text is measured in each branch. Finally, the results from the two branches are fused to judge the authenticity of the news. The experimental results show that the proposed model outperforms the baseline models in terms of accuracy, recall, and F1-score, and achieves an F1-score of 94.1%, which is 6.4 percentage points higher than the best baseline.

Key words: fake news early detection, joint learning, two-branch network, semantic correlation metric
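
The abstract names the two pooling operators but does not give their formulas. As a rough illustration of how max pooling and generalized mean pooling differ, the sketch below uses the commonly cited generalized mean form, (mean(x^p))^(1/p), over the token axis of a feature map. The function names, the exponent values, and the toy feature map are assumptions made for illustration and are not taken from the authors' implementation. In a trainable layer the exponent p would be a learned parameter; here it is a plain float.

```python
import numpy as np


def max_pool(features: np.ndarray) -> np.ndarray:
    """Max pooling over the sequence axis: (seq_len, channels) -> (channels,)."""
    return features.max(axis=0)


def gem_pool(features: np.ndarray, p: float = 3.0, eps: float = 1e-6) -> np.ndarray:
    """Generalized mean pooling over the sequence axis.

    p = 1 reduces to average pooling; a large p approaches max pooling.
    """
    # The generalized mean assumes positive activations, so clamp at eps.
    clamped = np.clip(features, eps, None)
    return np.mean(clamped ** p, axis=0) ** (1.0 / p)


# Hypothetical feature map: 4 token positions x 2 channels,
# standing in for non-negative convolution outputs.
features = np.array([
    [0.1, 2.0],
    [0.5, 1.0],
    [3.0, 0.2],
    [1.0, 0.4],
])

print("max pooling:    ", max_pool(features))
print("GeM with p = 1: ", gem_pool(features, p=1.0))
print("GeM with p = 3: ", gem_pool(features, p=3.0))
print("GeM with p = 50:", gem_pool(features, p=50.0))
```

Because the exponent interpolates between average pooling and max pooling, a branch with a learnable p can settle on a summary of the token features that differs from the fixed max pooling used in the other branch.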