Computer Engineering and Applications, 2023, Vol. 59, Issue (4): 89-96. DOI: 10.3778/j.issn.1002-8331.2108-0342

• Pattern Recognition and Artificial Intelligence •

News Vector Representation Method Based on Semantic Relevance of Title and Body

LIAN Xiaoying, XUE Yuanhai, LIU Yue, SHEN Huawei   

  1. Data Intelligence System Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  2. University of Chinese Academy of Sciences, Beijing 101408, China
  • Online: 2023-02-15 Published: 2023-02-15

Abstract: To address the long body text and complex semantics of news articles, this paper proposes a news vector representation method based on the semantic relevance of the title and the body (the NRTA model). The method uses the news title as the query and mines information that supplements the title from multiple regions of the news body. It attends to the semantics of the later part of the body as well as the earlier part, which reduces bias in understanding the news. Experiments on two real-world news recommendation datasets, MIND and Adressa, show that the method improves on the baseline methods by 0.86% to 3.95% on each evaluation metric, which verifies the importance of the semantics of the later part of the news body and further enriches the news vector representation.

Key words: news recommendation, semantic information of news body, news vector representation, attention mechanism
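
The idea of using the title as a query over several regions of the body can be illustrated with a small attention sketch. This is not the NRTA architecture, which the abstract does not specify. The equal-size region split, mean pooling, scaled dot-product scoring, and concatenation of title and body vectors are all assumptions made for illustration, and every function name below is hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def split_regions(word_vectors, num_regions):
    """Split the body into contiguous regions of near-equal size (assumed scheme)."""
    if not word_vectors or num_regions < 1:
        raise ValueError("need a non-empty body and at least one region")
    size = math.ceil(len(word_vectors) / num_regions)
    return [word_vectors[i:i + size] for i in range(0, len(word_vectors), size)]


def title_guided_body_vector(title_vec, body_word_vecs, num_regions=3):
    """Weight each body region by its relevance to the title (the query)."""
    regions = split_regions(body_word_vecs, num_regions)
    region_vecs = [mean_vector(region) for region in regions]
    scale = math.sqrt(len(title_vec))
    # Every region is scored, so later regions can contribute as much as earlier ones.
    weights = softmax([dot(title_vec, r) / scale for r in region_vecs])
    body_vec = [
        sum(w * r[d] for w, r in zip(weights, region_vecs))
        for d in range(len(title_vec))
    ]
    return body_vec, weights


def news_vector(title_vec, body_word_vecs, num_regions=3):
    """Fuse title and title-guided body vectors by concatenation (assumed fusion)."""
    body_vec, _ = title_guided_body_vector(title_vec, body_word_vecs, num_regions)
    return list(title_vec) + body_vec
```

In this sketch, a region near the end of the body receives a high weight whenever it is relevant to the title, so it is not discarded the way it would be under simple truncation of long text.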
