Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (14): 86-93. DOI: 10.3778/j.issn.1002-8331.2203-0287

• Pattern Recognition and Artificial Intelligence •

基于时空信息增强的科技论文主题趋势预测

郑长伟,薛哲,梁美玉,杜军平,寇菲菲   

  1. 北京邮电大学 计算机学院(国家示范性软件学院) 智能通信软件与多媒体北京市重点实验室,北京100876
  • Publication date: 2023-07-15 Release date: 2023-07-15

Topic Trend Prediction of Scientific Papers Based on Spatiotemporal Information Enhancement

ZHENG Changwei, XUE Zhe, LIANG Meiyu, DU Junping, KOU Feifei   

  1. Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Online: 2023-07-15 Published: 2023-07-15

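Abstract: In recent years, as society has invested more in scientific research, the number of research results in every field has grown significantly, and accurately and effectively predicting the trends of future research topics can help researchers discover future research hotspots. However, as research topics become increasingly interrelated, many of them depend on one another; treating each topic in isolation and applying traditional sequence-processing methods cannot effectively mine the spatial dependencies among them. To capture both the spatial dependencies among research topics and their changes over time, a topic trend prediction model for scientific papers based on spatiotemporal information enhancement is proposed. The model combines a graph convolutional neural network (GCN) with a temporal convolutional network (TCN). Specifically, the GCN learns spatial representations of research topics and uses spatial dependencies to strengthen the spatial features, while the TCN learns the dynamic changes in research topic trends and is optimized with a weighted loss computed according to temporal distance. The model is compared with current mainstream sequence prediction models and similar spatiotemporal models on a paper dataset and on public datasets. Experimental results show that, in the research topic prediction task and in other dynamic graph tasks, the model effectively captures spatiotemporal relationships and outperforms the latest baseline algorithms.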

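Keywords: scientific and technological data prediction, graph neural network, dynamic graph learning, dilated convolution, time series prediction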

Abstract: In recent years, as social investment in scientific research has increased, the number of research results in various fields has grown significantly. Accurately and effectively predicting the trends of future research topics can help researchers discover future research hotspots. However, research topics are becoming increasingly closely related, and many of them depend on one another. Viewing a single research topic in isolation and applying traditional sequence-processing methods cannot effectively capture the spatial dependencies among these topics. To capture the spatial dependencies among research topics and their temporal changes simultaneously, a spatiotemporal convolutional network is proposed. The network combines a graph convolutional neural network (GCN) and a temporal convolutional network (TCN). Specifically, the GCN learns spatial representations of research topics and uses spatial dependencies to strengthen spatial features, while the TCN learns the dynamics of research topic trends and is optimized with a loss weighted by temporal distance. The model is compared with current mainstream sequence prediction models and similar spatiotemporal models on paper datasets. Experimental results show that, in research topic prediction tasks, the model effectively captures spatiotemporal relationships and outperforms state-of-the-art baselines.

Key words: science data forecasting, graph neural network, dynamic graph learning, dilated convolution, time series forecasting
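
The abstract names three ingredients: a graph convolution over related topics, a dilated temporal convolution over each topic's history, and a loss weighted by temporal distance. The following is a minimal pure-Python sketch of those three pieces and is not the authors' implementation. All function names are hypothetical, the layer sizes and toy data are made up for illustration, and the exponential decay in `time_weighted_mse` is an assumption, since the abstract does not state the actual weighting scheme.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(adj_norm, features, weight):
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    out = matmul(matmul(adj_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in out]


def causal_dilated_conv(series, kernel, dilation):
    """Causal dilated 1-D convolution with implicit left zero-padding.

    output[t] = sum_k kernel[k] * series[t - k * dilation],
    so each output only depends on current and past values.
    """
    out = []
    for t in range(len(series)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * series[idx]
        out.append(acc)
    return out


def time_weighted_mse(pred, target, decay=0.5):
    """Squared error where horizon step h gets weight decay**h (assumed scheme).

    Step 0 is the nearest future step; weights are normalized to sum to 1.
    """
    weights = [decay ** h for h in range(len(pred))]
    total = sum(weights)
    return sum(w * (p - y) ** 2
               for w, p, y in zip(weights, pred, target)) / total


# Toy data (hypothetical): 3 topics in a chain graph, 4 time steps of counts.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
series = [[1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 2.0, 2.0], [4.0, 3.0, 2.0, 1.0]]
adj_norm = normalize_adjacency(adj)

# Spatial step: at every time step, mix each topic's value with its neighbours.
spatial = []
for t in range(4):
    x_t = [[series[i][t]] for i in range(3)]
    h_t = gcn_layer(adj_norm, x_t, [[1.0]])
    spatial.append([row[0] for row in h_t])

# Temporal step: run a causal dilated convolution along each topic's history.
temporal = [
    causal_dilated_conv([spatial[t][i] for t in range(4)], [0.5, 0.5], 2)
    for i in range(3)
]
```

In this sketch the spatial step runs before the temporal step, which is one plausible ordering; with `decay` below 1, errors on nearer horizon steps cost more than errors on distant ones.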