Computer Engineering and Applications, 2019, Vol. 55, Issue (13): 20-27. DOI: 10.3778/j.issn.1002-8331.1901-0246

Attention Mechanism-Based CNN-LSTM Model and Its Application

LI Mei (1, 2), NING Dejun (1), GUO Jiacheng (1, 2)

  1. Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai 200120, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China

Online: 2019-07-01; Published: 2019-07-01

Abstract: Time series data are temporally ordered, and the features of their short subsequences differ in importance. To exploit these characteristics, a neural network prediction model is proposed that combines an attention-based Convolutional Neural Network (CNN) with a Long Short-Term Memory (LSTM) network and fuses coarse- and fine-grained features to achieve accurate time series prediction. The model consists of two parts. The front end is an attention-based CNN, which adds an attention branch to a standard CNN to extract the important fine-grained features. The back end is an LSTM, which derives coarse-grained features that capture latent temporal patterns from the fine-grained features. Experiments on a real cogeneration heat load dataset show that the model outperforms the autoregressive integrated moving average (ARIMA), support vector regression (SVR), CNN, and LSTM models. Compared with the method currently used by enterprises, which takes the pre-determined (scheduled) quantity as the forecast, the Mean Absolute Scaled Error (MASE) and Root Mean Square Error (RMSE) are improved by 89.64% and 61.73%, respectively.
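
The abstract does not give the formulas behind the two reported metrics, so the following is only a minimal sketch that assumes the standard definitions: RMSE as the root of the mean squared error, and MASE as the forecast's mean absolute error divided by the in-sample mean absolute error of a one-step naive forecast. It also assumes that "improved by" means the relative reduction of the error against the baseline. The function names and the toy numbers are hypothetical and are not taken from the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mase(actual, predicted, train):
    """Mean absolute scaled error (assumed standard definition).

    The forecast MAE is scaled by the in-sample MAE of a naive forecast
    that predicts each training value by its predecessor.
    """
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    naive_mae = sum(
        abs(train[i] - train[i - 1]) for i in range(1, len(train))
    ) / (len(train) - 1)
    return mae / naive_mae


def relative_improvement(baseline_error, model_error):
    """Percentage reduction of an error metric relative to a baseline."""
    return 100.0 * (baseline_error - model_error) / baseline_error


# Hypothetical toy data, for illustration only (not from the paper).
train = [10.0, 12.0, 11.0, 13.0, 12.0]   # history used to scale MASE
actual = [14.0, 13.0, 15.0]              # observed heat load
baseline = [10.0, 17.0, 11.0]            # e.g. a scheduled quantity used as forecast
model = [13.0, 14.0, 14.0]               # e.g. a model forecast

rmse_gain = relative_improvement(rmse(actual, baseline), rmse(actual, model))
mase_gain = relative_improvement(
    mase(actual, baseline, train), mase(actual, model, train)
)
print(f"RMSE improved by {rmse_gain:.2f}%, MASE improved by {mase_gain:.2f}%")
```

Under this reading, a larger percentage means a lower error than the baseline, which is why the reported 89.64% and 61.73% are described as improvements rather than increases in error.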

Keywords: attention mechanism, Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM) network, time series, load forecasting
