Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (21): 109-115. DOI: 10.3778/j.issn.1002-8331.2101-0461

• Theory, Research and Development •

Improved Mechanism of Prediction-Oriented Long Short-Term Memory Neural Network

WU Minghui (吴明慧), HOU Lingyan (侯凌燕), WANG Chao (王超)

  1. Computer Open Systems Laboratory, Beijing Information Science and Technology University, Beijing 100101, China
  2. Beijing Advanced Innovation Center for Materials Genome Engineering, Beijing 100101, China
  • Online: 2021-11-01 Published: 2021-11-04

Abstract:

Long Short-Term Memory (LSTM) neural networks, which model time-series data, can be applied to prediction problems. In real-world scenarios, the prediction accuracy of an LSTM often depends on the length of the input sequence, because useful historical information is overwhelmed by newly input data. To address this problem, a reinforcement gate is constructed inside the LSTM node to extract the forgotten information, which is then selected in proportion, fused with the memory information, and fed into the memory cell. This strengthens gradient propagation during learning and keeps the network sensitive to relatively distant information, improving its memory capacity. Experiments use industrial fault data. When the sequence length exceeds 100, the improved model with the reinforcement gate mechanism achieves lower prediction error than the other LSTM models. The gap in prediction accuracy widens as the sequence length increases: at a sequence length of 200, the prediction errors (RMSE and MAE) of the improved model are 26.98% and 35.85% lower, respectively, than those of the original model.

Key words: long short-term memory neural network, time series prediction model, memory enhancement mechanism, deep learning
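
The abstract does not give the gate equations, so the following is only a minimal sketch of one possible reading of the reinforcement gate, not the authors' formulation. It assumes a scalar LSTM cell in which the forgotten information is taken to be (1 - f) * c_prev, a hypothetical gate `r` selects part of it, and a hypothetical mixing ratio `alpha` fuses the recovered part with the new memory information. The names `GateParams`, `cell_step`, `r`, and `alpha` are illustrative assumptions.

```python
import math
from dataclasses import dataclass


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


@dataclass
class GateParams:
    """Weights of one scalar gate: z = w_x * x + w_h * h_prev + b."""
    w_x: float
    w_h: float
    b: float

    def pre(self, x: float, h_prev: float) -> float:
        return self.w_x * x + self.w_h * h_prev + self.b


def cell_step(x, h_prev, c_prev, p, alpha=0.5):
    """One step of a scalar LSTM cell with an assumed reinforcement gate.

    p maps gate names ("f", "i", "o", "g", "r") to GateParams.
    alpha is an assumed mixing ratio; alpha = 0 gives the standard LSTM update.
    """
    f = sigmoid(p["f"].pre(x, h_prev))    # forget gate
    i = sigmoid(p["i"].pre(x, h_prev))    # input gate
    o = sigmoid(p["o"].pre(x, h_prev))    # output gate
    g = math.tanh(p["g"].pre(x, h_prev))  # candidate memory
    r = sigmoid(p["r"].pre(x, h_prev))    # hypothetical reinforcement gate

    forgotten = (1.0 - f) * c_prev        # what the forget gate would discard
    recovered = r * forgotten             # portion selected by the new gate

    # Assumed fusion: mix new memory and recovered memory in ratio alpha.
    c = f * c_prev + (1.0 - alpha) * i * g + alpha * recovered
    h = o * math.tanh(c)
    return h, c


# All-zero weights make every sigmoid gate 0.5 and the candidate 0.
params = {name: GateParams(0.0, 0.0, 0.0) for name in ("f", "i", "o", "g", "r")}
_, c_standard = cell_step(x=1.0, h_prev=0.0, c_prev=1.0, p=params, alpha=0.0)
_, c_reinforced = cell_step(x=1.0, h_prev=0.0, c_prev=1.0, p=params, alpha=0.5)
print(c_standard, c_reinforced)  # 0.5 0.625
```

In this sketch the extra term `alpha * recovered` adds a second path from `c_prev` to `c`, which is one way the previous cell state could retain more influence over long sequences. Whether this matches the gradient behavior reported in the paper depends on the authors' actual equations.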