Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (14): 133-143. DOI: 10.3778/j.issn.1002-8331.2304-0250

• Pattern Recognition and Artificial Intelligence •

Graph Convolutional Neural Networks Optimized by Momentum Cosine Similarity Gradient

YAN Jianhong, DUAN Yunhui   

  1. School of Computer Science and Technology, Taiyuan Normal University, Jinzhong, Shanxi 030619, China
  2. School of Mathematics and Statistics, Taiyuan Normal University, Jinzhong, Shanxi 030619, China
  • Online: 2024-07-15    Published: 2024-07-15

Abstract: The traditional gradient descent algorithm only accumulates historical gradients with exponential weighting and does not exploit the local change of the gradient, so the optimization process may overshoot the global optimum and, even when it converges, oscillate around the optimal solution; training a graph convolutional neural network with it therefore suffers from slow convergence and low test accuracy. This paper uses the cosine similarity between two consecutive gradients to dynamically adjust the learning rate and proposes the cosine similarity gradient descent (SimGrad) algorithm. To further improve the convergence speed and test accuracy of graph convolutional neural network training and to reduce oscillation, the momentum cosine similarity gradient descent (NSimGrad) algorithm is proposed by incorporating the momentum idea. Convergence analysis proves that both SimGrad and NSimGrad achieve an $O(\sqrt{T})$ regret bound. The algorithms are tested on three constructed non-convex functions and, combined with graph convolutional neural networks, evaluated on four datasets. The results show that SimGrad guarantees the convergence of the graph convolutional neural network, NSimGrad further improves the convergence speed and test accuracy of its training, and both SimGrad and NSimGrad exhibit better global convergence and optimization ability than Adam and Nadam.
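For readers unfamiliar with regret bounds, the standard online-optimization definition (the textbook definition, not a formula quoted from this paper) is

```latex
% Regret of an online optimizer after T rounds: cumulative loss of the
% iterates \theta_t minus the loss of the best fixed parameter in hindsight.
R(T) \;=\; \sum_{t=1}^{T} f_t(\theta_t) \;-\; \min_{\theta} \sum_{t=1}^{T} f_t(\theta)
```

so an $O(\sqrt{T})$ bound means the average regret $R(T)/T$ vanishes as the number of iterations grows.

The abstract does not spell out the exact update rules, so the following Python sketch is only one plausible reading of the idea: the base learning rate is rescaled by the cosine similarity between the current and previous gradients (a SimGrad-style step), and an exponentially weighted momentum buffer is added for the NSimGrad-style variant. All names (`simgrad_step`, `nsimgrad_step`, `base_lr`, `beta`) and the specific rescaling map are illustrative assumptions, not the authors' published update rules.

```python
import numpy as np

def cosine_similarity(g_prev, g_curr, eps=1e-8):
    # Cosine similarity between two consecutive gradient vectors.
    return float(g_prev @ g_curr /
                 (np.linalg.norm(g_prev) * np.linalg.norm(g_curr) + eps))

def simgrad_step(theta, g_curr, g_prev, base_lr=0.01):
    # SimGrad-style step (assumed form): shrink the step when consecutive
    # gradients point in conflicting directions, keep it large when they agree.
    sim = cosine_similarity(g_prev, g_curr)   # in [-1, 1]
    lr = base_lr * (1.0 + sim) / 2.0          # assumed map to [0, base_lr]
    return theta - lr * g_curr

def nsimgrad_step(theta, g_curr, g_prev, velocity, base_lr=0.01, beta=0.9):
    # NSimGrad-style step (assumed form): the same similarity-scaled learning
    # rate applied to an exponentially weighted momentum buffer.
    sim = cosine_similarity(g_prev, g_curr)
    lr = base_lr * (1.0 + sim) / 2.0
    velocity = beta * velocity + (1.0 - beta) * g_curr
    return theta - lr * velocity, velocity
```

In a training loop one would keep the previous gradient and call, for example, `theta = simgrad_step(theta, g_t, g_prev)` at each iteration; the momentum variant additionally carries the `velocity` buffer between steps.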

Key words: gradient descent algorithms, cosine similarity, graph convolutional neural network, regret bound, global convergence