[1] SCARSELLI F, GORI M, TSOI A C, et al. The graph neural network model[J]. IEEE Transactions on Neural Networks, 2009, 20(1): 61-80.
[2] BRUNA J, ZAREMBA W, SZLAM A, et al. Spectral networks and locally connected networks on graphs[J]. arXiv:1312.6203, 2013.
[3] KESKAR N S, SOCHER R. Improving generalization performance by switching from Adam to SGD[J]. arXiv:1712.07628, 2017.
[4] 刘菡, 王英男, 李新利, 等. 基于互信息-图卷积神经网络的燃煤电站NOx排放预测[J]. 中国电机工程学报, 2022, 42(3):1052-1059.
LIU H, WANG Y H, LI X L, et al. NOx emission prediction of coal-fired power stations based on mutual information-graph convolutional neural network[J]. Proceedings of the CSEE, 2022, 42(3): 1052-1059.
[5] 富坤, 禚佳明, 郭云朋, 等. 自适应融合邻域聚合和邻域交互的图卷积网络[J]. 计算机科学与探索, 2023, 17(2): 453-466.
FU K, ZHUO J M, GUO Y P, et al. Graph convolutional network with adaptive fusion of neighborhood aggregation and neighborhood interaction[J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(2): 453-466.
[6] ROBBINS H, MONRO S. A stochastic approximation method[J]. The Annals of Mathematical Statistics, 1951, 22(3): 400-407.
[7] DOZAT T. Incorporating Nesterov momentum into Adam[C]//Proceedings of the 4th International Conference on Learning Representations, 2016.
[8] DUCHI J, HAZAN E, SINGER Y. Adaptive subgradient methods for online learning and stochastic optimization[J]. Journal of Machine Learning Research, 2011, 12: 2121-2159.
[9] ZEILER M D. ADADELTA: an adaptive learning rate method[J]. arXiv:1212.5701, 2012.
[10] HINTON G, SRIVASTAVA N, SWERSKY K. Neural networks for machine learning, lecture 6a: overview of mini-batch gradient descent[EB/OL]. Coursera, 2012.
[11] KINGMA D P, BA J. Adam: a method for stochastic optimization[C]//Proceedings of the 3rd International Conference on Learning Representations, 2015: 1-13.
[12] REDDI S J, KALE S, KUMAR S. On the convergence of Adam and beyond[C]//Proceedings of the 6th International Conference on Learning Representations, 2018.
[13] KESKAR N S, MUDIGERE D, NOCEDAL J, et al. On large-batch training for deep learning: generalization gap and sharp minima[J]. arXiv:1609.04836, 2016.
[14] GHADIMI M, SHAHRIAR K, JALALIFAR H. Optimization of the fully grouted rock bolts for load transfer enhancement[J]. International Journal of Mining Science and Technology, 2015, 25(5): 707-712.
[15] DUBEY S R, CHAKRABORTY S, ROY S K, et al. diffGrad: an optimization method for convolutional neural networks[J]. IEEE Transactions on Neural Networks and Learning Systems, 2020, 31(11): 4500-4511.
[16] 谭涛. 基于卷积神经网络的随机梯度下降优化算法研究[D]. 重庆: 西南大学, 2020.
TAN T. Research on stochastic gradient descent optimization algorithms based on convolutional neural networks[D]. Chongqing: Southwest University, 2020.
[17] 史加荣, 王丹, 尚凡华, 等. 随机梯度下降算法研究进展[J]. 自动化学报, 2021, 47(9):2103-2119.
SHI J R, WANG D, SHANG F H, et al. Research progress on stochastic gradient descent algorithms[J]. Acta Automatica Sinica, 2021, 47(9): 2103-2119.
[18] 杨启伦, 张续莹, 李含超, 等. 基于动量梯度下降的自适应干扰对消算法[J]. 电子信息对抗技术, 2022, 37(2): 30-32.
YANG Q L, ZHANG X Y, LI H C, et al. Adaptive interference cancellation algorithm based on momentum gradient descent[J]. Electronic Information Warfare Technology, 2022, 37(2): 30-32.
[19] KIPF T N, WELLING M. Semi-supervised classification with graph convolutional networks[C]//Proceedings of the 5th International Conference on Learning Representations, 2017.
[20] 李明, 来国红, 常晏鸣, 等. 深度学习算法中不同优化器的性能分析[J]. 信息技术与信息化, 2022: 206-209.
LI M, LAI G H, CHANG Y M, et al. Performance analysis of different optimizers in deep learning algorithms[J]. Information Technology and Informatization, 2022: 206-209.
[21] NAGABANDI A, KAHN G, FEARING R S, et al. Neural network dynamics for model-based deep reinforcement learning with model-free fine-tuning[C]//Proceedings of the 2018 IEEE International Conference on Robotics and Automation, 2018: 7559-7566.
[22] BOTEV A, LEVER G, BARBER D. Nesterov's accelerated gradient and momentum as approximations to regularised update descent[C]//Proceedings of the 30th International Joint Conference on Neural Networks, 2017: 1899-1903.