Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (17): 275-285. DOI: 10.3778/j.issn.1002-8331.2211-0262

• Engineering and Applications •

Personal Credit Risk Prediction Based on Graph Convolutional Network

LIANG Longyue, WANG Haozhu   

  1. School of Economics, Guizhou University, Guiyang 550025, China
  2. Center for the Development and Application of Marxist Economics, Guizhou University, Guiyang 550025, China
  3. School of Economics, Nankai University, Tianjin 300071, China
  • Online: 2023-09-01 Published: 2023-09-01

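Abstract: Assessing and managing credit risk is one of the key tasks of financial institutions. This paper examines whether similarity between a personal credit sample and defaulted samples reflects that individual's credit risk and contributes to predicting default. Using LendingClub loan data from the first to the third quarter of 2020, GAMI-net is applied to select default-related features from high-dimensional credit data, a sample similarity network is built with the Manhattan distance to capture credit similarity across the whole sample, and a credit risk prediction model based on a graph convolutional neural network is established. The study finds that personal credit similarity reflects credit risk and makes a significant positive contribution to prediction. In the experiments, the graph convolutional model achieves an AUC of 81.60% and an accuracy of 73.71%, a substantial improvement over all baseline models, indicating that a graph neural network that incorporates the sample similarity network predicts credit risk far more accurately than machine learning models that ignore sample similarity. The proposed feature selection and inter-sample network construction methods also offer a reference for intelligent risk control based on big data.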

Key words: personal credit risk, graph convolutional neural network, similarity network

Abstract: Credit risk assessment and management is one of the important tasks of financial institutions. To explore whether similarity between personal credit data and default data reflects credit risk and contributes to default prediction, this paper takes LendingClub loan data from the first to the third quarter of 2020 as an example. GAMI-net is used to select customers' default features from high-dimensional data, a loan customer similarity network is constructed with the Manhattan distance to reflect the complex connections between customers, and a personal loan default prediction model based on a graph convolutional neural network is built. In the experiment, the model achieves an AUC of 81.60% and an accuracy of 73.31%, a marked improvement over all benchmark models. The graph neural network model, which takes the customer network into account, outperforms machine learning models that do not in loan default prediction accuracy. The feature selection and network construction methods presented in the paper also offer a reference for intelligent risk control in the big data era.

Key words: credit risk prediction, graph convolutional neural network, credit similarity network
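
The pipeline named in the abstract (a similarity network built with the Manhattan distance, followed by graph convolution) can be illustrated with a minimal sketch. The abstract does not say how edges are selected or how the network is configured, so the distance threshold `max_dist`, the toy feature vectors, and the untrained weight matrix below are all assumptions made for illustration; the propagation rule is the standard normalized form ReLU(D^{-1/2}(A + I)D^{-1/2} H W), which may differ from the authors' exact architecture.

```python
import math

def manhattan(a, b):
    """L1 (Manhattan) distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def similarity_adjacency(features, max_dist):
    """Connect samples i != j whose Manhattan distance is <= max_dist.

    The thresholding rule and `max_dist` are assumptions; the abstract
    does not state how edges of the similarity network are chosen.
    """
    n = len(features)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if manhattan(features[i], features[j]) <= max_dist:
                adj[i][j] = adj[j][i] = 1.0
    return adj

def normalize_with_self_loops(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, with D the degree matrix of A + I."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [[inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j] for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def gcn_layer(a_norm, h, w):
    """One graph-convolution layer: ReLU(A_norm @ H @ W)."""
    z = matmul(matmul(a_norm, h), w)
    return [[max(0.0, v) for v in row] for row in z]

# Toy data (hypothetical): three borrowers, two already-selected features each.
features = [[0.10, 0.20], [0.15, 0.25], [0.90, 0.80]]
adj = similarity_adjacency(features, max_dist=0.2)
a_norm = normalize_with_self_loops(adj)
weights = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical untrained weights (identity)
hidden = gcn_layer(a_norm, features, weights)
```

With identity weights, the layer simply averages each sample's features with those of its neighbours: the two similar borrowers end up with the same hidden representation, while the isolated third borrower keeps its own features. In a trained model the weights would be learned and a classification layer would map the hidden representation to a default probability.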