Computer Engineering and Applications, 2023, Vol. 59, Issue (19): 237-246. DOI: 10.3778/j.issn.1002-8331.2205-0561

• Network, Communication and Security •

Blockchain Transaction Fraud Detection Based on Modular Decision Forest

TIAN Hongpeng, WEI Tian

  1. School of Computer Science and Technology, Xi’an University of Science and Technology, Xi’an 710600, China
  • Online: 2023-10-01 Published: 2023-10-01

Abstract: Blockchain technology is widely used in industries such as digital assets and electronic transactions, and fraud has grown along with it. To limit losses from Bitcoin transaction fraud, existing detection methods rely mainly on learned-model prediction and rule matching, but their prediction precision is not high enough and fraudsters can easily bypass the rules. To address these problems, an improved denoising sparse autoencoder is first used to reduce the feature dimensionality of virtual currency transaction data, and a modular decision forest model is then proposed following a “divide and conquer” approach. The modular decision forest uses peak-density-based fast fuzzy clustering to decompose the data into multiple small groups, each of which is learned by one decision tree. Fuzzy boundaries are determined from the membership degrees, and an additional set of decision trees is trained on the samples that fall on those boundaries. Samples that remain difficult to classify are partitioned repeatedly and learned jointly by a parent decision tree and multiple child decision trees. In the experiments, the model is evaluated on the digit image dataset Optdigits and the virtual currency transaction datasets Elliptic and Ethereum, and compared with graph neural network, logistic regression, random forest, and other models. The results show that the modular decision forest model substantially improves precision, recall, and F1-score.

Key words: virtual currency transaction, modular decision forest, denoising autoencoder, fraud detection
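
The membership-based routing described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the membership function or the boundary criterion, so the sketch assumes a standard fuzzy c-means membership with fuzzifier `m = 2` and treats a sample as a boundary sample when its two largest memberships differ by less than a hypothetical threshold `gap`. The cluster centers are taken as given (the density-peak center selection and the decision-tree training are omitted), and the function names `fuzzy_memberships` and `route` are illustrative only.

```python
import math


def fuzzy_memberships(x, centers, m=2.0):
    """Fuzzy c-means style memberships of point x in each cluster (sum to 1).

    Assumption: the paper's actual membership function may differ.
    """
    dists = [math.dist(x, c) for c in centers]
    # A point lying exactly on a center belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    power = 2.0 / (m - 1.0)
    inverse = [d ** -power for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]


def route(x, centers, gap=0.2):
    """Return (index of the best cluster, is_boundary) for point x.

    Assumption: a sample is "boundary" when its top two memberships are
    closer than `gap`; this threshold is hypothetical. Needs >= 2 centers.
    """
    u = fuzzy_memberships(x, centers)
    ranked = sorted(range(len(u)), key=u.__getitem__, reverse=True)
    best, second = ranked[0], ranked[1]
    return best, (u[best] - u[second]) < gap


if __name__ == "__main__":
    centers = [(0.0, 0.0), (4.0, 0.0)]
    for point in [(1.0, 0.0), (2.1, 0.0)]:
        print(point, route(point, centers))
```

In a full pipeline along these lines, samples with `is_boundary == False` would go to the tree of their best cluster, while boundary samples would go to the additional set of trees the abstract mentions.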