Computer Engineering and Applications ›› 2020, Vol. 56 ›› Issue (14): 257-263. DOI: 10.3778/j.issn.1002-8331.1904-0496


Improved Multi-layer Perceptron Applied to Customer Churn Prediction

XIA Guo’en, TANG Qi, ZHANG Xianquan   

  1. College of Computer Science and Information Engineering, Guangxi Normal University, Guilin, Guangxi 541000, China
  2. School of Business Administration, Guangxi University of Finance and Economics, Nanning 530000, China
  • Online: 2020-07-15 Published: 2020-07-14

改进的多层感知机在客户流失预测中的应用

夏国恩,唐琪,张显全   

  1. 广西师范大学 计算机科学与信息工程学院,广西 桂林 541000
  2. 广西财经学院 工商管理学院,南宁 530000

Abstract:

In the data preprocessing stage of traditional customer churn prediction, encoding discrete attributes with one-hot encoding increases the data dimensionality and makes the data excessively sparse. To address this problem, this paper proposes two improved customer churn prediction models based on the multi-layer perceptron. The main idea is to improve the multi-layer perceptron with a stacked auto-encoder and with entity embedding, respectively. By mapping the high-dimensional encoded data of discrete attributes into a low-dimensional space, both methods effectively reduce the sparse data produced by one-hot encoding and increase the correlation between the values of a discrete attribute. Cross-validation experiments on two public data sets show that the improved models effectively increase prediction accuracy while retaining the traditional multi-layer perceptron's advantage in parallel computation.
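The contrast between one-hot encoding and a low-dimensional mapping can be illustrated with a minimal sketch. It is not the paper's implementation: the attribute size (50 values), the embedding width (4), and all function names are hypothetical, and the table is randomly initialised here, whereas in an entity-embedding model its weights would be learned together with the perceptron.

```python
import random

def one_hot(index, n_categories):
    """Return a one-hot list: all zeros except a 1.0 at `index`."""
    vec = [0.0] * n_categories
    vec[index] = 1.0
    return vec

def make_embedding_table(n_categories, dim, seed=0):
    """Build a random table (assumption: real weights would be learned)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)]
            for _ in range(n_categories)]

def embed(index, table):
    """Embedding lookup: select the dense row for this category."""
    return table[index]

def matvec(vec, table):
    """Multiply a row vector by the table (vec @ table)."""
    dim = len(table[0])
    return [sum(vec[i] * table[i][j] for i in range(len(table)))
            for j in range(dim)]

# Hypothetical discrete attribute with 50 distinct values.
N_CATEGORIES, EMBED_DIM = 50, 4
table = make_embedding_table(N_CATEGORIES, EMBED_DIM)

sparse = one_hot(7, N_CATEGORIES)  # 50 inputs, 49 of them zero
dense = embed(7, table)            # 4 inputs, none structurally zero

# The lookup is equivalent to multiplying the one-hot vector by the table,
# so the dense vector is a linear low-dimensional mapping of the sparse one.
projected = matvec(sparse, table)
assert all(abs(a - b) < 1e-12 for a, b in zip(projected, dense))
print(len(sparse), len(dense))
```

Under these assumptions, the downstream perceptron receives 4 inputs for this attribute instead of 50, and categories with similar learned rows are treated as related rather than as mutually orthogonal.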

Key words: customer churn, multi-layer perceptron, discrete attributes, stacked auto-encoder, entity embedding, mapping

摘要:

针对在传统的客户流失预测数据预处理中,使用one-hot编码处理离散属性导致数据维度增加及数据过于稀疏的问题,提出了两种基于多层感知机的改进后的客户流失预测模型。其主要思想是分别使用堆叠自编码器和实体嵌入两种方法对多层感知机进行改进,通过将离散属性的高维编码数据向低维空间映射,有效地减少了one-hot编码产生的稀疏数据,增加了离散属性值之间的关联度。在对两份公开的数据集进行交叉验证后的实验结果表明,改进后的模型既有效地提高了预测的准确度,又维持了传统多层感知机模型在并行化计算方面的优势。

关键词: 客户流失, 多层感知机, 离散属性, 堆叠自编码器, 实体嵌入, 映射