Computer Engineering and Applications, 2021, Vol. 57, Issue (8): 160-168. DOI: 10.3778/j.issn.1002-8331.2001-0242

Strong Deep Forest with Circular Scanning

ZHOU Bowen, GAO Jun, SHAO Xing   

  1. School of Computer Science, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu 212003, China
  2. School of Information Engineering, Yancheng Institute of Technology, Yancheng, Jiangsu 224002, China
  • Online: 2021-04-15 Published: 2021-04-23

环状扫描的强深度森林

周博文,皋军,邵星   

  1. 江苏科技大学 计算机学院,江苏 镇江 212003
  2. 盐城工学院 信息工程学院,江苏 盐城 224002

Abstract:

The Deep Forest (DF) model has advantages over neural network algorithms when handling large-scale data: it has few hyperparameters, places few demands on parameter settings, is convenient to train, and is highly robust. However, in the classical deep forest, multi-grained scanning ignores the hidden information carried by edge data and cannot fully obtain each feature subset, which affects the subsequent cascading part. Moreover, the cascading part yields only limited new features each time, which restricts the representation learning ability of DF. To address these problems, this paper proposes the Circular Strong Deep Forest (CSDF). Through a circular scanning process, CSDF obtains more complete feature subsets. In addition, the strong cascaded forest in CSDF improves the representation learning ability of the model through feature selection. Tests on several typical data sets show that CSDF performs better, and the improvement is especially pronounced on high-dimensional data.

Key words: deep forest, feature subset, representation learning ability, circular scanning
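
The edge-data issue described in the abstract can be illustrated with a small sketch. The abstract does not give the details of circular scanning, so the code below assumes one plausible reading: the scanning window wraps around from the end of the feature vector back to its start. The function names `sliding_windows`, `circular_windows`, and `coverage` are hypothetical and are not taken from the paper.

```python
def sliding_windows(features, window):
    """Plain sliding-window scan: only windows that fit entirely inside the vector."""
    n = len(features)
    return [features[i:i + window] for i in range(n - window + 1)]


def circular_windows(features, window):
    """Assumed circular variant: the window wraps past the end back to the start,
    so every position begins exactly one window."""
    n = len(features)
    return [[features[(i + j) % n] for j in range(window)] for i in range(n)]


def coverage(windows, features):
    """Count how many windows each feature value appears in."""
    return {f: sum(f in w for w in windows) for f in features}


x = [1, 2, 3, 4, 5]
plain = sliding_windows(x, 3)     # 3 windows; edge features are under-sampled
wrapped = circular_windows(x, 3)  # 5 windows; every feature is sampled equally

print(coverage(plain, x))
print(coverage(wrapped, x))
```

In the plain scan, the first and last features each fall into a single window while the middle feature falls into three. Under the wrap-around assumption, every feature falls into the same number of windows, which is one way edge data could contribute as much as interior data.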

摘要:

深度森林(Deep Forest,DF),由于此模型超参数少,且参数设置没有过多的要求,训练方便,鲁棒性高,因此在处理大型数据时比神经网络算法更加具有优势。但是,传统的深度森林中,多粒度扫描忽略了边缘数据携带的隐含信息,无法充分地获得各个特征子集,进而会对以后的级联部分产生影响。而且,级联部分每次得到的新特征有限,影响了模型的表征学习能力。针对以上问题,提出一种环状强深度森林(Circular Strong Deep Forest,CSDF),其通过环状扫描过程,一定程度上得到更充分的特征子集,且强级联森林通过特征选择提高了模型的表征学习能力。经过在不同数据集上的测试,结果表明,CSDF的性能更加优越,尤其是高维数据上更为明显。

关键词: 深度森林, 特征子集, 表征学习能力, 环状扫描