Computer Engineering and Applications, 2019, Vol. 55, Issue 4: 1-16. DOI: 10.3778/j.issn.1002-8331.1810-0420
XIANG Hongxin, YANG Yun
向鸿鑫, 杨云
Abstract: Classification algorithms have made great progress in recent years. However, as data sources continue to expand, most of the data obtained are imbalanced. Classification algorithms are usually sensitive to class imbalance, which makes classifying imbalanced data difficult. Current imbalanced data mining methods fall into two main categories: preprocessing methods and mining algorithms designed for imbalanced data. This paper summarizes recent methods in both categories and reviews them along several dimensions, including data preprocessing, algorithms, and performance evaluation. It then describes the imbalance problems that arise in different application fields, together with the research and solutions that scholars have proposed in those fields. Finally, it analyzes the open problems in imbalanced data mining and outlines future research directions.
Key words: imbalanced data, sampling, clustering method, ensemble method, cost-sensitive, performance evaluation
XIANG Hongxin, YANG Yun. Survey on Imbalanced Data Mining Methods[J]. Computer Engineering and Applications, 2019, 55(4): 1-16.
URL: http://cea.ceaj.org/EN/10.3778/j.issn.1002-8331.1810-0420
http://cea.ceaj.org/EN/Y2019/V55/I4/1