Computer Engineering and Applications, 2020, Vol. 56, Issue (7): 30-38. DOI: 10.3778/j.issn.1002-8331.1903-0501


Improved Nearest Neighbor Classification Algorithm for Imbalanced Data

WANG Caiwen, YANG Youlong   

  1. School of Mathematics and Statistics, Xidian University, Xi’an 710126, China
  • Online: 2020-04-01  Published: 2020-03-28





To address the classification of imbalanced data, a Density-based Nearest Neighbor (DNN) classification algorithm is proposed. The algorithm improves classification results by capturing the local distribution characteristics of imbalanced data. First, kernel density estimation is used to estimate the density of each class at the query instance, thereby localizing the query instance in terms of density. Second, the points in the original data space are mapped to a space composed of class density and distance information. Finally, in this mapped space, the neighbors are selected dynamically and the query instance is classified. Experimental results on 15 imbalanced data sets show that the DNN algorithm performs well.
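
The three steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the mapping or the dynamic neighbor-selection rule, so the names kde_density and classify, the Gaussian kernel, the score distance / density, and the fixed k that stands in for dynamic selection are all assumptions made for illustration.

```python
import math
from collections import Counter


def kde_density(query, points, bandwidth):
    """Gaussian kernel density estimate at `query`, built from `points`."""
    dim = len(query)
    norm = (bandwidth * math.sqrt(2.0 * math.pi)) ** dim
    total = 0.0
    for p in points:
        sq_dist = sum((q - v) ** 2 for q, v in zip(query, p))
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    # Dividing by the class size keeps a small class from being penalized.
    return total / (len(points) * norm)


def classify(query, X, y, bandwidth=1.0, k=3, eps=1e-12):
    """Hypothetical density-aware nearest neighbor vote (illustrative only)."""
    # Step 1: estimate the density of each class at the query instance.
    by_class = {}
    for point, label in zip(X, y):
        by_class.setdefault(label, []).append(point)
    density = {c: kde_density(query, pts, bandwidth)
               for c, pts in by_class.items()}

    # Step 2: map each training point to (class density, distance to query).
    mapped = [(density[label], math.dist(query, point), label)
              for point, label in zip(X, y)]

    # Step 3 (assumed rule): rank by distance / density and take a majority
    # vote among the k best points. A fixed k replaces the paper's dynamic
    # neighbor selection, which the abstract does not specify.
    ranked = sorted(mapped, key=lambda t: t[1] / (t[0] + eps))
    votes = Counter(label for _, _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy imbalanced data: eight majority points (class 0), two minority (class 1).
X = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (-0.5, 0.0), (0.0, -0.5),
     (0.5, 0.5), (-0.5, -0.5), (1.0, 0.0), (5.0, 5.0), (5.5, 5.0)]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

print(classify((4.8, 5.1), X, y))
print(classify((0.1, 0.0), X, y))
```

Because each class density is normalized by the size of its own class, a minority class with only a few nearby points can still rank ahead of a distant majority class in this sketch.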

Key words: K-nearest neighbor classifier, imbalanced data, classification algorithm, kernel density estimation


