%0 Journal Article
%A WANG Ziqi
%A HE Jinwen
%A JIANG Liangxiao
%T New Redundancy-Based Algorithm for Reducing Amount of Training Examples in KNN
%D 2019
%R 10.3778/j.issn.1002-8331.1809-0275
%J Computer Engineering and Applications
%P 40-45
%V 55
%N 22
%X As one of the top 10 algorithms in data mining, the K-Nearest-Neighbor (KNN) algorithm is widely used because it is a non-parametric, simple, and effective algorithm with no training time. However, when it faces a massive amount of high-dimensional training examples, its high classification time complexity becomes a bottleneck for its application. In addition, its classification performance is often harmed when the class distribution of training examples is skewed and the class imbalance problem occurs. To address these two issues, this paper proposes a new redundancy-based algorithm for reducing the amount of training examples (RBKNN for short). RBKNN first computes the redundancy of each training example, and then randomly deletes some highly redundant training examples in an introduced pre-processing step. RBKNN can not only reduce the size of the training example set, but also make the class distribution of training examples more balanced. The experimental results show that RBKNN significantly improves the efficiency of KNN while maintaining or improving its classification accuracy.
%U http://cea.ceaj.org/EN/10.3778/j.issn.1002-8331.1809-0275