Computer Engineering and Applications ›› 2008, Vol. 44 ›› Issue (4): 163-165.

• Database and Information Processing •

A new algorithm to improve the efficiency of the K-Nearest-Neighbor algorithm

LU Wei-wei, LIU Jing

  1. Faculty of Computer Science, China University of Geosciences, Wuhan 430074
  • Received: 2007-06-06  Revised: 2007-08-06  Online: 2008-02-01  Published: 2008-02-01
  • Contact: LU Wei-wei

Abstract: The K-Nearest-Neighbor (KNN) algorithm is the most basic instance-based learning method and is widely used in machine learning and data mining. Learning in KNN consists simply of storing the presented training data. When a new query instance is encountered, a set of similar instances is retrieved from memory and used to classify it. One disadvantage of KNN is that the cost of classifying new instances can be high, because nearly all computation takes place at classification time rather than when the training instances are first encountered. How to index training instances efficiently, so as to reduce the computation required at query time, is therefore a significant practical issue. To address this issue, this paper presents a new algorithm that moves some of the computation from classification time to training time. Simulation experiments show that it can improve the efficiency of KNN by more than 80%, and its idea can be applied to all variants of KNN.
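To make the idea of shifting work from classification time to training time concrete, below is a minimal, hypothetical Python sketch, not the authors' algorithm: it builds a kd-tree index over the training instances once, at training time, so that classification only requires a fast tree query plus a majority vote. The class name IndexedKNN and the use of scipy.spatial.cKDTree are illustrative assumptions and are not taken from the paper.

```python
# A minimal sketch (not the paper's exact method): k-NN classification where
# the expensive neighbor-search work is prepared once at training time by
# building a kd-tree, so each query needs only a fast tree search and a vote.
from collections import Counter

import numpy as np
from scipy.spatial import cKDTree


class IndexedKNN:
    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # Training time: store labels and build the spatial index up front.
        self.y = np.asarray(y)
        self.tree = cKDTree(np.asarray(X, dtype=float))
        return self

    def predict(self, X):
        # Classification time: only a tree query and a majority vote remain.
        _, idx = self.tree.query(np.asarray(X, dtype=float), k=self.k)
        idx = np.atleast_2d(idx)
        return np.array([Counter(self.y[row]).most_common(1)[0][0] for row in idx])


if __name__ == "__main__":
    X_train = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
    y_train = [0, 0, 0, 1, 1, 1]
    clf = IndexedKNN(k=3).fit(X_train, y_train)
    print(clf.predict([[0.2, 0.1], [5.5, 5.5]]))  # -> [0 1]
```

The sketch illustrates the general principle only; the paper's own indexing scheme and its reported efficiency gain of over 80% are specific to the algorithm described there.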

Key words: K-Nearest-Neighbor, instance-based learning, efficiency, classification
