计算机工程与应用 ›› 2019, Vol. 55 ›› Issue (23): 45-52.DOI: 10.3778/j.issn.1002-8331.1905-0388

• 理论与研发 • 上一篇    下一篇

使用环形过滤器的[K]值自适应KNN算法

张万桢,刘同来,邬满,匡振曦   

  1. 1.桂林航天工业学院 实践教学部,广西 桂林 541004
    2.广西密码学与信息安全重点实验室,广西 桂林 541004
    3.广东工业大学 计算机学院,广州 510006
    4.南宁师范大学 北部湾环境演变与资源利用教育部重点实验室,南宁 530001
  • 出版日期:2019-12-01 发布日期:2019-12-11

K-Value Adaptive KNN Algorithm Using Annular Filter

ZHANG Wanzhen, LIU Tonglai, WU Man, KUANG Zhenxi   

  1. 1.Department of Practice Teaching, Guilin University of Aerospace Technology, Guilin, Guangxi 541004, China
    2.Guangxi Key Laboratory of Cryptography and Information Security, Guilin, Guangxi 541004, China
    3.School of Computers, Guangdong University of Technology, Guangzhou 510006, China
    4.Key Laboratory of Environment Change and Resources Use in Beibu Gulf, Nanning Normal University, Nanning 530001, China
  • Online:2019-12-01 Published:2019-12-11

摘要: 针对传统KNN算法[K]值固定问题,提出基于环形过滤器的[K]值自适应KNN算法(K-value Adaptive KNN Algorithm Based on Annular Filter,AAKNN),其核心思想是利用稀疏向量能够较好地表达数据之间的相似度信息来动态选择每个测试点的[K]个最近邻点,从而提高算法的准确率。该算法不仅能够根据不同测试点的实际情况来选择不同的[K]值,而且利用环形过滤器避免了内存占用过大的问题。最后通过6组公开数据集对所提出的AAKNN算法进行了实验验证。实验结果表明,AAKNN与CM-KNN算法相比较于其余四种算法在准确率上平均提高2%,其中AAKNN算法相比较CM-KNN算法可以平均减少79%的内存消耗。

关键词: 环形过滤器, 聚类, 分类, 三角不等式, 稀疏向量

Abstract: Aiming at the problem of fixed K-value within the traditional KNN algorithm, a K-value Adaptive KNN Algorithm Based on Annular Filter(AAKNN) is proposed. Its core idea is to dynamically select K-nearest neighbors at each test point by using sparse vector to better express the similarity information among data, which improves the accuracy of the algorithm. The algorithm can not only select different K values according to the actual situation at different test points, but also avoid the problem of high memory consumption by using ring filter. Finally, the proposed AAKNN algorithm is verified on six open datasets. The experimental results show that the accuracy of AAKNN is 2% higher than CM-KNN, and the memory consumption of AAKNN is 79% less than CM-KNN.

Key words: annular filter, clustering, classification, triangular inequality, sparse vectors