Shared nearest neighbor affinity based clustering algorithm

doi:10.3778/j.issn.1002-8331.1705-0401

Abstract: To address the inaccurate results that density-based clustering algorithms produce on high-dimensional and multi-density datasets, a clustering algorithm based on Shared Nearest Neighbor Affinity (SNNA) is proposed. The algorithm combines k-nearest neighbors with shared nearest neighbors and defines the shared nearest neighbor affinity as the local density measure of an object. It first extracts core points according to their affinity, then clusters the core points using breadth-first search, and finally assigns each non-core point to a cluster, which completes the clustering of the whole dataset. Experimental results show that the algorithm can find clusters of arbitrary shape, size, and density. Compared with similar algorithms, SNNA achieves higher clustering accuracy on high-dimensional data.
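The three stages named in the abstract (core point extraction, breadth-first clustering of core points, assignment of non-core points) can be illustrated with a minimal Python sketch. The abstract does not give the affinity formula, the core point threshold, or the assignment rule, so everything marked ASSUMPTION below is a stand-in chosen for illustration and should not be read as the authors' definitions. The function name `snna_sketch` and the parameters `core_fraction` and `min_shared` are hypothetical.

```python
from collections import deque
from math import dist


def knn(points, k):
    """Indices of the k nearest neighbors of every point (brute force)."""
    result = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        # list.sort is stable, so equidistant points keep index order.
        others.sort(key=lambda j: dist(p, points[j]))
        result.append(others[:k])
    return result


def snna_sketch(points, k=4, core_fraction=0.5, min_shared=2):
    n = len(points)
    nbrs = knn(points, k)
    nbr_sets = [set(row) for row in nbrs]

    # ASSUMPTION: affinity of i = number of neighbors it shares with each of
    # its own k nearest neighbors, summed. The paper's formula may differ.
    affinity = [
        sum(len(nbr_sets[i] & nbr_sets[j]) for j in nbrs[i]) for i in range(n)
    ]

    # ASSUMPTION: core points are those whose affinity reaches the value
    # found at rank core_fraction * n in the descending affinity order.
    ranked = sorted(affinity, reverse=True)
    threshold = ranked[max(0, int(core_fraction * n) - 1)]
    is_core = [a >= threshold for a in affinity]

    # Breadth-first search over core points. ASSUMPTION: two core points are
    # linked when one lists the other as a neighbor and they share at least
    # min_shared neighbors.
    labels = [-1] * n
    cluster = 0
    for start in range(n):
        if not is_core[start] or labels[start] != -1:
            continue
        labels[start] = cluster
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in nbrs[i]:
                shared = len(nbr_sets[i] & nbr_sets[j])
                if is_core[j] and labels[j] == -1 and shared >= min_shared:
                    labels[j] = cluster
                    queue.append(j)
        cluster += 1

    # ASSUMPTION: each non-core point joins the cluster of its nearest core
    # point.
    core = [i for i in range(n) if is_core[i]]
    for i in range(n):
        if not is_core[i]:
            nearest = min(core, key=lambda c: dist(points[i], points[c]))
            labels[i] = labels[nearest]
    return labels
```

Calling `snna_sketch(points, k=4)` on a list of coordinate tuples returns one integer cluster label per point. The brute-force neighbor search is quadratic in the number of points, which keeps the sketch short but would need replacing by an index structure for large datasets.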

Key words: clustering, density, shared nearest neighbor, affinity, data mining

QIU Baozhi, XIN Hang. Shared nearest neighbor affinity based clustering algorithm[J]. Computer Engineering and Applications, 2018, 54(18): 184-187.

邱保志，辛杭. 一种基于共享近邻亲和度的聚类算法[J]. 计算机工程与应用, 2018, 54(18): 184-187.
