Spectral clustering based on local density and geodesic distance

doi:10.3778/j.issn.1002-8331.1509-0131

Computer Engineering and Applications ›› 2017, Vol. 53 ›› Issue (7): 141-146.DOI: 10.3778/j.issn.1002-8331.1509-0131

Previous Articles Next Articles

Spectral clustering based on local density and geodesic distance

ZHANG Tao1, 2, GE Hongwei1, 2, SU Hui1, 2, ZHANG Huanqing2

1.Key Laboratory of Advanced Process Control for Light Industry（Jiangnan University）, Ministry of Education, Wuxi, Jiangsu 214122, China
2.School of Internet of Things, Jiangnan University, Wuxi, Jiangsu 214122, China

Online:2017-04-01 Published:2017-04-01

基于局部密度和测地距离的谱聚类

张涛1，2，葛洪伟1，2，苏辉1，2，张欢庆2

1.轻工过程先进控制教育部重点实验室（江南大学），江苏无锡 214122
2.江南大学物联网工程学院，江苏无锡 214122

Abstract

Abstract: The traditional geodesic distance calculation method based on [K]-nearest neighbor graph can find the data relationships on manifold. However, if there are adhesive points between different clusters, the similarity based on traditional geodesic distance can not effectively reflect the true relationships between points, which make it difficult to cluster the datasets effectively. Since the similarity based on traditional geodesic distance can not deal with adhesive datasets effectively, spectral clustering based on local density and geodesic distance is proposed. Firstly, the local density of each point is calculated to find the nearest high density point, and choose marginal points and non-marginal points. Then, the edge between the marginal point and its nearest high density point is created, and edges between the [K]-nearest neighbor points and non-marginal points are created as well. Finally, the geodesic distance and similarity can be calculated to perform clustering. Experiments on artificial and UCI datasets show that the proposed algorithm can improve the clustering accuracy when dealing with the adhesive datasets.

Key words: [K]-nearest neighbor graph, geodesic distance, local density, similarity, spectral clustering

摘要： 传统根据[K]-近邻图计算测地距离的方法，虽然能够发现流形分布数据间的相似关系，但是当不同类的点存在粘连关系时，依此计算相似度时不能体现样本间的真实关系，从而无法有效聚类。针对传统测地距离计算相似度的方法不能有效处理粘连数据集的问题，提出了基于局部密度和测地距离的谱聚类方法。计算样本的局部密度，寻找每个样本点的最近高密度点，并选择边缘点和非边缘点；在边缘点和其最近高密度点之间构造边、非边缘点之间的[K]个近邻点构造边，依此计算测地距离和相似度并进行聚类。在人工数据集和UCI数据集上的实验表明，该算法在处理粘连数据集时有效提高了聚类准确率。

关键词: [K]-近邻图, 测地距离, 局部密度, 相似度, 谱聚类

ZHANG Tao1, 2, GE Hongwei1, 2, SU Hui1, 2, ZHANG Huanqing2. Spectral clustering based on local density and geodesic distance[J]. Computer Engineering and Applications, 2017, 53(7): 141-146.

张涛1，2，葛洪伟1，2，苏辉1，2，张欢庆2. 基于局部密度和测地距离的谱聚类[J]. 计算机工程与应用, 2017, 53(7): 141-146.

[1]	ZHANG Qishan, CHEN Lulu. Slope One Algorithm Based on Grey Correlational Analysis by Method of Degree of Balance and Approach [J]. Computer Engineering and Applications, 2021, 57(9): 96-102.
[2]	WANG Yonggui, LI Qianyu. Hybrid Collaborative Filtering Recommendation Algorithm Based on KNN-GBDT [J]. Computer Engineering and Applications, 2021, 57(9): 103-108.
[3]	ZHANG Xiaowen, REN Yongfeng. Image Matching Algorithm Combining Sparse Representation and Topological Similarity [J]. Computer Engineering and Applications, 2021, 57(8): 198-203.
[4]	ZHANG Songcan, PU Jiexin, SI Yanna, SUN Lifan. Adaptive Improved Ant Colony Algorithm Based on Population Similarity and Its Application [J]. Computer Engineering and Applications, 2021, 57(8): 70-77.
[5]	YANG Fang, YIN Xi, SI Jianhui, LIU Hongyuan, WANG Xue. Mathematical Expression Similarity Calculation Method Based on Focus Clustering [J]. Computer Engineering and Applications, 2021, 57(6): 88-93.
[6]	ZHAO Fan, ZHANG Lin, WEN Zhiquan, YANG Linlin, LIN Guangfeng. Direct and Efficient Natural Scene Chinese Character Approaching Spotting Method [J]. Computer Engineering and Applications, 2021, 57(6): 159-167.
[7]	QIAN Yunyun, YANG Wenzhong, YAO Miao, LI Hailei, CHAI Yachuang. Topic Community Discovery Model Incorporating Topic Similarity Weight [J]. Computer Engineering and Applications, 2021, 57(5): 107-114.
[8]	JIANG Bin, LIANG Xiao’an, ZHANG Liang, GAO Yangjun. Evidence Combination Method Based on Improved Modified Weight [J]. Computer Engineering and Applications, 2021, 57(24): 100-106.
[9]	TIAN Wei’an, CHEN Hongmei, ZHOU Lihua. Diversified Recommendation Method Based on Similar Users’Curiosity [J]. Computer Engineering and Applications, 2021, 57(23): 113-121.
[10]	LIANG Tian, CAO Dexin. Improved and Simplified Particle Swarm Optimization Algorithm Based on Levy Flight [J]. Computer Engineering and Applications, 2021, 57(20): 188-196.
[11]	WEI Dingfeng, LI Liang, CHAI Jing. Social Recommendation Algorithm by Fusing Item Information [J]. Computer Engineering and Applications, 2021, 57(19): 198-204.
[12]	LIU Li. Top-N Recommendation Algorithm Based on User Diversity Preference [J]. Computer Engineering and Applications, 2021, 57(17): 116-121.
[13]	YANG Yanjiao, ZHAO Guotao, WANG Pidong. Sentence Similarity Calculation Method Based on Semantics and Emotion [J]. Computer Engineering and Applications, 2021, 57(16): 151-158.
[14]	ZHANG Tao, YU Jiong, LIAO Bin, BI Xuehua. Method for Attributed Graph Summarization Based on Minimum Description Length [J]. Computer Engineering and Applications, 2021, 57(15): 124-132.
[15]	ZHAO Qi, DU Yanhui, LU Tianliang, SHEN Shaoyu. Algorithm of Text Similarity Analysis Based on Capsule-BiGRU [J]. Computer Engineering and Applications, 2021, 57(15): 171-177.

Spectral clustering based on local density and geodesic distance

基于局部密度和测地距离的谱聚类

PDF

Knowledge

Abstract

Cite this article

share this article

References

Related Articles 15

Recommended Articles

Metrics