Computer Engineering and Applications, 2016, Vol. 52, Issue (3): 70-73.

• Big Data and Cloud Computing •

Research on adaptive parameters determination in DBSCAN algorithm

LI Zonglin, LUO Ke   

  1. Institute of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410114, China
  • Online: 2016-02-01 Published: 2016-02-03

Abstract: The DBSCAN algorithm requires two parameters, Eps and minPts, to be set manually, so the accuracy of the clustering result depends directly on the user's choice of parameters. This paper therefore proposes a new method of parameter determination. Nonparametric kernel density estimation is used to analyse the distribution of the data samples and to determine Eps and minPts automatically, which avoids manual intervention and automates the clustering process. Theoretical analysis and experimental results show that the method selects reasonable values of Eps and minPts and yields clustering results of relatively high accuracy.

Key words: Density-Based Spatial Clustering of Applications with Noise (DBSCAN), kernel density estimation, self-adaptive, clustering
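
Illustrative sketch: the abstract does not give the actual procedure, so the code below is not the authors' method. It only shows, under stated assumptions, how a kernel density estimate of the data could be turned into values for Eps and minPts. The function names `gaussian_kde` and `estimate_eps_minpts`, the use of k-th nearest-neighbour distances, the normal-reference bandwidth rule, the "first valley after the main peak" choice of Eps, and the median neighbourhood size for minPts are all hypothetical choices made for this example.

```python
import math
import random
import statistics


def gaussian_kde(samples, bandwidth):
    """Return a 1-D Gaussian kernel density estimate as a callable."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def estimate_eps_minpts(points, k=4, grid_size=200):
    """Hypothetical heuristic: derive (eps, min_pts) from a KDE of k-NN distances.

    Assumes len(points) > k and at least two points.
    """
    # Distance from every point to its k-th nearest neighbour.
    kdist = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        kdist.append(dists[k - 1])

    lo, hi = min(kdist), max(kdist)
    spread = statistics.stdev(kdist)
    if spread == 0.0:
        eps = hi  # all k-distances identical: nothing to estimate
    else:
        # Normal-reference rule-of-thumb bandwidth (an assumption of this sketch).
        bandwidth = 1.06 * spread * len(kdist) ** (-0.2)
        density = gaussian_kde(kdist, bandwidth)
        grid = [lo + (hi - lo) * t / (grid_size - 1) for t in range(grid_size)]
        values = [density(x) for x in grid]
        peak = values.index(max(values))
        eps = hi  # fallback when no valley follows the main peak
        for t in range(peak + 1, grid_size - 1):
            if values[t] <= values[t - 1] and values[t] < values[t + 1]:
                eps = grid[t]  # first local minimum right of the main peak
                break

    # min_pts: median neighbourhood size (the point itself included) within eps.
    counts = [sum(1 for q in points if math.dist(p, q) <= eps) for p in points]
    min_pts = max(1, round(statistics.median(counts)))
    return eps, min_pts


# Usage on synthetic data: two dense blobs plus a few scattered noise points.
random.seed(0)
cluster_a = [(random.gauss(0, 0.3), random.gauss(0, 0.3)) for _ in range(40)]
cluster_b = [(random.gauss(5, 0.3), random.gauss(5, 0.3)) for _ in range(40)]
noise = [(random.uniform(-2, 7), random.uniform(-2, 7)) for _ in range(8)]
eps, min_pts = estimate_eps_minpts(cluster_a + cluster_b + noise)
print(eps, min_pts)
```

The returned pair could then be passed to any DBSCAN implementation. The sketch uses only the Python standard library and computes all pairwise distances, so it is suitable for small data sets only.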