Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (21): 1-12. DOI: 10.3778/j.issn.1002-8331.2206-0127

• Research Hotspots and Reviews •

Survey of Proximity Methods for Outlier Detection

LIU Caihui, LIU Dijin   

  1. College of Mathematics and Computer Science, Gannan Normal University, Ganzhou, Jiangxi 341000, China
  • Online: 2022-11-01 Published: 2022-11-01

离群点检测的邻近性方法综述

刘财辉,刘地金   

  1. 赣南师范大学 数学与计算机科学学院,江西 赣州 341000

Abstract: Outlier detection is widely used in data mining, but no single method is optimal for every outlier detection problem; different applications call for different methods to solve practical problems effectively. Existing detection methods can be roughly divided into statistics-based, clustering-based, and proximity-based (distance-based and density-based) methods. To capture the current state of research on proximity-based outlier detection, representative proximity-based methods are collated, introduced, and evaluated. They are divided into distance-based and density-based methods, and for each method mentioned, the application scenarios, algorithmic idea, problems it can solve, and advantages and disadvantages are analyzed and summarized in detail. Existing problems and directions for future research are pointed out. This work is of significance for research on proximity-based outlier detection.

Key words: proximity, outlier detection, distance-based, density-based

摘要: 离群点检测在数据挖掘中有非常广泛的应用,然而并不是所有的离群点检测问题都能用一种最优的方法去解决。针对不同的应用,需要用不同的方法,才能够最有效地解决实际问题。检测方法大致可以分为基于统计、基于聚类、基于邻近性(基于距离和基于密度)的方法。为了及时掌握当前基于邻近性技术的离群点检测方法的研究现状,通过整理和归纳,将代表性强的基于邻近性的离群点检测方法进行了介绍和评价,将其主要分为基于距离的方法和基于密度的方法,对所有提及的方法的应用场景、算法思想、能解决的问题以及各自的优缺点进行了详细的分析和归纳,指出目前存在的问题和对未来研究的发展方向。对开展邻近性的离群点检测研究具有重要意义。

关键词: 邻近性, 离群点检测, 基于距离, 基于密度