Computer Engineering and Applications ›› 2008, Vol. 44 ›› Issue (16): 145-147.

• Database, Signal and Information Processing •

常规挖掘算法在离群数据检测中的应用

金义富1,朱庆生2   

  1. 湛江师范学院 信息学院,广东 湛江 524048
  2. 重庆大学 计算机学院,重庆 400044
  • 收稿日期:2007-12-10 修回日期:2008-03-03 出版日期:2008-06-01 发布日期:2008-06-01
  • 通讯作者: 金义富

Application of regular data mining algorithms in outlier detection

JIN Yi-fu1,ZHU Qing-sheng2   

  1. School of Information, Zhanjiang Normal University, Zhanjiang, Guangdong 524048, China
  2. College of Computer, Chongqing University, Chongqing 400044, China
  • Received:2007-12-10 Revised:2008-03-03 Online:2008-06-01 Published:2008-06-01
  • Contact: JIN Yi-fu

摘要: 数据挖掘以发现常规模式为主体,但离群数据在欺诈分析及安全领域具有重要分析价值,离群数据检测已成为数据挖掘的重要内容。对聚类与分类以及关联规则分析中典型的常规数据挖掘算法如何处理离群数据进行全面分析与总结,讨论了BIRCH、CURE、Chameleon、DBSCAN以及基于共享最近邻的聚类算法以及基于不平衡分类和基于非频繁模式的离群检测技术,给出了一种利用K-最近邻算法的离群数据检测方法,并报告了测试结果。

关键词: 数据挖掘, 常规算法, 离群检测, 应用

Abstract: Data mining focuses mainly on discovering regular patterns, but outliers are of great analytical value in fraud analysis and security, and outlier detection has become an important part of data mining. This paper comprehensively analyzes and summarizes how typical regular data mining algorithms for clustering, classification and association rule analysis deal with outliers. It discusses the clustering algorithms BIRCH, CURE, Chameleon, DBSCAN and shared-nearest-neighbour clustering, as well as outlier detection techniques based on unbalanced classification and on infrequent patterns. An outlier detection method based on the K-nearest-neighbour algorithm is presented, and its test results are reported.

Key words: data mining, regular algorithm, outlier detection, application
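
The abstract mentions an outlier detection method built on the K-nearest-neighbour algorithm but does not describe it here. As a rough illustration of the general idea only, and not the authors' method, the sketch below uses one common formulation: score each point by the Euclidean distance to its k-th nearest neighbour and report the highest-scoring points as outliers. The function names, the choice of distance, and the sample data are all assumptions made for this example.

```python
import math


def knn_outlier_scores(points, k):
    """Score each point by the distance to its k-th nearest neighbour.

    Assumption: Euclidean distance and a brute-force O(n^2) search;
    a larger score suggests the point lies in a sparser region.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < len(points)")
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def top_outliers(points, k, n):
    """Return the indices of the n points with the largest k-NN scores."""
    scores = knn_outlier_scores(points, k)
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:n]


# Hypothetical data: four clustered points and one isolated point.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
print(top_outliers(data, k=2, n=1))  # prints [4]
```

In this sketch both k and the number of reported outliers n are supplied by the caller; the paper's own parameter choices and test data are not given on this page.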