Outlier detection based on genetic algorithm for clustering

Computer Engineering and Applications, 2008, Vol. 44, Issue (11): 155-157.

• Database, Signal and Information Processing •

QIAN Guang-chao, JIA Rui-yu, ZHANG Ran, LI Long-shu

School of Computer Science and Technology, Anhui University, Hefei 230039, China

Received: 2007-07-24  Revised: 2007-09-28  Online: 2008-04-11  Published: 2008-04-11
Contact: QIAN Guang-chao

Abstract

Abstract: Outlier detection is an important part of data mining and offers a new way to analyze massive, complex, and noisy data. This paper analyzes and evaluates the main classes of outlier mining methods and, on that basis, proposes an outlier detection algorithm based on genetic clustering. The algorithm combines the global search capability of the genetic algorithm with the fast local convergence of the K-means method, and achieves good results. Experiments show that it detects the outliers in a data set while also completing the clustering of that data set, which makes it practical to use.

Key words: outlier detection, data mining, genetic algorithm, clustering, K-means algorithm
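
The abstract does not state the chromosome encoding, the genetic operators, or the rule used to decide which points are outliers, so the following Python sketch only illustrates the general idea of pairing a genetic search with K-means refinement. Everything specific in it is an assumption made for illustration: the centroid-list encoding, truncation selection, one-point crossover, reset mutation, one K-means step per child, the median-distance outlier threshold, and the names `genetic_kmeans` and `detect_outliers`. It is not the authors' algorithm.

```python
import math
import random
import statistics


def nearest(point, centroids):
    """Return (index, distance) of the centroid closest to point."""
    dists = [math.dist(point, c) for c in centroids]
    idx = min(range(len(centroids)), key=dists.__getitem__)
    return idx, dists[idx]


def sse(points, centroids):
    """Within-cluster sum of squared distances; lower means a fitter individual."""
    return sum(nearest(p, centroids)[1] ** 2 for p in points)


def kmeans_step(points, centroids):
    """One K-means iteration: assign points, then recompute each centroid."""
    groups = [[] for _ in centroids]
    for p in points:
        groups[nearest(p, centroids)[0]].append(p)
    return [
        tuple(sum(xs) / len(g) for xs in zip(*g)) if g else c  # empty cluster: keep old centroid
        for g, c in zip(groups, centroids)
    ]


def genetic_kmeans(points, k, pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    """Evolve lists of k centroids; every child is refined by one K-means step."""
    rng = random.Random(seed)
    population = [rng.sample(points, k) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda c: sse(points, c))
        parents = population[: pop_size // 2]        # assumed: truncation selection with elitism
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, k) if k > 1 else 0
            child = a[:cut] + b[cut:]                 # assumed: one-point crossover
            if rng.random() < mutation_rate:          # assumed: reset one centroid to a data point
                child[rng.randrange(k)] = rng.choice(points)
            children.append(kmeans_step(points, child))
        population = parents + children
    return min(population, key=lambda c: sse(points, c))


def detect_outliers(points, centroids, factor=3.0):
    """Assumed rule: flag points farther than factor * median distance from their centroid."""
    dists = [nearest(p, centroids)[1] for p in points]
    threshold = factor * statistics.median(dists)
    return [p for p, d in zip(points, dists) if d > threshold]


# Two tight 3x3 grids of points plus one planted far-away point.
cluster_a = [(0.5 * i, 0.5 * j) for i in range(3) for j in range(3)]
cluster_b = [(10 + 0.5 * i, 10 + 0.5 * j) for i in range(3) for j in range(3)]
data = cluster_a + cluster_b + [(30.0, 30.0)]

centroids = genetic_kmeans(data, k=2)
outliers = detect_outliers(data, centroids)
print(centroids)
print(outliers)
```

In this sketch the genetic operators move between distant regions of the search space while the K-means step pulls each child toward a nearby local optimum, which mirrors the division of labor the abstract describes. The outlier threshold is the most arbitrary choice here; a distance-to-centroid rule can miss an outlier that captures a centroid of its own, so a real implementation would need the criterion from the paper itself.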

QIAN Guang-chao, JIA Rui-yu, ZHANG Ran, LI Long-shu. Outlier detection based on genetic algorithm for clustering[J]. Computer Engineering and Applications, 2008, 44(11): 155-157.

