Computer Engineering and Applications ›› 2009, Vol. 45 ›› Issue (17): 125-128. DOI: 10.3778/j.issn.1002-8331.2009.17.038

• Database and Information Processing •

Research on outlier data mining methods using statistical dispersion

SHI Dong-hui

  1. School of Electronic & Information Engineering, Anhui University of Architecture, Hefei 230088, China
  • Received: 2009-02-11 Revised: 2009-03-13 Online: 2009-06-11 Published: 2009-06-11
  • Contact: SHI Dong-hui

Abstract: This paper describes the dispersion of statistical data, that is, the degree to which numeric data tend to spread, as captured by measures of variation. These measures give a better understanding of the overall features of the data and hence of its distribution, and they are of some help in finding outliers in the data. The author uses measures of variation to improve a deviation-based outlier detection method. The improved algorithm is suitable for multi-dimensional numerical data.

Key words: statistical dispersion, outlier mining, deviation of data
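
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea, not the author's method. It assumes one possible dispersion measure (the sum of per-dimension population variances) and treats a record as a deviation candidate when leaving it out reduces that dispersion the most. The function names `total_dispersion`, `dispersion_drop`, and `flag_outliers`, the `top_k` parameter, and the sample data are all hypothetical.

```python
from statistics import pvariance


def total_dispersion(rows):
    """Sum of per-dimension population variances (an assumed dispersion measure)."""
    if len(rows) < 2:
        return 0.0
    # zip(*rows) turns a list of records into one tuple per dimension.
    return sum(pvariance(column) for column in zip(*rows))


def dispersion_drop(rows):
    """For each record, how much the total dispersion falls when it is left out."""
    base = total_dispersion(rows)
    return [
        base - total_dispersion(rows[:i] + rows[i + 1:])
        for i in range(len(rows))
    ]


def flag_outliers(rows, top_k=1):
    """Indices of the top_k records whose removal reduces dispersion the most."""
    drops = dispersion_drop(rows)
    ranked = sorted(range(len(rows)), key=lambda i: drops[i], reverse=True)
    return sorted(ranked[:top_k])


# Hypothetical two-dimensional data: four tightly clustered records and one far away.
data = [(1.0, 2.0), (1.1, 1.9), (0.9, 2.1), (1.0, 2.0), (8.0, 9.0)]
print(flag_outliers(data, top_k=1))
```

Because raw variances depend on each dimension's units, a real application would likely standardize the dimensions first or use a scale-free measure such as the coefficient of variation. That choice is left open here.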