Computer Engineering and Applications ›› 2009, Vol. 45 ›› Issue (17): 125-128. DOI: 10.3778/j.issn.1002-8331.2009.17.038

• Database and Information Processing •

Research on outlier data mining methods using statistical dispersion

SHI Dong-hui   

  1. School of Electronic & Information Engineering, Anhui University of Architecture, Hefei 230088, China
  • Received: 2009-02-11 Revised: 2009-03-13 Online: 2009-06-11 Published: 2009-06-11
  • Contact: SHI Dong-hui

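Research on outlier data mining methods using statistical measures of variation

SHI Dong-hui (史东辉)

  1. School of Electronic and Information Engineering, Anhui University of Architecture (安徽建筑工业学院), Hefei 230088
  • Corresponding author: SHI Dong-hui (史东辉)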

Abstract: This paper describes statistical dispersion, the degree to which numeric data tend to spread, also called the variation of the data. Measures of dispersion give a better understanding of the overall features of the data and therefore of its distribution, which helps in finding outliers. An outlier mining method based on the deviation of data is improved using these measures. The improved algorithm is suitable for multi-dimensional numerical data.

Key words: statistical dispersion, outlier mining, deviation of data

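Abstract (translated from Chinese): This paper describes the dispersion of statistical data, that is, its measures of variation. Measures of variation give a better understanding of the overall features of the data and, in turn, of its distribution, and they are of some use in finding outliers in the data. The author uses measures of variation to improve a deviation-based method for discovering outliers. The improved algorithm is suitable for multi-dimensional numerical data.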

Key words (translated from Chinese): statistical variation, outlier data, deviation data
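
The abstract does not give the improved algorithm itself, so the following is only a minimal sketch of the general idea it describes: using a measure of dispersion (here the population standard deviation) to flag outliers in multi-dimensional numerical data. The function name `dispersion_outliers`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def dispersion_outliers(rows, threshold=2.0):
    """Return indices of rows that deviate strongly in at least one dimension.

    Assumption (not from the paper): a row is flagged when any of its values
    lies more than `threshold` population standard deviations from the mean
    of that dimension.
    """
    columns = list(zip(*rows))                   # one tuple per dimension
    means = [mean(col) for col in columns]       # central tendency per dimension
    stdevs = [pstdev(col) for col in columns]    # dispersion per dimension

    flagged = []
    for index, row in enumerate(rows):
        for value, mu, sigma in zip(row, means, stdevs):
            # A constant dimension (sigma == 0) carries no dispersion signal.
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                flagged.append(index)
                break
    return flagged


# Hypothetical two-dimensional sample; the last row is far from the rest
# in the first dimension.
data = [
    (1.0, 10.0),
    (1.2, 10.5),
    (0.9, 9.8),
    (1.1, 10.2),
    (1.0, 10.1),
    (8.0, 10.0),
]
print(dispersion_outliers(data))  # prints [5]
```

The mean and standard deviation are themselves pulled toward extreme values, so with small samples a single outlier can only reach a limited score. A more robust measure of dispersion, such as the interquartile range, could be substituted in the same structure.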