Computer Engineering and Applications ›› 2019, Vol. 55 ›› Issue (3): 154-158. DOI: 10.3778/j.issn.1002-8331.1709-0453

• Pattern Recognition and Artificial Intelligence •

Improved LOF Algorithm Based on Data Field

MENG Haidong1,2, SUN Xinjun2, SONG Yuchen1

  1. School of Mining Research, Inner Mongolia University of Science and Technology, Baotou, Inner Mongolia 014010, China
  2. School of Information Engineering, Inner Mongolia University of Science and Technology, Baotou, Inner Mongolia 014010, China
  • Online:2019-02-01 Published:2019-01-24

Abstract: LOF (Local Outlier Factor) is a classical density-based algorithm for local outlier detection. To improve its accuracy and identify local outliers more precisely, an improved LOF outlier detection algorithm based on the data field is proposed. First, data field theory is applied to the attribute values of each dimension of the data set to compute potential values. The concept of average potential difference is then introduced: for any two points whose potential difference in a given dimension exceeds the average potential difference, a weight is added when the distance between them is computed, which improves the accuracy of outlier detection. Experimental results show that the algorithm is feasible and achieves higher accuracy.

Key words: data mining, local reachability density, data field, average potential difference, local outlier factor
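The pipeline outlined in the abstract (per-dimension potential values, an average potential difference, a weighted distance, then LOF) might be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation, because the abstract gives no formulas. The Gaussian-like potential with unit mass is the form commonly used in the data field literature. The definition of the average potential difference as the mean over all distinct pairs, the multiplicative weight on the squared per-dimension difference, and the parameters `sigma` and `weight` are all hypothetical choices. The LOF part is a simplified version that uses exactly k neighbours and ignores distance ties.

```python
import numpy as np


def potentials_1d(values, sigma):
    """Potential of each value within one dimension.

    Assumed form: sum_j exp(-((x_i - x_j) / sigma) ** 2), unit mass per point.
    """
    diff = values[:, None] - values[None, :]
    return np.exp(-(diff / sigma) ** 2).sum(axis=1)


def weighted_distances(X, sigma=1.0, weight=2.0):
    """Pairwise distances with a per-dimension weight (assumed scheme).

    In each dimension, a pair whose potential difference exceeds the
    average potential difference has its squared difference multiplied
    by `weight`; all other pairs keep weight 1.
    """
    n, d = X.shape
    sq = np.zeros((n, n))
    for j in range(d):
        col = X[:, j]
        phi = potentials_1d(col, sigma)
        pot_diff = np.abs(phi[:, None] - phi[None, :])
        # Assumed definition: mean potential difference over distinct pairs.
        avg = pot_diff[np.triu_indices(n, k=1)].mean()
        w = np.where(pot_diff > avg, weight, 1.0)
        sq += w * (col[:, None] - col[None, :]) ** 2
    return np.sqrt(sq)


def lof_scores(D, k):
    """Simplified LOF from a distance matrix, using exactly k neighbours.

    Assumes no duplicate points (a zero mean reachability distance
    would cause a division by zero).
    """
    n = D.shape[0]
    D = D.copy()
    np.fill_diagonal(D, np.inf)            # a point is not its own neighbour
    knn = np.argsort(D, axis=1)[:, :k]     # indices of the k nearest neighbours
    k_dist = D[np.arange(n), knn[:, -1]]   # k-distance of every point
    # reach-dist_k(p, o) = max(k-distance(o), d(p, o))
    reach = np.maximum(k_dist[knn], D[np.arange(n)[:, None], knn])
    lrd = 1.0 / reach.mean(axis=1)         # local reachability density
    return lrd[knn].mean(axis=1) / lrd     # local outlier factor


if __name__ == "__main__":
    # A 5 x 5 grid of inliers plus one distant point.
    grid = np.array([[i, j] for i in range(5) for j in range(5)], dtype=float)
    X = np.vstack([grid, [[10.0, 10.0]]])
    scores = lof_scores(weighted_distances(X), k=3)
    print(int(np.argmax(scores)), scores.max())
```

Under this weighting, pairs that lie in regions of very different potential are pushed further apart, which lowers the local reachability density of isolated points and raises their LOF. How the weight is actually chosen in the paper cannot be determined from the abstract.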