Computer Engineering and Applications ›› 2009, Vol. 45 ›› Issue (15): 169-172. DOI: 10.3778/j.issn.1002-8331.2009.15.049
• Database, Signal and Information Processing •
SU Yi-juan
苏毅娟
Abstract: Imputing missing values is one of the challenges in data mining and machine learning. Missing values in a dataset can reduce the efficiency of a learning algorithm and degrade the quality of its results. Existing imputation methods cannot fully satisfy users' growing requirements. This paper proposes a novel nonparametric algorithm based on gray system theory. The algorithm imputes missing values iteratively until it converges or the output meets the user's requirements. Experiments on UCI data show that the method outperforms existing algorithms such as KNN imputation and mean substitution in terms of imputation efficiency.
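Abstract (Chinese version): Missing-value imputation is a highly challenging task in machine learning and data mining. Missing values in a data source considerably harm both the performance of learning algorithms and the quality of what is learned. Existing imputation methods do not yet meet users' needs. This paper proposes a missing-value imputation method based on gray system theory. The method combines instance-based nonparametric fitting with gray theory techniques and imputes the missing data repeatedly until the imputed results converge or satisfy the user's needs. Experimental results show that the method outperforms both the existing KNN imputation method and ordinary mean substitution in imputation quality and efficiency.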
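The abstract does not spell out the algorithm, so the following is only a minimal sketch of how an iterative imputation loop driven by gray relational analysis might look. It assumes Deng's gray relational grade with distinguishing coefficient `rho = 0.5`, min-max scaling of each attribute, column-mean initialization, and a grade-weighted average over the `k` most related instances. The function names, parameters, and stopping tolerance are hypothetical and are not taken from the paper.

```python
import numpy as np


def gray_relational_grades(reference, candidates, rho=0.5):
    """Deng's gray relational grade of each candidate row w.r.t. the reference row."""
    delta = np.abs(candidates - reference)      # shape: (n_candidates, n_features)
    d_min, d_max = delta.min(), delta.max()
    if d_max == 0.0:                            # every candidate equals the reference
        return np.ones(len(candidates))
    coeff = (d_min + rho * d_max) / (delta + rho * d_max)
    return coeff.mean(axis=1)                   # average coefficient over attributes


def gra_iterative_impute(data, k=3, rho=0.5, tol=1e-4, max_iter=20):
    """Iteratively fill NaN entries (assumed scheme, not the paper's exact method).

    Assumes every column contains at least one observed value.
    """
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)

    # Min-max scale each attribute using observed values only.
    lo = np.nanmin(data, axis=0)
    span = np.nanmax(data, axis=0) - lo
    span[span == 0.0] = 1.0
    scaled = (data - lo) / span

    # Initial guess: column means of the observed values.
    filled = np.where(missing, np.nanmean(scaled, axis=0), scaled)
    incomplete_rows = np.flatnonzero(missing.any(axis=1))

    for _ in range(max_iter):
        previous = filled.copy()
        for i in incomplete_rows:
            others = np.delete(np.arange(len(filled)), i)
            # Grades use the current estimates, so they are refined on each pass.
            grades = gray_relational_grades(previous[i], previous[others], rho)
            top = np.argsort(grades)[-k:]       # the k most related instances
            weights = grades[top] / grades[top].sum()
            estimate = weights @ previous[others[top]]
            filled[i, missing[i]] = estimate[missing[i]]
        # Stop once the imputed values no longer change noticeably.
        if np.max(np.abs(filled - previous)) < tol:
            break

    return filled * span + lo                   # back to the original units
```

A call such as `gra_iterative_impute(X, k=5)` on a NumPy array `X` with `np.nan` marking the gaps returns an array of the same shape with the gaps filled. The user-facing stopping rule mentioned in the abstract is represented here only by `tol` and `max_iter`.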
SU Yi-juan. Multiple imputation method for missing values by gray relation analysis[J]. Computer Engineering and Applications, 2009, 45(15): 169-172.
苏毅娟. 基于灰色关联分析的缺失值重复填补方法[J]. 计算机工程与应用, 2009, 45(15): 169-172.
URL: http://cea.ceaj.org/EN/10.3778/j.issn.1002-8331.2009.15.049
http://cea.ceaj.org/EN/Y2009/V45/I15/169