Computer Engineering and Applications ›› 2019, Vol. 55 ›› Issue (17): 95-99. DOI: 10.3778/j.issn.1002-8331.1806-0029

• Big Data and Cloud Computing •

Dimension Incremental Feature Selection Algorithm for Missing Data

LIU Jichao, WANG Feng, SONG Peng

  1. College of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
  2. School of Economics and Management, Shanxi University, Taiyuan 030006, China
  • Online:2019-09-01 Published:2019-08-30

Abstract: Most real-world data change dynamically, and as the data grow, many features come to contain missing values. Feature selection for dynamically changing data sets with missing data has therefore become an urgent problem. Based on rough set theory, this paper analyzes how the complementary information entropy is updated when the dimension of a data set with missing data increases, and on that basis proposes a dimension incremental feature selection algorithm for missing data. Experiments further verify the feasibility and efficiency of the algorithm.

Key words: missing data, rough sets, complementary information entropy, feature selection
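
To make the central quantity concrete, the following is a minimal sketch of complementary information entropy on a data set with missing values. It assumes the form commonly given in the rough set literature for incomplete information systems, E(P) = Σ_i (1/|U|)(1 − |S_P(u_i)|/|U|), where S_P(u_i) is the tolerance class of object u_i and a missing value matches anything. The names `tolerant`, `complementary_entropy`, and the toy `data` are hypothetical, and the sketch recomputes the entropy from scratch; it is not the incremental update mechanism proposed in the paper.

```python
from typing import List, Optional, Sequence

# One object (row); None marks a missing value. This encoding is an assumption.
Row = Sequence[Optional[str]]


def tolerant(x: Row, y: Row, attrs: Sequence[int]) -> bool:
    """Return True if x and y agree on every attribute in attrs,
    treating a missing value (None) as compatible with any value."""
    return all(x[a] is None or y[a] is None or x[a] == y[a] for a in attrs)


def complementary_entropy(data: List[Row], attrs: Sequence[int]) -> float:
    """Complementary entropy of the attribute subset attrs, computed
    directly from the size of each object's tolerance class."""
    n = len(data)
    total = 0.0
    for x in data:
        # Size of the tolerance class of x under attrs (always includes x).
        class_size = sum(tolerant(x, y, attrs) for y in data)
        total += (1.0 - class_size / n) / n
    return total


# Toy data set with one missing value in the second feature (illustrative only).
data: List[Row] = [
    ("a", "x"),
    ("a", None),
    ("b", "x"),
    ("b", "y"),
]

entropy_one_feature = complementary_entropy(data, [0])      # 0.5
entropy_two_features = complementary_entropy(data, [0, 1])  # 0.625
```

In this toy example, adding the second feature raises the entropy from 0.5 to 0.625 because it separates the last two objects. A dimension incremental method would aim to obtain the new value from the old one when features arrive, rather than rescanning all object pairs as this sketch does.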