Computer Engineering and Applications (计算机工程与应用) ›› 2009, Vol. 45 ›› Issue (15): 54-57. DOI: 10.3778/j.issn.1002-8331.2009.15.016

• Research and Discussion •

Application of Information Entropy in Feature Selection

李杨寰,高 峰,李 腾,周智敏   

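  1. School of Electronic Science and Engineering, National University of Defense Technology, Changsha 410073, China
  • Received: 2008-03-31 Revised: 2008-06-30 Published: 2009-05-21 Online: 2009-05-21
  • Corresponding author: LI Yang-huan (李杨寰)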

Novel method for feature selection based on entropy

LI Yang-huan, GAO Feng, LI Teng, ZHOU Zhi-min

  1. School of Electronic Science and Engineering, NUDT, Changsha 410073, China
  • Received: 2008-03-31 Revised: 2008-06-30 Online: 2009-05-21 Published: 2009-05-21
  • Contact: LI Yang-huan

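Abstract: The concept of entropy from information theory is applied to feature selection. Two information measures for evaluating features, error entropy and overlap entropy, are defined, and the different physical meanings of the two definitions are explained. Interval partitioning, the most critical issue in computing the entropy, is analyzed, and a better partitioning method is proposed. Because entropy alone cannot eliminate similar features, it is combined with a similarity coefficient to form a complete entropy-based feature selection procedure, which is verified by simulation experiments.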

Keywords: feature selection, overlap entropy, error entropy, similarity, feature evaluation, information measure

Abstract: This paper applies the concept of entropy from information theory to feature selection and defines two information measures, overlap entropy and error entropy, for evaluating features. The physical meanings of the two definitions are then explained. The crucial problem of how to partition the intervals when computing the entropy is analyzed. Because entropy cannot filter out similar features, a similarity measure is added to form a complete feature selection process. Finally, a simulation is carried out to verify the validity of the method.

Key words: feature selection, overlap entropy, error entropy, similarity, feature evaluation, information measurement
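
The pipeline outlined in the abstract (score each feature with an entropy computed over intervals, then discard features that are too similar to ones already kept) can be illustrated with a minimal sketch. The abstract does not give the definitions of error entropy or overlap entropy, nor the proposed partitioning method, so this sketch substitutes generic stand-ins that are assumptions, not the paper's method: the conditional entropy of the class label given an equal-width interval as the score, and the absolute Pearson correlation as the similarity coefficient. The function names and the parameters `n_bins` and `max_similarity` are hypothetical.

```python
import numpy as np

def binned_class_entropy(x, y, n_bins=10):
    """Conditional entropy H(class | interval) of one feature, in bits.

    Assumption: equal-width intervals; the paper proposes its own partition.
    """
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    # Use interior edges only, so interval indices fall in 0..n_bins-1.
    idx = np.digitize(x, edges[1:-1])
    classes = np.unique(y)
    total = 0.0
    for b in range(n_bins):
        in_bin = idx == b
        n = in_bin.sum()
        if n == 0:
            continue  # empty intervals contribute nothing
        p = np.array([(y[in_bin] == c).sum() for c in classes]) / n
        p = p[p > 0]
        # Weight each interval's class entropy by its share of the samples.
        total += (n / len(x)) * -(p * np.log2(p)).sum()
    return total

def select_features(X, y, n_bins=10, max_similarity=0.9):
    """Rank features by entropy (lower is better), skipping near-duplicates.

    Assumption: similarity is |Pearson correlation|, a stand-in for the
    similarity coefficient mentioned in the abstract.
    """
    scores = np.array([binned_class_entropy(X[:, j], y, n_bins)
                       for j in range(X.shape[1])])
    corr = np.abs(np.corrcoef(X, rowvar=False))
    selected = []
    for j in np.argsort(scores, kind="stable"):
        if all(corr[j, k] < max_similarity for k in selected):
            selected.append(int(j))
    return selected, scores
```

With a low score meaning little class mixing inside each interval, a feature that separates the classes well is ranked first, and a rescaled copy of it is then rejected by the similarity check rather than by its score, which is the gap in pure entropy ranking that the abstract points out.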