Computer Engineering and Applications, 2014, Vol. 50, Issue 9: 116-122.

Study on feature selection method of modified maximal relevance minimal redundancy

YAO Minghai1,2, WANG Na3, QI Miao2, LI Yan4   

  1. College of Information Science and Technology, Bohai University, Jinzhou, Liaoning 121013, China
  2. School of Computer Science and Information Technology, Northeast Normal University, Changchun 130117, China
  3. Department of Computer, Jinzhou Teachers Training College, Jinzhou, Liaoning 121013, China
  4. Dalian Women and Employment Guidance Service Center, Dalian, Liaoning 116001, China
  • Online: 2014-05-01   Published: 2014-05-14

改进的最大相关最小冗余特征选择方法研究

姚明海1,2,王  娜3,齐  妙2,李  妍4   

  1. 渤海大学 信息科学与技术学院,辽宁 锦州 121013
  2. 东北师范大学 计算机科学与信息技术学院,长春 130117
  3. 锦州师范高等专科学校 计算机系,辽宁 锦州 121013
  4. 大连市妇女创就业指导服务中心,辽宁 大连 116001

Abstract: Feature selection is an important data preprocessing step that has attracted attention in many fields. Based on an analysis of existing feature selection methods, this paper addresses two shortcomings of the Minimal Redundancy Maximal Relevance (MRMR) method: it relies on a single measure for evaluating redundancy and relevance, and it does not allow the feature dimension to be set according to user requirements. A new, simple, and fast method is proposed for computing redundancy; when computing weights, different feature evaluation methods are chosen for different data; and a new objective evaluation function is introduced to perform feature selection. The effectiveness and feasibility of the algorithm are verified on five classic feature databases used in biometric identification (FERET, CASIA, ORL, CMU PIE, and Extended YaleB). The experimental results demonstrate the advantage of the modified MRMR (MMRMR) algorithm.

Key words: feature selection, Minimal Redundancy Maximal Relevance (MRMR), biometric identification, evaluation function, classic databases
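
For orientation, the baseline that the abstract sets out to improve can be sketched as follows. This is a minimal illustration of the commonly used greedy MRMR scheme (relevance minus mean redundancy, both measured by mutual information on discrete features), not the modified method proposed in the paper, whose redundancy measure, weighting, and objective function are not given in the abstract. The function names `mutual_information` and `mrmr_select`, the stopping count `k`, and the toy data are all assumptions made for this example.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * log(c * n / (count_x[x] * count_y[y]))
    return mi


def mrmr_select(features, labels, k):
    """Greedy baseline MRMR (illustrative, not the paper's modified variant).

    features: list of feature columns, each a list of discrete values.
    labels:   list of class labels, one per sample.
    k:        number of features to pick (hypothetical parameter of this sketch).
    Returns the indices of the selected features in selection order.
    """
    relevance = [mutual_information(col, labels) for col in features]
    selected = []
    remaining = set(range(len(features)))
    while remaining and len(selected) < k:

        def score(j):
            if not selected:
                return relevance[j]
            # Mean mutual information with the features already chosen
            redundancy = sum(
                mutual_information(features[j], features[s]) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy

        # Sorting makes tie-breaking deterministic (lowest index wins)
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (made up for illustration): 8 samples, 4 binary features
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = [
    [0, 0, 0, 1, 1, 1, 1, 1],  # f0: strongly related to the labels
    [0, 0, 0, 1, 1, 1, 1, 1],  # f1: exact copy of f0, fully redundant
    [0, 0, 1, 0, 0, 1, 1, 1],  # f2: moderately related to the labels
    [0, 1, 0, 1, 0, 1, 0, 1],  # f3: independent of the labels
]
print(mrmr_select(features, labels, 2))
```

In this toy run the duplicate feature f1 is passed over in the second step because its redundancy with f0 outweighs its relevance, which is the behaviour the relevance-minus-redundancy criterion is meant to produce.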