Computer Engineering and Applications ›› 2011, Vol. 47 ›› Issue (12): 130-132.
• Database, Signal and Information Processing •
LIU Qinghe (刘庆和), LIANG Zhengyou (梁正友)
Abstract: Feature selection is an essential step in text categorization and can effectively improve classification precision and efficiency. Building on a study of feature selection methods for text categorization, this paper analyzes the shortcomings of the traditional information gain (IG) method, incorporates term frequency, concentration, and distribution into it, and proposes an optimized IG-based feature selection method. Experimental results show that the proposed method outperforms the traditional IG method in both classification effectiveness and performance.
Key words: feature selection, information gain, frequency, concentration, distribution
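For orientation, the traditional IG score that the paper sets out to improve can be sketched as below. This is a minimal illustration of the standard textbook definition, IG(t) = H(C) - P(t)H(C|t) - P(¬t)H(C|¬t), and not the authors' optimized method: the abstract does not state how frequency, concentration, and distribution enter the score, so those factors are left out. The function names and the toy corpus in the usage note are assumptions made for illustration only.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, term):
    """Standard document-level information gain of `term`.

    docs   : list of sets of terms, one set per document (hypothetical input format)
    labels : list of class labels, parallel to docs
    """
    classes = sorted(set(labels))
    n = len(docs)

    # Class counts over all documents, and over documents with / without the term.
    all_counts = Counter(labels)
    with_counts = Counter(l for d, l in zip(docs, labels) if term in d)
    without_counts = Counter(l for d, l in zip(docs, labels) if term not in d)

    n_with = sum(with_counts.values())
    n_without = n - n_with

    h_c = entropy([all_counts[c] for c in classes])
    h_with = entropy([with_counts[c] for c in classes])
    h_without = entropy([without_counts[c] for c in classes])

    # IG(t) = H(C) - P(t) H(C|t) - P(not t) H(C|not t)
    return h_c - (n_with / n) * h_with - (n_without / n) * h_without
```

On a toy corpus of four documents, two labeled "sport" and two labeled "finance", a term that appears in exactly the two "sport" documents gets the maximum score of 1 bit, while a term that appears in one document of each class gets a score of 0. Note that this score depends only on whether a term is present in a document, not on how often it occurs or how it is spread across documents.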
LIU Qinghe, LIANG Zhengyou. Optimized approach of feature selection based on information gain[J]. Computer Engineering and Applications, 2011, 47(12): 130-132.
Link to this article: http://cea.ceaj.org/CN/
http://cea.ceaj.org/CN/Y2011/V47/I12/130