Computer Engineering and Applications ›› 2012, Vol. 48 ›› Issue (34): 116-119.

• Database, Signal and Information Processing •

Feature words classification contribution applied in spam filtering

ZHAI Junchang1, QIN Yuping1, CHE Weiwei2

  1. Bohai University, Jinzhou, Liaoning 121000, China
  2. Shenyang University, Shenyang 110044, China
  • Online:2012-12-01 Published:2012-11-30

Abstract: In spam filtering, a feature word contributes differently to the classification of legitimate e-mail and of spam. This paper defines a classification contribution ratio coefficient to capture that difference and applies it to both feature selection and the design of a Naïve Bayes filter. Experiments on an English corpus show that the resulting filtering method effectively improves the filter's ability to recognize legitimate e-mail and spam and lowers its misclassification rate for both.

Key words: feature word, information gain, spam, Naïve Bayes
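
The abstract does not define the classification contribution ratio coefficient, so it is not reproduced here. As background for the "information gain" keyword only, the following minimal Python sketch computes the standard information gain of a binary word-presence feature for a two-class (spam vs. legitimate) problem. The function and parameter names are illustrative assumptions, and this is the textbook criterion, not the weighted variant the paper proposes.

```python
import math


def entropy(probs):
    """Shannon entropy in bits, skipping zero-probability terms."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def information_gain(n_spam_with, n_ham_with, n_spam, n_ham):
    """Standard information gain of a word-presence feature (hypothetical helper).

    n_spam_with / n_ham_with: spam / legitimate messages containing the word.
    n_spam / n_ham: total spam / legitimate messages in the corpus.
    """
    total = n_spam + n_ham
    n_with = n_spam_with + n_ham_with
    n_without = total - n_with

    # Entropy of the class label before observing the word.
    h_class = entropy([n_spam / total, n_ham / total])

    # Expected class entropy after splitting on word present / word absent.
    h_cond = 0.0
    for n_s, n_h, n in (
        (n_spam_with, n_ham_with, n_with),
        (n_spam - n_spam_with, n_ham - n_ham_with, n_without),
    ):
        if n > 0:
            h_cond += (n / total) * entropy([n_s / n, n_h / n])

    return h_class - h_cond
```

On a balanced corpus, a word that occurs in every spam message and in no legitimate message scores 1 bit, while a word that is equally frequent in both classes scores 0. Standard information gain is symmetric in the two classes, which is the kind of limitation that a per-class contribution weighting, such as the one the abstract describes, would be intended to address.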