Computer Engineering and Applications ›› 2007, Vol. 43 ›› Issue (13): 186-190.

• Database and Information Processing •

Spam Filtering Model Research Based on Advanced Naïve Bayes

Tao Wang, Guo-yong Qiu, Ju-hou He

  • Received: 2007-01-01 Online: 2007-05-01 Published: 2007-05-01
  • Contact: Tao Wang

  1. Shaanxi Normal University

Abstract: This paper analyzes the Naïve Bayes Filtering (NBF) model, which is widely used in spam filtering, and points out the shortcomings of the Expected Cross Entropy (ECE) method used for feature selection. It then proposes the Advanced Naïve Bayes Filter (A-NBF), which selects spam feature words with an improved expected cross entropy (AECE) measure and weights those feature words during classification to improve filtering precision. Experimental results show that A-NBF filters spam with clearly higher precision than NBF.

Key words: Feature Selection, Spam Filtering, Naïve Bayes, Expected Cross Entropy
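
For readers unfamiliar with expected cross entropy, the following is a minimal sketch of the textbook ECE score for a term t, ECE(t) = P(t) · Σ_c P(c|t) · log(P(c|t) / P(c)), estimated from document counts. It is an illustration under stated assumptions, not the paper's method: the improved AECE measure and the feature weighting used by A-NBF are not specified in this abstract and are not reproduced here. The function name, the presence-based counting, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter

def expected_cross_entropy(docs, labels):
    """Score each term t by P(t) * sum_c P(c|t) * log(P(c|t) / P(c)).

    Textbook ECE only; this is NOT the paper's AECE variant.
    Probabilities are estimated from document-level term presence.
    """
    n_docs = len(docs)
    class_counts = Counter(labels)      # number of documents per class
    term_counts = Counter()             # number of documents containing t
    term_class_counts = Counter()       # documents of class c containing t
    for tokens, label in zip(docs, labels):
        for term in set(tokens):        # count presence, not frequency
            term_counts[term] += 1
            term_class_counts[term, label] += 1

    scores = {}
    for term, n_t in term_counts.items():
        p_t = n_t / n_docs
        total = 0.0
        for c, n_c in class_counts.items():
            p_c_given_t = term_class_counts[term, c] / n_t
            if p_c_given_t > 0:         # treat 0 * log(0) as 0
                total += p_c_given_t * math.log(p_c_given_t / (n_c / n_docs))
        scores[term] = p_t * total
    return scores

# Hypothetical toy corpus, invented for illustration only.
docs = [["free", "offer", "now"], ["free", "prize"],
        ["meeting", "now"], ["project", "meeting"]]
labels = ["spam", "spam", "ham", "ham"]
scores = expected_cross_entropy(docs, labels)
```

In this toy corpus a term that appears only in one class (such as "free") receives a higher score than a term spread evenly across both classes (such as "now"), which is the property that makes the score usable for ranking candidate feature words.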