Computer Engineering and Applications ›› 2009, Vol. 45 ›› Issue (14): 145-148. DOI: 10.3778/j.issn.1002-8331.2009.14.044

• Database, Signal and Information Processing •

Improved Naïve Bayesian spam filtering algorithm

ZHAI Jun-chang1,2, QIN Yu-ping2, WANG Chun-li3

  1. Dept. of Public Computer Teaching & Research, Bohai University, Jinzhou, Liaoning 121000, China
  2. College of Information Science and Technology, Bohai University, Jinzhou, Liaoning 121000, China
  3. School of Computer Science and Technology, Dalian Maritime University, Dalian, Liaoning 116023, China
  • Received: 2008-11-03 Revised: 2009-01-15 Online: 2009-05-11 Published: 2009-05-11
  • Contact: ZHAI Jun-chang

Abstract: This paper describes the Naïve Bayesian spam filtering algorithm. To compute the conditional probabilities in the Naïve Bayes algorithm, it adopts the multi-variate Bernoulli event model, improves on that model, and evaluates the result experimentally on the Ling-Spam corpus. The results show that the improved algorithm effectively raises the recall and precision of the filter and lowers its error rate.
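For orientation, the following is a minimal sketch of a standard multi-variate Bernoulli Naïve Bayes spam filter, together with the recall, precision, and error-rate measures named in the abstract. It is not the improved algorithm of the paper, whose details the abstract does not give. The function names, the add-one smoothing, the definition of error rate as the fraction of misclassified messages, and the toy messages are all assumptions made here for illustration.

```python
import math
from collections import defaultdict


def train_bernoulli_nb(docs, labels):
    """Fit a multi-variate Bernoulli model (hypothetical helper).

    docs   -- list of token sets, one set per message
    labels -- parallel list of class names, e.g. "spam" / "ham"
    """
    vocab = set().union(*docs)
    n_docs = defaultdict(int)                       # messages per class
    doc_freq = defaultdict(lambda: defaultdict(int))  # messages per class containing a token
    for tokens, label in zip(docs, labels):
        n_docs[label] += 1
        for t in tokens:
            doc_freq[label][t] += 1

    total = len(docs)
    priors = {c: n_docs[c] / total for c in n_docs}
    # P(token present | class) with add-one smoothing (an assumption, not the paper's choice).
    cond = {
        c: {t: (doc_freq[c][t] + 1) / (n_docs[c] + 2) for t in vocab}
        for c in n_docs
    }
    return vocab, priors, cond


def classify(tokens, model):
    """Return the class with the highest log-posterior score."""
    vocab, priors, cond = model
    scores = {}
    for c in priors:
        score = math.log(priors[c])
        # Bernoulli model: every vocabulary token contributes,
        # whether it is present in the message or absent from it.
        for t in vocab:
            p = cond[c][t]
            score += math.log(p if t in tokens else 1.0 - p)
        scores[c] = score
    return max(scores, key=scores.get)


def spam_metrics(true_labels, predicted, positive="spam"):
    """Recall and precision for the positive class, plus overall error rate."""
    pairs = list(zip(true_labels, predicted))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    wrong = sum(1 for t, p in pairs if t != p)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    error_rate = wrong / len(pairs) if pairs else 0.0
    return recall, precision, error_rate


# Toy training data (illustrative only, not taken from Ling-Spam).
train_docs = [
    {"free", "money", "offer"},
    {"free", "prize"},
    {"meeting", "agenda"},
    {"project", "meeting", "notes"},
]
train_labels = ["spam", "spam", "ham", "ham"]
model = train_bernoulli_nb(train_docs, train_labels)
```

With this toy model, `classify({"free", "offer"}, model)` selects the spam class and `classify({"meeting", "notes"}, model)` selects the ham class. Working in log space avoids numerical underflow when the vocabulary is large.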
