Computer Engineering and Applications ›› 2013, Vol. 49 ›› Issue (7): 98-101.
• Network, Communication and Security •
薛正元
XUE Zhengyuan
Abstract: This paper examines the limitations of Bayesian email filtering models based on a probability threshold: because the suitability and practicality of the chosen threshold are rarely considered, some recall is lost. To address this, the Bayesian decision step is improved and a lower-error classification decision method based on random variables is proposed; considering the particular nature of email processing, a lower-risk classification decision method based on random variables is further proposed. Experimental results show that the former gives better classification decisions on ordinary text classification problems, while the latter performs better on email, raising the recall and F value of the Bayesian email filter while keeping the risk of misjudgment low.
Key words: spam email, email filtering, probability, threshold, classification decision
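The abstract contrasts three kinds of decision rule applied to a Bayesian posterior. As a rough illustration only, the sketch below shows the textbook versions of these rules: a fixed probability threshold, the minimum-error rule, and the minimum-risk rule with asymmetric costs. The abstract does not specify the paper's random-variable-based variants, so they are not reproduced here; the function names, the threshold of 0.9, and the cost values are all assumptions chosen for the example.

```python
def threshold_decision(p_spam, threshold=0.9):
    """Fixed-threshold rule: flag as spam only if P(spam | x) exceeds
    a preset threshold (0.9 is an assumed, illustrative value)."""
    return "spam" if p_spam > threshold else "ham"


def min_error_decision(p_spam):
    """Textbook minimum-error rule: pick the class with the larger posterior."""
    return "spam" if p_spam > 0.5 else "ham"


def min_risk_decision(p_spam, cost_false_positive=3.0, cost_false_negative=1.0):
    """Textbook minimum-risk rule: pick the label with the smaller expected loss.

    cost_false_positive: assumed loss for labelling a legitimate mail as spam.
    cost_false_negative: assumed loss for letting a spam mail through.
    """
    risk_if_labelled_spam = cost_false_positive * (1.0 - p_spam)
    risk_if_labelled_ham = cost_false_negative * p_spam
    return "spam" if risk_if_labelled_spam < risk_if_labelled_ham else "ham"


if __name__ == "__main__":
    # Compare the three rules on a few hypothetical posterior values.
    for p in (0.7, 0.8, 0.95):
        print(p, threshold_decision(p), min_error_decision(p), min_risk_decision(p))
```

With these assumed costs, the minimum-risk rule behaves like a threshold of 0.75, which sits between the minimum-error rule (0.5) and the stricter fixed threshold (0.9). This shows in miniature how the choice of decision rule trades recall against the risk of misjudging legitimate mail.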
XUE Zhengyuan. Improved probability-based Bayesian anti-spam mechanism[J]. Computer Engineering and Applications, 2013, 49(7): 98-101.
Link to this article: http://cea.ceaj.org/CN/
http://cea.ceaj.org/CN/Y2013/V49/I7/98