Improved probability-based Bayesian anti-spam mechanism

Abstract

This paper examines the limitations of the traditional Bayesian spam filter based on a probability threshold: because the suitability of the chosen threshold is seldom considered, some recall is lost. To address this, the paper improves the Bayesian decision and proposes a lower-error classification decision method based on a random variable. Considering the particular requirements of email classification, it further proposes a lower-risk classification decision method based on a random variable. Experimental results show that the former gives better decisions for ordinary text classification, while the latter performs better on email, improving the recall and F value of the Bayesian email filter while keeping the risk of misclassification low.

Key words: spam email, email filtering, probability, threshold, classification decision
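The abstract does not give the paper's decision rules, so the following is only a minimal sketch of the textbook ideas it builds on: a fixed probability threshold, the minimum-error Bayes decision, and the minimum-risk Bayes decision. All function names, the threshold 0.9, and the 4:1 cost ratio are assumptions chosen for illustration; none of them come from the paper, and this is not the author's random-variable-based method.

```python
def posterior_spam(likelihood_spam, likelihood_ham, prior_spam=0.5):
    """Bayes' rule for two classes: P(spam | message)."""
    evidence = likelihood_spam * prior_spam + likelihood_ham * (1.0 - prior_spam)
    return likelihood_spam * prior_spam / evidence


def decide_fixed_threshold(p_spam, threshold=0.9):
    """Conventional filter: flag as spam only above a hand-picked threshold."""
    return "spam" if p_spam >= threshold else "ham"


def decide_min_error(p_spam):
    """Minimum-error Bayes decision: pick the class with the larger posterior."""
    return "spam" if p_spam > 0.5 else "ham"


def decide_min_risk(p_spam, cost_false_positive=4.0, cost_false_negative=1.0):
    """Minimum-risk Bayes decision: pick the label with the smaller expected loss.

    The costs are hypothetical; blocking a legitimate email (false positive)
    is assumed to be more costly than letting one spam message through.
    """
    risk_if_spam = cost_false_positive * (1.0 - p_spam)  # loss if it was ham
    risk_if_ham = cost_false_negative * p_spam           # loss if it was spam
    return "spam" if risk_if_spam < risk_if_ham else "ham"


if __name__ == "__main__":
    for p in (0.70, 0.85, 0.95):
        print(p, decide_fixed_threshold(p), decide_min_error(p), decide_min_risk(p))
```

With these assumed settings, a message with posterior 0.85 is missed by the 0.9 threshold but caught by both Bayes decisions, which illustrates how an unsuitable threshold can cost recall. A message with posterior 0.70 is flagged by the minimum-error rule but not by the minimum-risk rule, which illustrates why a cost-aware rule may suit email better than plain text classification.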

XUE Zhengyuan. Improved probability-based Bayesian anti-spam mechanism[J]. Computer Engineering and Applications, 2013, 49(7): 98-101.

薛正元. 基于改进贝叶斯决策的邮件过滤[J]. 计算机工程与应用, 2013, 49(7): 98-101.

