Computer Engineering and Applications, 2011, Vol. 47, Issue (15): 134-137.

• Database, Signal and Information Processing •

Improvement and application of Naive Bayesian classification

ZHANG Yaping, CHEN Debao, HOU Junqin, YANG Yijun

  1. School of Physics and Electronic Information, Huaibei Normal University, Huaibei, Anhui 235000, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2011-05-21 Published: 2011-05-21

Abstract: To address the problem of filling in missing data in Naive Bayesian classification, a Naive Bayesian classification algorithm based on an improved Expectation Maximization (EM) algorithm is proposed. The method first estimates the missing data with the Grey Related Coefficient (GRC) and takes these estimates as the initial values of the EM algorithm. The missing data are then filled in by iterating the E and M steps, after which the samples are classified with the Naive Bayesian classification algorithm. Experimental results indicate that the improved algorithm achieves higher classification accuracy than other Naive Bayesian classification algorithms. The improved algorithm is also applied to the evaluation of professional titles of university teachers.

Key words: Naive Bayesian classification, Expectation Maximization (EM) algorithm, missing data, forecasting model
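
The three stages named in the abstract (grey-relational initial estimate, EM refinement, Naive Bayesian classification) can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything below is an assumption: the initial fill uses Deng's grey relational degree with distinguishing coefficient `rho = 0.5`, EM assumes a single multivariate Gaussian over the attributes, and the classifier is a Gaussian naive Bayes. The function names and the synthetic data are hypothetical, each incomplete row is assumed to have at least one observed attribute, and at least one complete row must exist.

```python
import numpy as np

def grey_initial_fill(X, rho=0.5):
    """Fill each NaN from the complete row with the highest grey relational degree."""
    X = np.array(X, dtype=float)               # np.array copies by default
    complete = X[~np.isnan(X).any(axis=1)]
    span = complete.max(axis=0) - complete.min(axis=0)
    span[span == 0] = 1.0                      # avoid division by zero
    for row in X:                              # each row is a view into X
        miss = np.isnan(row)
        if not miss.any():
            continue
        obs = ~miss
        delta = np.abs(complete[:, obs] - row[obs]) / span[obs]
        dmin, dmax = delta.min(), delta.max()
        scale = rho * dmax if dmax > 0 else 1.0
        coef = (dmin + scale) / (delta + scale)    # grey relational coefficients
        degree = coef.mean(axis=1)                 # grey relational degree per row
        row[miss] = complete[degree.argmax(), miss]
    return X

def em_impute(X_init, mask, n_iter=50, tol=1e-6):
    """EM under one multivariate Gaussian; mask[i, j] is True where data was missing."""
    X = X_init.copy()
    n, d = X.shape
    ridge = 1e-6 * np.eye(d)                   # keeps the covariance invertible
    mu = X.mean(axis=0)
    sigma = np.cov(X, rowvar=False, bias=True) + ridge
    for _ in range(n_iter):
        X_old = X.copy()
        extra = np.zeros((d, d))               # conditional covariance of missing parts
        for i in range(n):
            m = mask[i]
            if not m.any():
                continue
            o = ~m
            # E-step: conditional mean of the missing block given the observed block
            gain = sigma[np.ix_(m, o)] @ np.linalg.inv(sigma[np.ix_(o, o)])
            X[i, m] = mu[m] + gain @ (X[i, o] - mu[o])
            extra[np.ix_(m, m)] += sigma[np.ix_(m, m)] - gain @ sigma[np.ix_(o, m)]
        # M-step: re-estimate the mean and covariance from the completed data
        mu = X.mean(axis=0)
        centered = X - mu
        sigma = (centered.T @ centered + extra) / n + ridge
        if np.abs(X - X_old).max() < tol:
            break
    return X

def gaussian_nb_fit(X, y):
    """Per-class priors, attribute means and attribute variances."""
    classes = np.unique(y)
    prior = np.array([(y == c).mean() for c in classes])
    mean = np.array([X[y == c].mean(axis=0) for c in classes])
    var = np.array([X[y == c].var(axis=0) for c in classes]) + 1e-9
    return classes, prior, mean, var

def gaussian_nb_predict(model, X):
    """Pick the class maximising log prior plus summed per-attribute log densities."""
    classes, prior, mean, var = model
    diff2 = (X[:, None, :] - mean[None]) ** 2
    log_like = -0.5 * (np.log(2 * np.pi * var)[None] + diff2 / var[None]).sum(axis=2)
    return classes[(np.log(prior)[None] + log_like).argmax(axis=1)]

# Synthetic demonstration only; this is not the paper's data set.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 50)
X_true = rng.normal(size=(100, 3)) + 6.0 * y[:, None]
X_miss = X_true.copy()
rows = rng.choice(100, size=20, replace=False)     # one missing entry in 20 rows
cols = rng.integers(0, 3, size=20)
X_miss[rows, cols] = np.nan
mask = np.isnan(X_miss)

X_filled = em_impute(grey_initial_fill(X_miss), mask)
pred = gaussian_nb_predict(gaussian_nb_fit(X_filled, y), X_filled)
print("training accuracy:", (pred == y).mean())
```

Seeding EM with the grey-relational estimate rather than, say, the column mean only changes the starting point of the iteration. Whether that improves the final accuracy on real data is the paper's experimental claim and is not demonstrated by this sketch.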