Computer Engineering and Applications ›› 2008, Vol. 44 ›› Issue (21): 222-224. DOI: 10.3778/j.issn.1002-8331.2008.21.060

• Machine Learning •

Bayesian classifier based on hypothesis testing

LI Jin-shan, WANG Zhi-hai, WANG Zhong-feng

  1. Department of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
  • Received: 2008-04-30 Revised: 2008-06-20 Online: 2008-07-21 Published: 2008-07-21
  • Contact: LI Jin-shan

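Original title (Chinese): 一种基于假设检验的贝叶斯分类器

李锦善 (LI Jin-shan), 王志海 (WANG Zhi-hai), 王中锋 (WANG Zhong-feng)

  1. School of Computer and Information Technology, Beijing Jiaotong University (北京交通大学), Beijing 100044
  • Corresponding author: 李锦善 (LI Jin-shan)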

Abstract: Classification is a major branch of data mining, and Bayesian classifiers, an important technique in this branch, are widely used. Restricted Bayesian learning has been an active research topic in recent years. In this paper, a kind of hypothesis testing called the volume test is used to find dependencies between attributes. On this basis, we propose a Bayesian classifier based on hypothesis testing, which we call the Bayesian classifier based on Volume Test (BVT). It combines the advantages of Naïve Bayes with ideas from statistical hypothesis testing. Experiments show that this method outperforms Naïve Bayes, TAN, and other methods, especially on large datasets.

Key words: hypothesis testing, Bayesian classifier, classification, machine learning

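Abstract (translated from the Chinese): Classification is an important branch of data mining, and Bayesian classification, an important technique in this area, has been increasingly widely studied and applied. Restricted Bayesian networks, which simplify the network structure without sacrificing much accuracy, have been a research hotspot in classification in recent years. This paper uses volume testing, a theoretically well-established hypothesis testing method from statistics, to find dependencies between attributes. Combining the idea of hypothesis testing with the advantages of the naive Bayes classification algorithm to construct a restricted Bayesian network, it proposes a Bayesian classification algorithm based on hypothesis testing, named the Bayesian classification algorithm based on volume testing. Experiments carried out in the Weka system show that the method outperforms naive Bayes, TAN, and other algorithms, and performs particularly well on large datasets.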

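Key words (Chinese original): 假设检验 (hypothesis testing), 贝叶斯分类器 (Bayesian classifier), 分类 (classification), 机器学习 (machine learning)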
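
The abstract describes using a hypothesis test to decide which attribute pairs are dependent before building a restricted Bayesian network. The abstract does not define the volume test, so the sketch below substitutes a Pearson chi-square test of independence, computed within each class and summed, purely to illustrate the general idea. It is not the authors' BVT procedure. The function names `chi_square_stat` and `dependent_pairs`, the toy data, and the caller-supplied `critical_value` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def chi_square_stat(xs, ys):
    """Pearson chi-square statistic and degrees of freedom for two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    cx, cy = Counter(xs), Counter(ys)
    stat = 0.0
    for x in cx:
        for y in cy:
            # Expected count under independence; never zero because both
            # marginal counts come from values that actually occur.
            expected = cx[x] * cy[y] / n
            stat += (joint[(x, y)] - expected) ** 2 / expected
    dof = (len(cx) - 1) * (len(cy) - 1)
    return stat, dof


def dependent_pairs(rows, labels, critical_value):
    """Return attribute index pairs flagged as dependent given the class.

    Stand-in for the paper's volume test (an assumption): the statistic is
    summed over classes and compared with a threshold chosen by the caller.
    """
    n_attrs = len(rows[0])
    flagged = []
    for i, j in combinations(range(n_attrs), 2):
        total = 0.0
        for c in set(labels):
            subset = [r for r, l in zip(rows, labels) if l == c]
            stat, _ = chi_square_stat([r[i] for r in subset],
                                      [r[j] for r in subset])
            total += stat
        if total > critical_value:
            flagged.append((i, j))
    return flagged


# Toy data: attributes 0 and 1 always agree; attribute 2 is unrelated to both.
rows = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)] * 2
labels = ["a"] * 4 + ["b"] * 4
# 5.99 is roughly the 5% chi-square critical value for 2 degrees of freedom
# (1 per class, 2 classes in this toy data).
print(dependent_pairs(rows, labels, 5.99))
```

In a restricted Bayesian network, pairs flagged this way would be candidates for extra arcs between attributes, while unflagged attributes keep the Naïve Bayes assumption of independence given the class.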