Computer Engineering and Applications ›› 2015, Vol. 51 ›› Issue (6): 27-32.

Method of SVM model selection based on PAC-Bayes bound theory

TANG Li1,2, ZHAO Zheng2, GONG Xiujun2,3   

  1. Department of Information Science & Technology, School of Science and Technology, Tianjin University of Finance and Economics, Tianjin 300222, China
  2. School of Computer Science and Technology, Tianjin University, Tianjin 300072, China
  3. Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin 300072, China
  • Online: 2015-03-15 Published: 2015-03-13

基于PAC-Bayes边界理论的SVM模型选择方法

汤  莉1,2,赵  政2,宫秀军2,3   

  1. 天津财经大学 理工学院 信息科学与技术系,天津 300222
  2. 天津大学 计算机科学与技术学院,天津 300072
  3. 天津市认知计算与应用重点实验室,天津 300072

Abstract: The PAC-Bayes risk bound, which integrates the Bayesian paradigm with structural risk minimization for stochastic classifiers, provides a framework for effectively evaluating the generalization capability of machine learning algorithms. To address the model selection problem for support vector machines (SVM), this paper analyzes the theoretical framework of the PAC-Bayes bound and its application to SVM, and combines the bound with a grid search method based on cross validation. The resulting model selection method, PBB-GS, rapidly selects the best penalty parameter and kernel parameter. Experimental results on UCI datasets show that the parameters selected by PBB-GS give the SVM better generalization performance, and that the method is simple, fast, and accurate, effectively improving SVM model selection.

Key words: Probably Approximately Correct (PAC)-Bayes bound, Support Vector Machine (SVM), model selection, generalization capability
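
The abstract does not spell out how PBB-GS is computed, so the following is only a minimal sketch of the general idea of choosing SVM parameters by minimizing a PAC-Bayes bound over a grid. It assumes the Langford-Seeger form of the bound, KL(Q_S || Q_D) <= (KL(Q || P) + ln((m + 1)/δ)) / m, inverted numerically by bisection. The function names, the candidate grid, and the error and KL values are hypothetical illustrations, not taken from the paper, and the cross-validation step of the method is not reproduced here.

```python
import math


def binary_kl(q, p):
    """KL divergence between Bernoulli(q) and Bernoulli(p)."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # keep the logarithms finite
    total = 0.0
    if q > 0.0:
        total += q * math.log(q / p)
    if q < 1.0:
        total += (1.0 - q) * math.log((1.0 - q) / (1.0 - p))
    return total


def kl_inverse_upper(q_hat, budget, tol=1e-10):
    """Largest p >= q_hat with binary_kl(q_hat, p) <= budget (bisection)."""
    lo, hi = q_hat, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if binary_kl(q_hat, mid) > budget:
            hi = mid
        else:
            lo = mid
    return lo


def pac_bayes_bound(emp_error, kl_qp, m, delta=0.05):
    """Upper bound on the true stochastic error (assumed Langford-Seeger form).

    emp_error: empirical error of the stochastic classifier on m samples
    kl_qp:     KL(Q || P) between posterior Q and prior P
    """
    budget = (kl_qp + math.log((m + 1) / delta)) / m
    return kl_inverse_upper(emp_error, budget)


def select_by_bound(candidates, m, delta=0.05):
    """Return the grid point whose PAC-Bayes bound is smallest.

    candidates maps a hypothetical (C, gamma) pair to (emp_error, kl_qp).
    """
    return min(
        candidates,
        key=lambda params: pac_bayes_bound(*candidates[params], m, delta),
    )


if __name__ == "__main__":
    # Made-up values for illustration only; in practice they would come
    # from an SVM trained at each grid point.
    grid = {
        (1.0, 0.1): (0.10, 5.0),
        (10.0, 0.1): (0.05, 40.0),
        (100.0, 1.0): (0.02, 400.0),
    }
    best = select_by_bound(grid, m=500)
    print("selected (C, gamma):", best)
```

In this sketch a grid point with a lower empirical error can still lose if its KL term is large, which is the trade-off between fit and complexity that a bound-based selection criterion is meant to capture.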

摘要: PAC-Bayes边界理论融合了贝叶斯定理和随机分类器的结构风险最小化原理,它作为一个理论框架,能有效评价机器学习算法的泛化性能。针对支持向量机(SVM)模型选择问题,通过分析PAC-Bayes边界理论框架及其在SVM上的应用,将PAC-Bayes边界理论与基于交叉验证的网格搜索法相结合,提出一种基于PAC-Bayes边界的SVM模型选择方法(PBB-GS),实现快速优选SVM的惩罚系数和核函数参数。UCI数据集的实验结果表明该方法优选出的参数能使SVM具有较高的泛化性能,并具有简便快速、参数选择准确的优点,能有效改善SVM模型选择问题。

关键词: 概率近似正确性学习(PAC)-贝叶斯边界, 支持向量机, 模型选择, 泛化性能