Computer Engineering and Applications, 2007, Vol. 43, Issue (33): 184-187.

• Database and Information Processing •

Research on improving speed of SMO

LUO Shi-guang1,LUO Chang-ri2   

  1. Department of Applied Mathematics, Guangdong University of Finance, Guangzhou 510521, China
  2. College of Network, Central China Normal University, Wuhan 430079, China
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2007-11-21 Published: 2007-11-21
  • Contact: LUO Shi-guang

Abstract: SMO (Sequential Minimal Optimization) is a very efficient method for training support vector machines (SVMs). However, its training speed is slow on large-scale datasets. Because not every sample affects the progress of the SVM optimization, two strategies for deleting samples are proposed: one based on distance and one based on the values of the Lagrange multipliers. Experiments on several well-known benchmark datasets show that both strategies greatly reduce the training time of SMO and are especially suited to large-scale problems.

Key words: Support Vector Machine, Sequential Minimal Optimization algorithm, shrinking
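
The abstract names two sample-deletion criteria without giving their exact form, so the following is only a minimal sketch of what such shrinking rules might look like. Everything in it is an assumption made for illustration and not the authors' method: the function names, the `patience` and `margin` parameters, the use of a linear decision function, and the choice of the signed margin as the "distance".

```python
def shrink_by_multiplier(alpha, zero_streak, patience=3, tol=1e-8):
    """Return a keep-mask based on Lagrange multiplier values.

    Hypothetical rule: a sample is dropped from the active set once its
    multiplier has been (numerically) zero for `patience` consecutive checks.
    `zero_streak` is a per-sample counter that is updated in place.
    """
    keep = []
    for i, a in enumerate(alpha):
        # Extend the streak while alpha stays at zero, otherwise reset it.
        zero_streak[i] = zero_streak[i] + 1 if a <= tol else 0
        keep.append(zero_streak[i] < patience)
    return keep


def shrink_by_margin(X, y, w, b, margin=2.0):
    """Return a keep-mask based on a distance-like criterion.

    Hypothetical rule: for a linear decision function f(x) = w.x + b,
    samples with y * f(x) >= `margin` lie far on the correct side and are
    assumed unlikely to become support vectors, so they are dropped.
    """
    keep = []
    for x_i, y_i in zip(X, y):
        f = sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b
        keep.append(y_i * f < margin)
    return keep
```

In an SMO loop, masks like these would be recomputed every few iterations and the working-set selection restricted to the kept samples. A practical implementation would also re-check the dropped samples against the optimality conditions before terminating, since a sample removed too early can otherwise leave the solution suboptimal.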