Computer Engineering and Applications (计算机工程与应用), 2011, Vol. 47, Issue 4: 131-133. DOI: 10.3778/j.issn.1002-8331.2011.04.036

• Database, Signal and Information Processing •

OC-SVM-based classification for large-scale data sets

ZHANG Yu (张 瑜), LUO Ke (罗 可)

  1. Institute of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410076, China
  • Received: 2009-05-18 Revised: 2009-07-15 Online: 2011-02-01 Published: 2011-02-01
  • Contact: ZHANG Yu

Abstract: The support vector machine (SVM) is one of the most effective classification techniques, offering high classification accuracy and good generalization ability. However, training an SVM on large-scale data sets remains computationally expensive. To address this, a classification method based on the one-class SVM (OC-SVM) is proposed. The training set is reduced by a random selection algorithm to speed up training. Meanwhile, to preserve SVM classification accuracy, the neighbors in the original data of the samples that fall in the intersection of the hyperspheres are recovered. Experimental results show that the method substantially reduces computational complexity and thus improves training speed on large-scale data sets.

Key words: One-Class Support Vector Machine (OC-SVM), random selection, Support Vector Machine (SVM) classification, large-scale data sets
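
The abstract outlines the data-reduction stage only in broad terms, so the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes two classes, and it replaces the OC-SVM hypersphere with a much cruder stand-in: a ball centered at the class centroid whose radius is the largest distance to a selected sample. The function names and the parameters `sample_size` and `neighbor_radius` are hypothetical and do not come from the paper.

```python
import math
import random


def enclosing_ball(points):
    """Centroid plus max-distance radius.

    Assumption: a crude stand-in for the hypersphere that an OC-SVM
    would fit; it is not an OC-SVM.
    """
    dim = len(points[0])
    center = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    radius = max(math.dist(center, p) for p in points)
    return center, radius


def reduce_training_set(pos, neg, sample_size, neighbor_radius, seed=0):
    """Return reduced (pos, neg) training sets for a final SVM.

    1. Randomly select up to `sample_size` points per class.
    2. Fit one ball per class on the selected points.
    3. Keep the selected points that lie inside both balls.
    4. Recover every original point within `neighbor_radius`
       of a kept point.
    """
    rng = random.Random(seed)
    pos_sel = rng.sample(pos, min(sample_size, len(pos)))
    neg_sel = rng.sample(neg, min(sample_size, len(neg)))

    balls = [enclosing_ball(pos_sel), enclosing_ball(neg_sel)]

    def in_intersection(p):
        # Inside the ball of the positive class AND of the negative class.
        return all(math.dist(c, p) <= r for c, r in balls)

    boundary = [p for p in pos_sel + neg_sel if in_intersection(p)]

    def recover(original):
        # Neighbors, in the original data, of the intersection samples.
        return [p for p in original
                if any(math.dist(p, b) <= neighbor_radius for b in boundary)]

    return recover(pos), recover(neg)
```

On a toy 1-D example with positives at 0..10 and negatives at 8..18 (all points selected), the two balls overlap on [8, 10], so with `neighbor_radius=1` the sketch keeps 7..10 from the positive class and 8..11 from the negative class. The reduced sets would then be passed to an ordinary SVM trainer, which is outside the scope of this sketch.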
