Computer Engineering and Applications ›› 2011, Vol. 47 ›› Issue (2): 136-138. DOI: 10.3778/j.issn.1002-8331.2011.02.042

• Database, Signal and Information Processing •

Improved fast classifier based on SVM and density clustering

ZHANG Zhenzhen, DONG Cailin, CHEN Zengzhao, HE Xiuling

  1. School of Mathematics and Statistics, Huazhong Normal University, Wuhan 430079, China
  • Received: 2009-05-07 Revised: 2009-07-06 Online: 2011-01-11 Published: 2011-01-11
  • Contact: ZHANG Zhenzhen

Abstract: Training an SVM on a large-scale data set requires solving a very large optimization problem. To address this, the paper proposes a method that reduces the training set in order to speed up training. The algorithm has two steps. The first step uses a density-based method to roughly locate points that can each represent a local region, then trains an SVM on this reduced data to obtain a set of support vectors. The second step trains on a data set made up of those support vectors together with the sample points they represent. Simulation experiments show that the algorithm effectively improves classification speed while maintaining classification accuracy.

Key words: density clustering, Support Vector Machine (SVM) algorithm, fast classification, large data sets
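The two-step scheme described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the density method or the SVM solver, so the sketch assumes a greedy neighbour-count rule with a hypothetical radius `eps` for choosing representatives, a linear soft-margin SVM trained by subgradient descent, and a near-margin test as a stand-in for support vectors. All function names and parameter values are made up for the example.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pick_representatives(X, y, eps):
    """Assumed density rule: per class, greedily take the unassigned point with
    the most unassigned neighbours within eps as a representative, and assign
    those neighbours to it. Returns {representative index: [member indices]}."""
    clusters = {}
    for label in sorted(set(y)):
        idx = [i for i in range(len(X)) if y[i] == label]
        nbrs = {i: {j for j in idx if math.dist(X[i], X[j]) <= eps} for i in idx}
        unassigned = set(idx)
        while unassigned:
            centre = max(sorted(unassigned), key=lambda i: len(nbrs[i] & unassigned))
            members = nbrs[centre] & unassigned  # always contains centre itself
            clusters[centre] = sorted(members)
            unassigned -= members
    return clusters

def train_linear_svm(X, y, lam=0.01, lr=0.05, steps=1000):
    """Linear soft-margin SVM via full-batch subgradient descent on
    lam/2 * ||w||^2 + mean hinge loss. Labels must be -1 or +1."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        viol = [i for i in range(n) if y[i] * (dot(w, X[i]) + b) < 1.0]
        grad_w = [lam * wj - sum(y[i] * X[i][j] for i in viol) / n
                  for j, wj in enumerate(w)]
        grad_b = -sum(y[i] for i in viol) / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def near_margin(X, y, w, b, tol=0.1):
    """Stand-in for support vectors: per class, keep points whose margin
    y * f(x) is at most max(1, smallest margin in that class) + tol."""
    margins = [yi * (dot(w, xi) + b) for xi, yi in zip(X, y)]
    keep = []
    for label in (-1, 1):
        cls = [i for i in range(len(X)) if y[i] == label]
        cutoff = max(1.0, min(margins[i] for i in cls)) + tol
        keep += [i for i in cls if margins[i] <= cutoff]
    return keep

def two_stage_svm(X, y, eps):
    # Step 1: reduce the data to representatives and train on them.
    clusters = pick_representatives(X, y, eps)
    reps = sorted(clusters)
    Xr, yr = [X[i] for i in reps], [y[i] for i in reps]
    w, b = train_linear_svm(Xr, yr)
    sv = [reps[k] for k in near_margin(Xr, yr, w, b)]
    # Step 2: train on the selected representatives plus the points they stand for.
    stage2 = sorted({m for r in sv for m in clusters[r]})
    model = train_linear_svm([X[i] for i in stage2], [y[i] for i in stage2])
    return model, reps, stage2

def accuracy(model, X, y):
    w, b = model
    return sum((1 if dot(w, x) + b >= 0 else -1) == t for x, t in zip(X, y)) / len(X)

def make_blobs(n_per_class=100, seed=0):
    """Synthetic two-class data for the demo (not from the paper)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((-1, -2.0), (1, 2.0)):
        for _ in range(n_per_class):
            X.append((rng.gauss(centre, 0.5), rng.gauss(centre, 0.5)))
            y.append(label)
    return X, y
```

A possible use is `X, y = make_blobs()` followed by `model, reps, stage2 = two_stage_svm(X, y, eps=0.6)` and `accuracy(model, X, y)`. Comparing `len(reps)` and `len(stage2)` with `len(X)` shows how much smaller each training problem is than the full data set, which is where the speed-up the abstract reports would come from.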
