Computer Engineering and Applications, 2010, Vol. 46, Issue (22): 172-174. DOI: 10.3778/j.issn.1002-8331.2010.22.051

• Database, Signal and Information Processing •

Research on document classification based on k-means and Support Vector Machine

JIA Yan-hua, XU Wei-hong

  1. Department of Computer and Communication, Changsha University of Science and Technology, Changsha 410004, China
  • Received: 2009-01-13 Revised: 2009-03-24 Online: 2010-08-01 Published: 2010-08-01
  • Contact: JIA Yan-hua

Research on text classification combining K-means clustering and Support Vector Machine

贾燕花,徐蔚鸿   

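  1. School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410004
  • Corresponding author: 贾燕花 (JIA Yan-hua)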

Abstract: Aiming at document classification in data mining, a classification method based on k-means and the Support Vector Machine (SVM) is proposed. The method first clusters the documents roughly into k groups and then classifies each group in finer detail with an SVM. A multilayer linked SVM model that can classify a sample set into multiple categories is constructed. The construction of the model and its application to classification are described, and the practicability of the method is illustrated.

Key words: document classification, k-means algorithm, clustering, Support Vector Machine (SVM)
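
The abstract gives no implementation details, so the following is only a minimal sketch of the two-stage idea (coarse k-means clustering, then a per-cluster SVM), not the authors' model. Several things are assumptions made for illustration: documents are already numeric feature vectors (for example TF-IDF), k-means uses a deterministic farthest-first initialization, and each cluster gets one-vs-rest linear SVMs trained by stochastic subgradient descent on the hinge loss, which stands in for the paper's multilayer linked SVM model. All function names and parameters are hypothetical.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centroids, x):
    """Index of the centroid closest to x."""
    return min(range(len(centroids)), key=lambda j: sqdist(centroids[j], x))

def kmeans(X, k, iters=20):
    """Lloyd's k-means with farthest-first initialization (an assumption)."""
    centroids = [list(X[0])]
    while len(centroids) < k:
        far = max(X, key=lambda x: min(sqdist(x, c) for c in centroids))
        centroids.append(list(far))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in X:
            groups[nearest(centroids, x)].append(x)
        for j, g in enumerate(groups):
            if g:  # keep the old centroid if a cluster empties out
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
    return centroids

def train_linear_svm(X, y, lr=0.01, lam=0.01, epochs=200, seed=0):
    """Binary linear SVM (y in {-1, +1}) via SGD on the L2-regularized hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * (len(X[0]) + 1)  # last weight acts as the bias
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            xi = list(X[i]) + [1.0]
            violated = y[i] * dot(w, xi) < 1       # margin constraint broken?
            w = [wj * (1 - lr * lam) for wj in w]  # regularization shrinkage
            if violated:
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, xi)]
    return w

def fit(X, labels, k):
    """Stage 1: cluster into k groups. Stage 2: one-vs-rest SVMs per cluster."""
    centroids = kmeans(X, k)
    members = [[] for _ in range(k)]
    for x, lab in zip(X, labels):
        members[nearest(centroids, x)].append((x, lab))
    models = []
    for c, group in zip(centroids, members):
        # Features are taken relative to the cluster centroid.
        local = [[a - b for a, b in zip(x, c)] for x, _ in group]
        classes = sorted({lab for _, lab in group})
        models.append({
            cls: train_linear_svm(local, [1 if lab == cls else -1 for _, lab in group])
            for cls in classes
        })
    return centroids, models

def predict(model, x):
    """Route x to its nearest cluster, then pick the highest-scoring class there."""
    centroids, models = model
    j = nearest(centroids, x)
    local = [a - b for a, b in zip(x, centroids[j])] + [1.0]
    svms = models[j]  # empty if the cluster had no training documents
    return max(svms, key=lambda cls: dot(svms[cls], local))

if __name__ == "__main__":
    # Toy 2-D "documents": two coarse groups, each holding two fine classes.
    docs = [(0, 0), (20, 20), (0, 1), (23, 20), (3, 0), (20, 21), (3, 1), (23, 21)]
    labels = ["sports", "tech", "sports", "finance",
              "politics", "tech", "politics", "finance"]
    model = fit(docs, labels, k=2)
    print(predict(model, (0.2, 0.4)), predict(model, (22.9, 20.5)))
```

In this sketch each SVM only has to separate the classes that actually occur inside its own cluster, which is the practical appeal of clustering first: the fine-grained classifiers see fewer classes and fewer training documents than a single classifier over the whole collection would.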

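Abstract (Chinese version): To address automatic text classification in data mining, a text classification method that combines the k-means clustering algorithm with the support vector machine is proposed. The method first clusters the texts roughly into k classes and then subdivides each class with a support vector machine. A multilayer SVM model for recognizing multiple pattern classes is constructed, and this model can classify and recognize multiple patterns. The construction and application of the model are described, and the effectiveness of the method is verified.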

Keywords (Chinese version): text classification, k-means algorithm, clustering, support vector machine
