Research on document classification based on k-means and Support Vector Machine

Computer Engineering and Applications, 2010, Vol. 46, Issue (22): 172-174. DOI: 10.3778/j.issn.1002-8331.2010.22.051

• Database, Signal and Information Processing •

JIA Yan-hua, XU Wei-hong

School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410004, China

Received: 2009-01-13 Revised: 2009-03-24 Online: 2010-08-01 Published: 2010-08-01
Contact: JIA Yan-hua

Abstract: To address automatic document classification in data mining, a classification method combining the k-means clustering algorithm with the Support Vector Machine (SVM) is proposed. The method first clusters the documents roughly into k groups and then uses SVMs to classify the documents within each group in finer detail. A multilayer linked SVM model that can classify a sample set into multiple categories is constructed. The construction of the model and its application to classification are described, and the practicability of the method is illustrated.

Key words: document classification, k-means algorithm, clustering, Support Vector Machine (SVM)
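
The two-stage idea in the abstract (coarse k-means clustering, then an SVM within each cluster) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not specify the document features, the kernel, the seeding strategy, or how the multilayer model is linked. The code assumes documents are already numeric vectors (toy 2-D points here), deterministic farthest-point seeding for k-means, a linear one-vs-rest SVM trained by batch sub-gradient descent on the hinge loss, and features centered on the cluster centroid. All function names (`kmeans`, `train_svm`, `fit`, `predict`) are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centroids, p):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda j: sqdist(centroids[j], p))

def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point seeding (an assumption)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sqdist(p, c) for c in centroids))
        centroids.append(list(far))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centroids, p)].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [[sum(col) / len(g) for col in zip(*g)] if g else c
                     for g, c in zip(groups, centroids)]
    return centroids

def train_svm(X, y, lam=0.01, eta=0.1, steps=500):
    """Binary linear SVM (y in {+1, -1}): batch sub-gradient descent on
    lam/2 * |w|^2 + mean hinge loss. The bias is not regularized."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        gw, gb = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:  # margin violated: hinge term is active
                gw = [g - yi * xj / n for g, xj in zip(gw, xi)]
                gb -= yi / n
        w = [wj - eta * g for wj, g in zip(w, gw)]
        b -= eta * gb
    return w, b

def fit(X, labels, k):
    """Stage 1: cluster all samples. Stage 2: one-vs-rest SVMs inside each cluster."""
    centroids = kmeans(X, k)
    models = []
    for j, c in enumerate(centroids):
        idx = [i for i, x in enumerate(X) if nearest(centroids, x) == j]
        Xc = [[a - m for a, m in zip(X[i], c)] for i in idx]  # center on the centroid
        classes = sorted({labels[i] for i in idx})
        svms = {}
        if len(classes) > 1:  # a single-class cluster needs no SVM
            for cls in classes:
                y = [1 if labels[i] == cls else -1 for i in idx]
                svms[cls] = train_svm(Xc, y)
        models.append((classes, svms))
    return centroids, models

def predict(model, x):
    """Route x to its nearest cluster, then let that cluster's SVMs decide."""
    centroids, models = model
    j = nearest(centroids, x)
    classes, svms = models[j]
    if len(classes) <= 1:
        return classes[0] if classes else None
    xc = [a - m for a, m in zip(x, centroids[j])]
    return max(classes, key=lambda cls: dot(svms[cls][0], xc) + svms[cls][1])

# Toy stand-ins for document vectors: two coarse groups, two classes in each.
X = [[0, 0], [0, 1], [2, 0], [2, 1], [10, 10], [10, 11], [12, 10], [12, 11]]
labels = ["sports", "sports", "politics", "politics",
          "tech", "tech", "finance", "finance"]
model = fit(X, labels, k=2)
print(predict(model, [0.2, 0.5]))    # sports
print(predict(model, [11.8, 10.2]))  # finance
```

Centering each cluster's features on its centroid is a design choice of this sketch: it keeps the local SVM problems well conditioned regardless of where the cluster sits in feature space. In practice a library SVM with a tuned kernel would normally replace the hand-written trainer.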

CLC Number: TP391

JIA Yan-hua, XU Wei-hong. Research on document classification based on k-means and Support Vector Machine[J]. Computer Engineering and Applications, 2010, 46(22): 172-174.
