Semi-supervised clustering method with constrains

doi:10.3778/j.issn.1002-8331.2009.22.033

Computer Engineering and Applications, 2009, Vol. 45, Issue (22): 100-102. DOI: 10.3778/j.issn.1002-8331.2009.22.033

• Database and Information Processing •

LIU Ying-dong

School of Traffic and Transportation, Lanzhou Jiaotong University, Lanzhou 730070, China

Received: 2008-10-30 Revised: 2009-01-14 Online: 2009-08-01 Published: 2009-08-01
Contact: LIU Ying-dong


Abstract

In many data mining domains, unlabeled data is plentiful but labeled data is limited and can be expensive to generate. Consequently, semi-supervised clustering, which uses a small amount of labeled data to guide the clustering of unlabeled data, has attracted significant recent interest. This paper presents a new semi-supervised clustering algorithm based on constraint learning. The algorithm obtains similarity and dissimilarity criteria for data objects, adjusts them during clustering, and uses them to constrain and supervise the clustering process. Experiments on a Gaussian dataset confirm that, given a relatively small amount of supervision, the algorithm significantly improves the accuracy and speed of clustering.

Key words: data mining, labeled data, constraints, semi-supervised clustering

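Abstract (from the Chinese version): In many practical data mining applications, large numbers of unlabeled samples are easy to obtain, whereas labeled samples usually come at considerable cost, and it is sometimes impossible to obtain labels for all of the data. Semi-supervised clustering uses a small portion of labeled data to guide the clustering of unlabeled data. This paper proposes a new semi-supervised clustering algorithm that uses the information provided by the labeled data to make an initial determination of the similarity and dissimilarity criteria for the data, adjusts these criteria automatically during clustering, and uses them to constrain and guide the clustering process. Tests on a standard Gaussian dataset show that the algorithm achieves higher accuracy and faster speed than unsupervised clustering.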
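The abstract gives no pseudocode, so the paper's algorithm cannot be reproduced from this page. As a rough illustration of how a few labels can constrain clustering, the sketch below implements a simpler, well-known relative, seeded k-means, and not the author's method: labeled points initialize the centroids and are held in their labeled cluster at every iteration, which keeps same-label points together and different-label points apart. It does not adjust any similarity criteria during clustering, as the paper describes. All names (`seeded_kmeans`, `seeds`, `mean_point`), the three cluster means, and the choice of five labeled points per cluster are assumptions made for this example.

```python
import math
import random


def mean_point(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def seeded_kmeans(points, seeds, k, n_iter=50):
    """Seeded k-means (illustrative, not the paper's algorithm).

    `seeds` maps a point index to its known cluster label in range(k).
    Every label in range(k) must appear at least once among the seeds.
    """
    # Initial centroids: mean of the labeled points of each cluster.
    centers = [mean_point([points[i] for i, lab in seeds.items() if lab == c])
               for c in range(k)]
    assign = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        new_assign = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                      for p in points]
        # Constraint step: labeled points keep their given cluster.
        for i, lab in seeds.items():
            new_assign[i] = lab
        if new_assign == assign:
            break  # converged: no assignment changed
        assign = new_assign
        # Update step: clusters are never empty because each holds its seeds.
        centers = [mean_point([p for p, a in zip(points, assign) if a == c])
                   for c in range(k)]
    return assign, centers


# Synthetic Gaussian data: three 2-D clusters, 100 points each (assumed setup).
rng = random.Random(0)
true_means = [(0.0, 0.0), (6.0, 0.0), (0.0, 6.0)]
points, truth = [], []
for label, (mx, my) in enumerate(true_means):
    for _ in range(100):
        points.append((rng.gauss(mx, 1.0), rng.gauss(my, 1.0)))
        truth.append(label)

# Reveal the labels of the first five points of each cluster (15 of 300).
seeds = {i: truth[i] for c in range(3) for i in range(c * 100, c * 100 + 5)}

assign, centers = seeded_kmeans(points, seeds, k=3)
accuracy = sum(a == t for a, t in zip(assign, truth)) / len(truth)
print(f"agreement with true labels: {accuracy:.3f}")
```

Because the seeds tie each cluster index to a known label, the output can be compared with the true labels directly, without the label-matching step that unsupervised k-means would need.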


LIU Ying-dong. Semi-supervised clustering method with constrains[J]. Computer Engineering and Applications, 2009, 45(22): 100-102.


