Computer Engineering and Applications, 2014, Vol. 50, Issue (11): 58-61.
WANG Min, YIN Chao, WANG Zhihui, YAO Chenhong, GAO Jing
王 民,尹 超,王稚慧,要趁红,高 婧
Abstract: When the CURE algorithm is applied to non-uniform, massive data sets, random sampling may not be representative of the data. To address this problem, a robust parallel improvement of the algorithm is proposed. It uses the Binary-Positive algorithm to obtain the effective attributes of the original data, and then performs hierarchical clustering on the resulting valid data with MapReduce, a distributed parallel framework. This achieves a trade-off between accuracy and efficiency. Experiments show that the improved CURE algorithm runs more efficiently and produces good clustering results.
Key words: clustering, Clustering Using Representatives (CURE), Binary-Positive, MapReduce, parallel
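For readers unfamiliar with CURE, the following is a minimal sketch of the representative-point step as it is generally described for the standard (serial) algorithm: a cluster is summarized by a few well-scattered points that are shrunk toward the centroid, and the distance between two clusters is the smallest distance between their representatives. It does not reproduce the paper's Binary-Positive preprocessing or its MapReduce parallelization, which the abstract does not detail. The function names and the parameters c (number of representatives) and alpha (shrink factor) are illustrative assumptions.

```python
import math

def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def representatives(points, c=4, alpha=0.5):
    """Pick up to c well-scattered points, then shrink them toward the centroid.

    c and alpha are hypothetical defaults chosen for illustration only.
    """
    center = centroid(points)
    # Start with the point farthest from the centroid.
    chosen = [max(points, key=lambda p: math.dist(p, center))]
    while len(chosen) < min(c, len(points)):
        # Farthest-first: take the point whose nearest chosen point is farthest away.
        nxt = max(points, key=lambda p: min(math.dist(p, q) for q in chosen))
        chosen.append(nxt)
    # Move each scattered point a fraction alpha of the way to the centroid.
    return [tuple(x + alpha * (m - x) for x, m in zip(p, center)) for p in chosen]

def cluster_distance(reps_a, reps_b):
    """Distance between two clusters: closest pair of representatives."""
    return min(math.dist(a, b) for a in reps_a for b in reps_b)
```

In a hierarchical run, the two clusters with the smallest cluster_distance would be merged and their representatives recomputed. Shrinking toward the centroid is what makes the representatives less sensitive to outliers than raw boundary points.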
WANG Min, YIN Chao, WANG Zhihui, YAO Chenhong, GAO Jing. Parallel CURE algorithm with Binary-Positive[J]. Computer Engineering and Applications, 2014, 50(11): 58-61.
URL: http://cea.ceaj.org/EN/
http://cea.ceaj.org/EN/Y2014/V50/I11/58