Parallel algorithm for computing incomplete information systems under big data

Abstract: The lower and upper approximations are fundamental concepts in rough set theory, and computing them is the basis of mining massive data. Classical algorithms for computing approximation spaces are infeasible for massive data, and even more so for massive data with missing information. To address this, the characteristics of massive data with missing information are analyzed in depth and combined with the MapReduce programming model, and a parallel MapReduce-based algorithm for computing the approximations of incomplete information systems is proposed to handle massive data with missing information. Experimental results demonstrate that the proposed parallel algorithm is effective.

Key words: MapReduce, data mining, massive data, rough set, incomplete information system, approximations
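
The abstract names the concepts but does not spell out the algorithm. As an illustration only, the following minimal sketch computes lower and upper approximations in a map/reduce style. It assumes the tolerance relation commonly used for incomplete information systems (two objects are tolerant when they agree on every attribute where both values are known), which the abstract does not confirm as the authors' choice. The `*` missing-value marker, the function names, and the toy table are all hypothetical, and the map and reduce steps run in a single process rather than on a MapReduce cluster.

```python
from collections import defaultdict

MISSING = "*"  # assumed marker for a missing attribute value


def tolerant(u, v):
    """True if two attribute tuples agree wherever both values are known."""
    return all(a == b or a == MISSING or b == MISSING for a, b in zip(u, v))


def map_chunk(chunk, table, target):
    """Map step: emit ('lower' | 'upper', object_id) pairs for one data split."""
    for obj_id in chunk:
        # Tolerance class: every object indistinguishable from obj_id.
        tol_class = {o for o, row in table.items() if tolerant(table[obj_id], row)}
        if tol_class <= target:   # class lies entirely inside the target set
            yield "lower", obj_id
        if tol_class & target:    # class overlaps the target set
            yield "upper", obj_id


def reduce_pairs(pairs):
    """Reduce step: group the emitted object ids by key."""
    grouped = defaultdict(set)
    for key, obj_id in pairs:
        grouped[key].add(obj_id)
    return grouped


def approximations(table, target, n_splits=2):
    """Split the object ids, run the map step per split, then reduce."""
    ids = sorted(table)
    chunks = [ids[i::n_splits] for i in range(n_splits)]
    pairs = [p for chunk in chunks for p in map_chunk(chunk, table, target)]
    grouped = reduce_pairs(pairs)
    return grouped["lower"], grouped["upper"]


# Toy incomplete information system: object id -> attribute values.
TABLE = {
    1: ("high", "yes"),
    2: ("high", "*"),
    3: ("low", "no"),
    4: ("*", "no"),
}
LOWER, UPPER = approximations(TABLE, target={1, 2})
print(sorted(LOWER), sorted(UPPER))
```

In this sketch every map call sees the whole table, which is acceptable for a toy example; a real MapReduce job would need a strategy for sharing or partitioning the table across workers.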

JIANG Lin, MI Yunlong, WANG Tian. Parallel algorithm for computing incomplete information systems under big data[J]. Computer Engineering and Applications, 2014, 50(15): 101-106.
