Research on Mining Association Rules Based on Multi-Granularity Attribute Reduction

doi:10.3778/j.issn.1002-8331.1712-0254

Abstract

Abstract: In the era of big data, it has become increasingly difficult for people to obtain the information they need, and data mining is currently a key technology for addressing this problem. The Apriori algorithm, a common data mining algorithm, uncovers potential association rules hidden in the data. To address the frequent data scans and cumbersome candidate-set generation of the traditional Apriori algorithm, a weighted Apriori algorithm is proposed: each redundant record is stored only once, and the ratio of its number of repetitions to the total number of records is taken as its weight, which compresses the storage space. A binary Boolean matrix replaces the original data set, and the maximum frequent itemsets are obtained through “AND” operations within the matrix, which reduces the time complexity. Considering the redundancy of the original data and the imprecision of rough-set attribute reduction, an attribute reduction algorithm based on multi-granularity rough sets is applied before the association rules are extracted. The uncertainty of the information is described by knowledge granularity, and attribute values are refined to improve the precision of the reduction and reduce the space complexity. Finally, the proposed method is compared with the Apriori algorithm based on frequent matrices and with the original Apriori algorithm to verify its practicability and effectiveness.

Key words: multi-granularity rough set, attribute reduction, binary, weighted Apriori algorithm
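
The weighting and Boolean-matrix ideas summarized in the abstract can be illustrated with a small sketch. This is not the authors' implementation, since the abstract does not give the full procedure. It assumes that duplicate records are collapsed into one Boolean row whose weight is its share of all records, that the support of an itemset is the summed weight of the rows where the AND of its columns is 1, and that candidates are generated with the standard Apriori join. The names `compress`, `support`, and `frequent_itemsets` are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def compress(transactions, items):
    """Store each distinct record once as a 0/1 row.

    The weight of a row is (number of repetitions) / (total records),
    kept as an exact Fraction to avoid floating-point drift.
    """
    total = len(transactions)
    counts = Counter(frozenset(t) for t in transactions)
    rows, weights = [], []
    for record, n in counts.items():
        rows.append([1 if item in record else 0 for item in items])
        weights.append(Fraction(n, total))
    return rows, weights


def support(rows, weights, columns):
    """Weighted support: AND the chosen columns row by row and
    add up the weights of the rows where the result is 1."""
    return sum(w for row, w in zip(rows, weights)
               if all(row[c] for c in columns))


def frequent_itemsets(transactions, min_support):
    """Level-wise search over the compressed matrix (standard Apriori join,
    assumed here; the paper's candidate handling may differ)."""
    items = sorted({i for t in transactions for i in t})
    rows, weights = compress(transactions, items)
    result = {}
    current = [(c,) for c in range(len(items))]
    while current:
        size = len(current[0]) + 1
        kept = []
        for cols in current:
            s = support(rows, weights, cols)
            if s >= min_support:
                kept.append(cols)
                result[tuple(items[c] for c in cols)] = s
        # Join frequent k-itemsets into (k+1)-candidates and prune any
        # candidate that has an infrequent k-subset.
        kept_set = set(kept)
        candidates = set()
        for a in kept:
            for b in kept:
                union = tuple(sorted(set(a) | set(b)))
                if len(union) == size and all(
                        sub in kept_set
                        for sub in combinations(union, size - 1)):
                    candidates.add(union)
        current = sorted(candidates)
    return result


if __name__ == "__main__":
    tx = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b"}]
    for itemset, s in frequent_itemsets(tx, 0.5).items():
        print(itemset, s)
```

In this toy data set five records collapse to three weighted rows, so each support computation touches three rows instead of five. The sketch returns all frequent itemsets; the maximum ones are those with no frequent superset.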

YANG Zhen, GENG Xiuli. Research on Mining Association Rules Based on Multi-Granularity Attribute Reduction[J]. Computer Engineering and Applications, 2019, 55(6): 133-139.

杨珍，耿秀丽. 考虑多粒度属性约简的关联规则挖掘研究[J]. 计算机工程与应用, 2019, 55(6): 133-139.

