Research on mining frequent itemsets in data streams

Computer Engineering and Applications, 2010, Vol. 46, Issue (24): 138-140. DOI: 10.3778/j.issn.1002-8331.2010.24.042

• Database, Signal and Information Processing •

MENG Cai-xia

Department of Computer Science, Xi’an University of Posts & Telecommunications, Xi’an 710065, China

Received: 2009-02-12 Revised: 2010-02-05 Online: 2010-08-21 Published: 2010-08-21
Contact: MENG Cai-xia

面向数据流的频繁项集挖掘研究

孟彩霞

西安邮电学院计算机科学系，西安 710061

通讯作者: 孟彩霞

Abstract

Abstract: Based on the characteristics of data streams, this paper studies the problem of mining frequent patterns in data streams and proposes FP-SegCount, an algorithm for mining frequent itemsets from data streams. The algorithm partitions the data stream into segments and uses a modified FP-growth algorithm to mine the frequent itemsets in each segment. It then counts the itemsets with a Count Min Sketch. The algorithm addresses the problems of compressed statistics and of fast, efficient computation. An experimental comparison with the FP-DS algorithm shows that FP-SegCount has good time efficiency.
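The pipeline outlined in the abstract (segment the stream, mine each segment, accumulate itemset counts in a Count Min Sketch) can be illustrated with a minimal Python sketch. This is not the paper's FP-SegCount implementation: the abstract does not give the modified FP-growth procedure, so a brute-force enumeration stands in for the per-segment miner. All names (`CountMinSketch`, `mine_segment`, `process_stream`) and all parameters (`width`, `depth`, `segment_size`, `min_support`, `max_size`) are hypothetical choices made for illustration.

```python
import hashlib
from collections import Counter
from itertools import combinations


class CountMinSketch:
    """Fixed-size counter table; an estimate is never below the true total added."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _columns(self, key):
        # One salted hash per row, so each row maps the key independently.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                key.encode("utf-8"),
                digest_size=8,
                salt=row.to_bytes(2, "little"),
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        for row, col in self._columns(key):
            self.table[row][col] += count

    def estimate(self, key):
        # Collisions can only inflate a cell, so the minimum is the best estimate.
        return min(self.table[row][col] for row, col in self._columns(key))


def mine_segment(transactions, min_support, max_size=3):
    """Stand-in miner (assumption): brute-force enumeration, NOT FP-growth."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, min(max_size, len(items)) + 1):
            counts.update(combinations(items, size))
    threshold = min_support * len(transactions)
    return {itemset: n for itemset, n in counts.items() if n >= threshold}


def process_stream(stream, segment_size, min_support, sketch):
    """Cut the stream into fixed-size segments and add each segment's
    frequent-itemset counts to the sketch."""
    segment = []
    for transaction in stream:
        segment.append(transaction)
        if len(segment) == segment_size:
            for itemset, n in mine_segment(segment, min_support).items():
                # Assumes items are strings that contain no commas.
                sketch.add(",".join(itemset), n)
            segment = []
    # A trailing partial segment is left unprocessed in this sketch.


stream = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]] * 2
sketch = CountMinSketch()
process_stream(stream, segment_size=4, min_support=0.5, sketch=sketch)
print(sketch.estimate("a,b"))
```

Because only itemsets that are frequent within a segment are added, the sketch's memory stays fixed regardless of stream length, at the cost of ignoring occurrences in segments where an itemset falls below the local threshold. Whether the paper handles such occurrences differently cannot be determined from the abstract.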

CLC Number:

TP311.13

MENG Cai-xia. Research on mining frequent itemsets in data streams[J]. Computer Engineering and Applications, 2010, 46(24): 138-140.

孟彩霞. 面向数据流的频繁项集挖掘研究[J]. 计算机工程与应用, 2010, 46(24): 138-140.
