Computer Engineering and Applications ›› 2008, Vol. 44 ›› Issue (14): 155-158.

• Database, Signal and Information Processing •

Improved mining method for association rules based on genetic algorithm

LI Feng-ying¹, ZHAO Lian-peng¹,², WANG Hong-yu³

  1. Bohai University, Jinzhou, Liaoning 121000, China
  2. School of Computer, Northeast Normal University, Changchun 130117, China
  3. Jinzhou Branch, National Railway Tank Car Volume Measuring Station, Jinzhou, Liaoning 121000, China
  • Received: 2007-10-15 Revised: 2008-01-17 Online: 2008-05-11 Published: 2008-05-11
  • Contact: LI Feng-ying

Abstract: Based on an analysis of how redundant association rules arise, this paper proposes a penalty function tied to the setting of the support threshold, together with an improved genetic algorithm. The algorithm uses techniques such as frequent item distribution, prime factor coding, mate selection, and a sharing function to keep chromosomes searching in regions where frequent items are concentrated, which effectively prunes the combinatorial search space. In addition, transactions are converted to numerical form, which compresses the storage space of the transaction database and improves operation speed. Experimental results show that the improved mining method has certain advantages in the efficiency and precision of finding valuable rules.

Key words: association rules, genetic algorithm, frequent item distribution, prime factor coding, mate selection
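
The abstract does not spell out how prime factor coding and the numerical conversion of transactions work. One common reading, given here only as a minimal sketch under assumptions, is that each item is assigned a distinct prime and a transaction is stored as the product of its items' primes, so that itemset containment becomes a divisibility test. The item names, the prime table `ITEM_PRIMES`, and the helper functions `encode` and `support` are hypothetical and are not taken from the paper.

```python
from math import prod

# Hypothetical item-to-prime table; the paper's actual assignment is not
# given in the abstract.
ITEM_PRIMES = {"bread": 2, "milk": 3, "butter": 5, "beer": 7}


def encode(items):
    """Encode a set of items as the product of their assigned primes."""
    return prod(ITEM_PRIMES[item] for item in set(items))


def support(itemset_code, transaction_codes):
    """Fraction of transactions that contain every item of the itemset.

    A transaction contains the itemset exactly when its code is
    divisible by the itemset's code.
    """
    if not transaction_codes:
        return 0.0
    hits = sum(1 for code in transaction_codes if code % itemset_code == 0)
    return hits / len(transaction_codes)


# Toy transaction database (illustrative data only).
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["beer", "bread"],
    ["milk", "butter"],
]

# Each transaction is stored as a single integer.
codes = [encode(t) for t in transactions]

# Support of {bread, milk} and confidence of the rule bread -> milk.
sup_bread_milk = support(encode(["bread", "milk"]), codes)
sup_bread = support(encode(["bread"]), codes)
confidence = sup_bread_milk / sup_bread
```

In this sketch each transaction collapses to one integer, which is the sense in which a numerical conversion could compress storage. Products grow quickly with the number of items, but Python integers have arbitrary precision, so the toy example does not overflow.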
