Computer Engineering and Applications, 2010, Vol. 46, Issue (13): 118-121. DOI: 10.3778/j.issn.1002-8331.2010.13.035

• Database, Signal and Information Processing •

Cluster-based evaluation in fuzzy-genetic data mining

ZHOU Li-juan, SHI Qian, GE Xue-bin, WANG Lin-shuang

  1. College of Information Engineering, Capital Normal University, Beijing 100037, China
  • Received: 2009-08-10 Revised: 2009-10-13 Online: 2010-05-01 Published: 2010-05-01
  • Contact: ZHOU Li-juan

Chinese title (translated): Research on a clustering-based fuzzy genetic mining algorithm

Abstract: Based on an analysis of the characteristics of continuous attributes and of existing association-rule mining algorithms, this paper further studies the accuracy of quantitative description and the efficiency of mining. It proposes a clustering-based fuzzy genetic association-rule mining algorithm to address the slow speed of the existing fuzzy genetic mining algorithm, which computes each chromosome's fitness value by combining large 1-itemsets with membership-function suitability. The proposed algorithm applies fuzzy genetic principles to extract association rules and membership functions simultaneously from transaction data. It also uses the k-means clustering algorithm to group the chromosomes in the population, and evaluates the fitness of each chromosome from the information of its cluster together with the chromosome's own information, which reduces the number of database scans. Test results indicate that the algorithm is fast and accurate.
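
The cluster-based evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how cluster information and a chromosome's own information are combined, so the division used here is an assumption, as are all names (`kmeans`, `cluster_based_fitness`, `expensive_score`, `cheap_score`). `expensive_score` stands in for a measure that needs a database scan (for example, one derived from large 1-itemsets), and `cheap_score` for a positive suitability value computed from the chromosome alone.

```python
import random
from typing import Callable, List, Sequence, Tuple

Chromosome = Sequence[float]  # assumed encoding: membership-function parameters


def squared_distance(a: Chromosome, b: Chromosome) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: Sequence[Chromosome], k: int, iterations: int = 20,
           seed: int = 0) -> Tuple[List[List[float]], List[int]]:
    """Plain k-means; returns (centroids, cluster label of each point)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(list(points), k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Update step: move each non-empty cluster's centroid to the mean.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def cluster_based_fitness(population: Sequence[Chromosome], k: int,
                          expensive_score: Callable[[Chromosome], float],
                          cheap_score: Callable[[Chromosome], float],
                          seed: int = 0) -> List[float]:
    """Evaluate the expensive score once per cluster, not once per chromosome."""
    centroids, labels = kmeans(population, k, seed=seed)
    rep_score = {}
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if not members:
            continue  # empty cluster: nothing to evaluate
        # Representative: the member closest to the cluster centroid.
        rep = min(members,
                  key=lambda i: squared_distance(population[i], centroids[c]))
        rep_score[c] = expensive_score(population[rep])
    # Assumed combination: shared cluster score divided by the chromosome's
    # own suitability (cheap_score must return a positive value).
    return [rep_score[labels[i]] / cheap_score(ch)
            for i, ch in enumerate(population)]
```

With a population of n chromosomes and k clusters, this sketch calls `expensive_score` at most k times per generation instead of n times, which is the kind of saving in database scans that the abstract attributes to clustering.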

Key words: cluster algorithm, association rules, fuzzy set, genetic algorithm, k-means algorithm

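Abstract (translated from the Chinese): By analyzing the characteristics of continuous attribute data and existing association-rule mining algorithms, further research is carried out on the accuracy of quantitative description and the efficiency of the algorithm. To address the slow speed of the existing fuzzy genetic mining algorithm, which computes a chromosome's fitness value by combining large 1-itemsets with membership-function values, a clustering-based fuzzy genetic association-rule mining algorithm is proposed. The algorithm uses fuzzy genetic principles to extract association rules and membership functions simultaneously from transaction data. It also uses the k-means clustering algorithm to group the chromosomes in the population and evaluates the fitness of each chromosome from the resulting cluster information and the chromosome's own information, thereby reducing the number of database scans. Test results show that the algorithm is fast and accurate.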

