Computer Engineering and Applications ›› 2010, Vol. 46 ›› Issue (20): 129-132. DOI: 10.3778/j.issn.1002-8331.2010.20.037

• Artificial Intelligence •

Constrained association rule mining algorithm for travel data

WU Bin, MA Chao

  1. School of Computer Science, Beijing University of Posts & Telecommunications, Beijing 100876, China
  • Received: 2010-04-14 Revised: 2010-05-28 Online: 2010-07-11 Published: 2010-07-11
  • Contact: WU Bin

Abstract: As the tourism industry develops, mining the inherent, hidden correlations between traveler types and environmental factors from massive travel data is an effective way to analyze the tourism market and to predict its influence on related industries. Taking the characteristics of travel data into account, and addressing the limitations of existing constraint methods, this paper proposes a parallel association rule mining algorithm based on an association-extended route constraint. The algorithm combines the MapReduce parallel mechanism with this constraint in two ways. First, it generates the transaction sets under the association-extended route constraint, which improves the efficiency of the subsequent parallel processing. Second, it uses parallel methods to improve the level-wise search of the Apriori algorithm, which brings a "second" efficiency improvement. Together these allow developments in the tourism industry to be tracked better and faster, and macro-level tourism policy to be adjusted accordingly.

Key words: association-extended, route constraint, association rule, parallel computing
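
The two ideas named in the abstract, constraint-filtered transaction sets and a parallelized level-wise Apriori search, can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not define the association-extended route constraint, so the `keep` predicate below is a hypothetical placeholder for it, and the map and reduce phases run sequentially in one process rather than on a MapReduce cluster. All function names and the sample data are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def map_phase(transaction, candidates):
    """Map step: emit (candidate, 1) for every candidate itemset in the transaction."""
    items = set(transaction)
    return [(c, 1) for c in candidates if c <= items]


def reduce_phase(pairs):
    """Reduce step: sum the emitted counts per candidate itemset."""
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return counts


def next_candidates(frequent, k):
    """Build k-itemsets whose (k-1)-subsets are all frequent (Apriori pruning)."""
    items = sorted(set().union(*frequent)) if frequent else []
    return {
        frozenset(c)
        for c in combinations(items, k)
        if all(frozenset(s) in frequent for s in combinations(c, k - 1))
    }


def apriori(transactions, min_support, keep=lambda t: True):
    """Level-wise frequent itemset search with an absolute support threshold.

    `keep` is a hypothetical stand-in for the paper's constraint: it decides
    which transactions enter the transaction set before any counting happens.
    """
    transactions = [t for t in transactions if keep(t)]
    candidates = {frozenset([i]) for t in transactions for i in t}
    result = {}
    k = 1
    while candidates:
        # On a real cluster each transaction split would be mapped in parallel.
        pairs = [p for t in transactions for p in map_phase(t, candidates)]
        counts = reduce_phase(pairs)
        frequent = {c for c, n in counts.items() if n >= min_support}
        result.update({c: counts[c] for c in frequent})
        k += 1
        candidates = next_candidates(frequent, k)
    return result


# Invented sample data: traveler type, day type, transport mode.
trips = [
    ["business", "weekday", "air"],
    ["business", "weekday", "rail"],
    ["leisure", "weekend", "air"],
    ["business", "weekday", "air"],
]
frequent_itemsets = apriori(trips, min_support=2)
```

Filtering with `keep` before the first pass shrinks the data that every later level has to scan, which is one plausible reading of why a constrained transaction set would help the subsequent parallel steps.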
