Computer Engineering and Applications (计算机工程与应用) ›› 2007, Vol. 43 ›› Issue (4): 190-193.

• Database and Information Processing •

MFPSM: A Maximal Frequent Page Set Mining Algorithm Based on Bi-directional Constraints

Ren Jiadong (任家东), Zhang Xiaojian (张啸剑), Peng Huili (彭慧丽)

  1. College of Information Science and Engineering, Yanshan University, Qinhuangdao, Hebei
  • Received: 2006-03-06 Published: 2007-02-01 Released: 2007-02-01
  • Corresponding author: Zhang Xiaojian (张啸剑)

MFPSM: An Efficient Algorithm for Mining Maximum Frequent PageSets Based on Bi-directional Constraint

  • Received: 2006-03-06 Online: 2007-02-01 Published: 2007-02-01

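Abstract (translated from the Chinese): Mining maximal frequent page sets is one of the key applications in Web usage mining. Because the sessions within a given time period carry users' access patterns and access motivations, a frequent page tree structure, FPDT-Tree, is designed; it resembles the FP-tree, but its nodes carry dwell times. The FPDT-Tree stores the session database under bi-directional dwell-time constraints, which simplifies the setting of dwell-time thresholds during mining. Based on the FPDT-Tree, the MFPSM algorithm is proposed to mine the maximal frequent page sets in sessions. Experimental results show that, in a time-constrained environment, the algorithm effectively shortens the time needed to mine maximal frequent page sets, provided the decision-maker supplies appropriate time-constraint thresholds.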

Keywords (translated from the Chinese): maximal frequent page set, session, dwell time, frequent page tree

Abstract: Mining maximum frequent page sets is key to Web usage mining applications, since users' traversal patterns and motivations are latent in the sessions of a given time segment. Following the FP-tree, a Frequent Page Tree structure with Dwell Time (abbreviated as FPDT-Tree) is designed. The FPDT-Tree stores and compresses the session database constrained by the bi-directional dwell time, and simplifies the configuration of dwell time thresholds in mining. A new algorithm called Maximum Frequent Page Set Mining (MFPSM) is presented, which quickly traverses the FPDT-Tree and discovers maximum frequent page sets from the sessions. Experimental results show that, in a time-constrained environment, MFPSM can significantly reduce execution time as long as the decision-maker (user) gives appropriate dwell time constraints.

Key words: Maximum frequent page set, Session, Dwell time, FPDT-Tree
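
The abstract does not spell out the FPDT-Tree or the MFPSM procedure, so the following is only a minimal sketch of the two notions it names: a dwell-time constraint on session pages and a maximal frequent page set. It assumes that "bi-directional" means a lower and an upper bound on dwell time, which the abstract does not confirm. It uses a brute-force search over candidate page sets, not the tree-based MFPSM algorithm. The session data and the function names `apply_dwell_constraint` and `maximal_frequent_page_sets` are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: each session is a list of (page, dwell_seconds) pairs.
SESSIONS = [
    [("A", 30), ("B", 45), ("C", 2)],
    [("A", 20), ("B", 60), ("D", 500)],
    [("A", 25), ("B", 15), ("C", 40)],
    [("B", 35), ("C", 50)],
]


def apply_dwell_constraint(sessions, min_dwell, max_dwell):
    """Keep only pages whose dwell time lies in [min_dwell, max_dwell].

    Assumption: the two-sided bound is one reading of the paper's
    "bi-directional" constraint; the abstract does not define it.
    """
    return [
        frozenset(page for page, dwell in session if min_dwell <= dwell <= max_dwell)
        for session in sessions
    ]


def maximal_frequent_page_sets(page_sets, min_support):
    """Brute force (not MFPSM): frequent page sets with no frequent proper superset."""
    pages = sorted(set().union(*page_sets))
    frequent = []
    for size in range(1, len(pages) + 1):
        for candidate in combinations(pages, size):
            cand = frozenset(candidate)
            # Support = number of filtered sessions that contain every page in cand.
            support = sum(1 for s in page_sets if cand <= s)
            if support >= min_support:
                frequent.append(cand)
    # A frequent set is maximal when no other frequent set strictly contains it.
    return [f for f in frequent if not any(f < g for g in frequent)]


filtered = apply_dwell_constraint(SESSIONS, min_dwell=10, max_dwell=120)
result = maximal_frequent_page_sets(filtered, min_support=2)
print(sorted(sorted(s) for s in result))  # [['A', 'B'], ['B', 'C']]
```

In this toy run, page C in the first session (2 seconds) and page D in the second (500 seconds) fall outside the bounds and are dropped before counting. The exhaustive enumeration is exponential in the number of pages, which is the cost a compressed tree structure such as the FPDT-Tree is meant to avoid.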