Computer Engineering and Applications ›› 2017, Vol. 53 ›› Issue (6): 85-90. DOI: 10.3778/j.issn.1002-8331.1508-0105

• Big Data and Cloud Computing •

An OLAP cache mechanism based on the naive Bayes algorithm

MAN Yi (满毅), ZHANG Jiongmin (章炯民), XU Xiaojin (徐晓锦)

  1. School of Computer Science and Technology, East China Normal University, Shanghai 200241, China
  • Online: 2017-03-15 Published: 2017-05-11

Abstract: In the era of big data, caching has been widely studied as an effective technique for improving data processing performance. Most existing cache mechanisms store query results as files, so the partial data held in the cache can rarely be reused; the hit rate is low and cache resources are wasted. Building on cache techniques from China and abroad and on users' query habits, this paper designs a new data warehouse cache mechanism based on an incremental naive Bayes algorithm. The mechanism decides, according to users' operating habits, whether the result of each query should be cached, and thereby increases the cache hit rate. Experiments verify the effectiveness of the mechanism in terms of both average query time and cache hit rate.

Key words: On-Line Analytical Processing (OLAP), cache, OLAP cache, naive Bayesian algorithm, caching mechanism, data warehouse
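
The abstract does not give the features, labels, or update rule used by the mechanism. The following is a minimal sketch of the general idea only, assuming each query is described by three hypothetical categorical features (cube, granularity, OLAP operation) and labelled by whether its cached result was later reused. The class `IncrementalNaiveBayes`, the function `should_cache`, and the sample `history` are illustrative names and data, not taken from the paper.

```python
import math
from collections import defaultdict


class IncrementalNaiveBayes:
    """Categorical naive Bayes whose counts are updated one example at a time."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                     # Laplace smoothing strength
        self.class_counts = defaultdict(int)   # label -> number of examples
        # (feature index, label) -> {feature value -> count}
        self.value_counts = defaultdict(lambda: defaultdict(int))
        self.vocab = defaultdict(set)          # feature index -> values seen

    def update(self, features, label):
        """Fold one observed (features, label) pair into the counts."""
        self.class_counts[label] += 1
        for i, value in enumerate(features):
            self.value_counts[(i, label)][value] += 1
            self.vocab[i].add(value)

    def log_scores(self, features):
        """Return the unnormalised log posterior of every label seen so far."""
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            score = math.log(n / total)
            for i, value in enumerate(features):
                count = self.value_counts[(i, label)].get(value, 0)
                # The extra +1 reserves probability mass for unseen values.
                denom = n + self.alpha * (len(self.vocab[i]) + 1)
                score += math.log((count + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, features, default=True):
        """Return the most probable label, or `default` before any training."""
        scores = self.log_scores(features)
        if not scores:
            return default
        return max(scores, key=scores.get)


def should_cache(model, query_features):
    """Admit a query result to the cache only if reuse is predicted."""
    return model.predict(query_features, default=True)


# Hypothetical history: (cube, granularity, operation) -> was the result reused?
history = [
    (("sales", "month", "drill_down"), True),
    (("sales", "month", "roll_up"), True),
    (("sales", "day", "slice"), False),
    (("inventory", "day", "slice"), False),
]

model = IncrementalNaiveBayes()
for features, reused in history:
    model.update(features, reused)   # incremental: no retraining from scratch

cache_monthly = should_cache(model, ("sales", "month", "drill_down"))
cache_daily = should_cache(model, ("inventory", "day", "slice"))
```

Because the model only keeps counts, each new observation of reuse or non-reuse can be folded in with a single `update` call, which is what makes this style of classifier convenient for tracking changing user habits. With an untrained model the sketch defaults to caching, a design choice made here for illustration.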