Computer Engineering and Applications ›› 2017, Vol. 53 ›› Issue (11): 90-94. DOI: 10.3778/j.issn.1002-8331.1603-0290


Data warehouse of QAR based on Hive

FENG Xingjie, WU Xiyu, ZHAO Jie, HE Yang, FANG Shu   

  1. College of Computer Science & Technology, Civil Aviation University of China, Tianjin 300300, China
  • Online: 2017-06-01 Published: 2017-06-13

QAR数据仓库在Hive中的构建

冯兴杰,吴稀钰,赵  杰,贺  阳,房  戍   

  1. 中国民航大学 计算机科学与技术学院,天津 300300

Abstract: Analyzing QAR data is an effective way to monitor aircraft condition. However, with the rapid development of civil aviation, the volume of QAR data is growing sharply. The existing QAR data warehouse, built on a relational database, cannot support storage and analysis at this scale, so large amounts of QAR data go unprocessed and become information garbage. To address this deficiency, this paper proposes a QAR data warehouse based on Hive. Based on an analysis of Hive's features and the structure of QAR data, the overall architecture and storage structure of the Hive-based QAR data warehouse are designed. Migrating the data in the existing data warehouse to the Hive-based warehouse achieves compatibility with the existing data warehouse. Experimental results show that, as the volume of QAR data increases sharply, the processing time of the Hive-based QAR data warehouse still grows linearly.

Key words: Hive, Quick Access Recorder (QAR), data warehouse, data processing, Hadoop

摘要: 分析QAR数据是一种非常有效的监控飞机状态的方法。但随着民航领域的快速发展,QAR数据的规模急剧增大,现有基于关系型数据库的QAR数据仓库不足以支撑海量数据下的存储与分析,导致海量的QAR数据因无法处理变成了信息垃圾。因此,针对现有数据仓库的不足,提出基于Hive的QAR数据仓库。通过对Hive特点及QAR数据结构分析,设计了基于Hive的QAR数据仓库的总体架构和存储结构。通过将现有数据仓库中的数据移植到基于Hive的QAR数据仓库,实现了对已有数据仓库的兼容。实验结果表明基于Hive的QAR数据仓库在面对QAR数据剧增的情况下,处理所需时间依然保持着线性增长。

关键词: Hive, 快速存取记录器(QAR), 数据仓库, 数据处理, Hadoop