Computer Engineering and Applications, 2018, Vol. 54, Issue (8): 7-13. DOI: 10.3778/j.issn.1002-8331.1801-0246


Spark-based FP-Growth companion vehicles discovery and application

LIU Huihui, ZHANG Zuping, LONG Zhe   

  1. School of Information Science and Engineering, Central South University, Changsha 410083, China
  • Online: 2018-04-15  Published: 2018-05-02

Abstract:

With the extensive application of big data technology in traffic management, detecting companion vehicles in massive license plate data has attracted researchers' attention. However, most current methods run inefficiently on large data volumes and remain at the theoretical research stage, without being combined with practical applications. This paper presents a novel approach for this application. The Spark distributed parallel computing framework is used to improve running speed, the data is equalized according to the load balancing principle, and a companion vehicle discovery algorithm based on an improved FP-Growth is then proposed. Confidence is used to post-process the results, which excludes random companion cases and improves detection accuracy. The method is applied in the Changsha Traffic Police Major Traffic Control Center System, in which massive license plate recognition data is stored in a Hive database on the Hadoop big data platform and the analysis results are visualized on the police PGIS (Police Geographic Information System). Experiments demonstrate the efficiency and feasibility of the method.

Key words: companion vehicles, Spark computing framework, FP-Growth algorithm, random companion, slice companion
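
The confidence post-processing step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain Python rather than Spark, and it swaps in brute-force pair counting for the improved FP-Growth, which only suits small inputs. The function name `find_companions`, the record layout `(plate, checkpoint, timestamp)`, the fixed time buckets, the threshold values, and the choice to take the weaker of the two directional confidences are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def find_companions(records, window_s=300, min_support=3, min_confidence=0.6):
    """Return (plate_a, plate_b, support, confidence) for likely companions.

    records: iterable of (plate, checkpoint_id, unix_timestamp) tuples.
    All parameter names and defaults are hypothetical.
    """
    # Assumption: a "transaction" is the set of plates captured at the same
    # checkpoint within the same fixed time bucket. Plates that straddle a
    # bucket boundary are split, which a real system would need to handle.
    transactions = defaultdict(set)
    for plate, checkpoint, ts in records:
        transactions[(checkpoint, ts // window_s)].add(plate)

    plate_count = defaultdict(int)  # number of transactions containing a plate
    pair_count = defaultdict(int)   # number of transactions containing both plates
    for plates in transactions.values():
        for plate in plates:
            plate_count[plate] += 1
        # Brute-force pair counting stands in for FP-Growth here.
        for a, b in combinations(sorted(plates), 2):
            pair_count[(a, b)] += 1

    result = []
    for (a, b), support in pair_count.items():
        if support < min_support:
            continue
        # confidence(a -> b) = support(a, b) / support(a). Assumption: keep
        # the pair only if the weaker direction still passes the threshold,
        # so a plate seen everywhere does not pair up with everyone.
        confidence = min(support / plate_count[a], support / plate_count[b])
        if confidence >= min_confidence:
            result.append((a, b, support, confidence))
    return sorted(result)


# Made-up data: A and B travel together through four checkpoints. D shares
# three of them but is also seen alone at seven others (a random companion).
sample = []
for i, checkpoint in enumerate(["c1", "c2", "c3", "c4"]):
    sample.append(("A", checkpoint, i * 1000))
    sample.append(("B", checkpoint, i * 1000 + 10))
    if checkpoint != "c4":
        sample.append(("D", checkpoint, i * 1000 + 20))
for j in range(7):
    sample.append(("D", "x%d" % j, 4000 + j * 1000))

companions = find_companions(sample)
```

In this made-up data the pairs (A, D) and (B, D) meet the support threshold but fall below the confidence threshold, so only (A, B) is kept. That is the sense in which a confidence filter can exclude random companions.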
