Computer Engineering and Applications (计算机工程与应用), 2019, Vol. 55, Issue (4): 79-83. DOI: 10.3778/j.issn.1002-8331.1711-0405

• Big Data and Cloud Computing •

Fingerprint Localization Data Processing Method Based on Spark

CHEN Xining1,2, MA Weiyin3, LI Li4   

  1. Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
  2. Guizhou Provincial Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
  3. School of Basic Medical Sciences, Nanjing Medical University, Nanjing 211166, China
  4. School of Software, Shanghai Jiao Tong University, Shanghai 200240, China
  • Online:2019-02-15 Published:2019-02-19

Abstract: Fingerprint localization is a simple and efficient wireless localization technique. It is not affected by interference from multipath effects and reflections of wireless signals, and it achieves good localization accuracy. However, it requires a large offline fingerprint database, and as that database grows, traditional fingerprint localization algorithms struggle to meet the real-time requirements of big data applications. Combining the characteristics of fingerprint localization algorithms with the in-memory computing advantages of the Spark engine, a Spark-based fingerprint localization data processing method is designed and implemented. In the Map phase, the K nearest neighbors of the query point are found within each partition. In the Reduce phase, the per-partition K nearest neighbors are merged to obtain the global K nearest neighbors. Finally, the localization coordinates are obtained as a weighted mean over these neighbors. Cluster experiments show that the method achieves good speedup at a certain degree of parallelism and is capable of real-time localization processing on large-scale fingerprint databases.

Key words: wireless localization technology, fingerprint localization, Spark computing engine, weighted KNN, distributed computation
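
The Map and Reduce phases described in the abstract can be illustrated with a minimal sketch in plain Python, with no Spark dependency. It is not the authors' implementation. The fingerprint layout `(rssi_vector, (x, y))`, the Euclidean distance in signal space, the inverse-distance weighting, and all function names are assumptions made for illustration; the abstract only states that a weighted KNN is used.

```python
import heapq
import math
from functools import reduce

# Assumed layout: each fingerprint is (rssi_vector, (x, y)).

def local_knn(partition, query, k):
    """Map step: the k fingerprints of one partition closest to the query."""
    scored = ((math.dist(rssi, query), pos) for rssi, pos in partition)
    return heapq.nsmallest(k, scored, key=lambda item: item[0])

def merge_knn(a, b, k):
    """Reduce step: keep the k closest candidates of two partial results."""
    return heapq.nsmallest(k, a + b, key=lambda item: item[0])

def weighted_position(neighbors, eps=1e-9):
    """Inverse-distance weighted mean of the neighbor coordinates (assumed weighting)."""
    weights = [1.0 / (d + eps) for d, _ in neighbors]
    total = sum(weights)
    x = sum(w * pos[0] for w, (_, pos) in zip(weights, neighbors)) / total
    y = sum(w * pos[1] for w, (_, pos) in zip(weights, neighbors)) / total
    return x, y

def locate(partitions, query, k):
    """Estimate the query position from a non-empty list of partitions."""
    partial = [local_knn(p, query, k) for p in partitions]          # Map
    global_knn = reduce(lambda a, b: merge_knn(a, b, k), partial)   # Reduce
    return weighted_position(global_knn)

# Hypothetical data: two partitions, two access points per fingerprint.
partitions = [
    [((-40.0, -70.0), (0.0, 0.0)), ((-60.0, -50.0), (10.0, 0.0))],
    [((-42.0, -68.0), (1.0, 1.0)), ((-80.0, -30.0), (20.0, 5.0))],
]
estimate = locate(partitions, query=(-41.0, -69.0), k=2)
```

Because each partition returns at most K candidates, the Reduce step only ever merges short lists, which is what keeps the global selection cheap as the database grows. On a Spark cluster the two steps would map naturally onto operators such as `RDD.mapPartitions` and `RDD.reduce`, though the abstract does not say which operators the authors used.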