Computer Engineering and Applications ›› 2016, Vol. 52 ›› Issue (7): 264-270.


Recommendation algorithm for passengers based on Hadoop and trajectory data

JING Weipeng1, HU Likun1,2   

  1. College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China
  2. Heilongjiang Province Engineering Technology Research Centre for Forestry Ecological Big Data Storage and High Performance (Cloud) Computing, Harbin 150040, China
  • Online: 2016-04-01 Published: 2016-04-19

Passenger recommendation algorithm based on Hadoop and historical taxi trajectories

JING Weipeng1,2, HU Likun1,2

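  1. College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040
  2. Heilongjiang Province Engineering Technology Research Centre for Forestry Ecological Big Data Storage and High Performance (Cloud) Computing, Harbin 150040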

Abstract: To improve the efficiency of passenger recommendation algorithms for smart cities, this paper uses classical probability to estimate the probability that a waiting passenger finds a vacant taxi: the proportion of days in the dataset on which a vacant taxi passed the given road segment in the given time period. It then uses least squares to fit the arrival-rate curve, that is, the relation between time and the number of vacant taxis arriving, and uses the fitted curve to predict how long a passenger will wait for a vacant taxi after reaching the road. Three further measures improve efficiency: Hadoop serves as the data storage and computing platform to increase data-processing capacity; a new road-network storage structure based on map rasterization speeds up map search; and a map-matching algorithm based on computational geometry is modified to improve matching accuracy. Field tests show that the accuracy of the vacant-taxi probability recommendation can reach 87%, while the accuracy of the waiting-time recommendation is about 88.4%, which indicates that mining trajectory data to provide recommendation services for passengers is feasible.
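The two estimates described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the per-day boolean flags, the choice of a straight-line fit to cumulative arrivals, and the reading of waiting time as the reciprocal of the fitted slope are all assumptions made for illustration.

```python
def vacant_taxi_probability(day_flags):
    """Classical probability: share of days on which a vacant taxi passed.

    day_flags holds one boolean per day in the dataset (hypothetical input)
    for a single road segment and time period.
    """
    if not day_flags:
        raise ValueError("need at least one day of data")
    return sum(1 for seen in day_flags if seen) / len(day_flags)


def fit_line(ts, ns):
    """Ordinary least-squares fit of n = a + b * t; returns (a, b)."""
    m = len(ts)
    if m < 2 or m != len(ns):
        raise ValueError("need two or more paired samples")
    t_mean = sum(ts) / m
    n_mean = sum(ns) / m
    sxx = sum((t - t_mean) ** 2 for t in ts)
    if sxx == 0:
        raise ValueError("all time values are identical")
    sxy = sum((t - t_mean) * (n - n_mean) for t, n in zip(ts, ns))
    b = sxy / sxx
    a = n_mean - b * t_mean
    return a, b


def expected_wait(slope):
    """Assumed model: time for the fitted cumulative count to grow by one taxi."""
    if slope <= 0:
        raise ValueError("fitted arrival rate must be positive")
    return 1.0 / slope


# Hypothetical data: 30 days, vacant taxi observed on 26 of them.
flags = [True] * 26 + [False] * 4
probability = vacant_taxi_probability(flags)

# Hypothetical cumulative vacant-taxi arrivals sampled every 5 minutes.
minutes = [0, 5, 10, 15, 20]
cumulative = [0, 2, 4, 6, 8]
intercept, rate = fit_line(minutes, cumulative)
wait_minutes = expected_wait(rate)
```

With these made-up inputs the fitted rate is 0.4 taxis per minute, so the sketch predicts a wait of 2.5 minutes. A higher-degree polynomial could replace the straight line if the arrival curve is visibly nonlinear.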

Key words: Hadoop, trajectory data, recommendation algorithm, recommendation service for passengers

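Abstract: To address the low efficiency of algorithms that recommend taxi-hailing strategies to passengers in smart cities, classical probability is used to compute, from historical trajectories, the proportion of days in the dataset on which a vacant taxi was present on a given road segment at a given time; this proportion serves as the probability that a passenger finds a vacant taxi. The least squares method is used to fit the curve of time against the number of arriving vacant taxis and to predict the time a passenger waits for a vacant taxi, improving recommendation efficiency. In addition, Hadoop is used as the data storage and computing platform to improve data-processing capacity; a road-network storage structure based on map rasterization is proposed to speed up map search; and a map-matching algorithm based on computational geometry is improved to raise matching accuracy. Experimental results show that the accuracy of the vacant-taxi probability recommendation algorithm is about 87% and that of the waiting-time recommendation algorithm reaches 88.4%, demonstrating the feasibility of mining trajectory data to provide recommendation services for passengers.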
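The idea of a rasterized road-network store can be sketched as a uniform grid index. This is only an illustration of the general technique, not the storage structure proposed in the paper: the class name `GridIndex`, the cell size, the planar coordinates in metres, and the bounding-box registration of segments are all assumptions.

```python
import math
from collections import defaultdict


class GridIndex:
    """Uniform grid over planar coordinates (hypothetical structure)."""

    def __init__(self, cell_size):
        if cell_size <= 0:
            raise ValueError("cell_size must be positive")
        self.cell_size = cell_size
        self.cells = defaultdict(set)  # (col, row) -> set of segment ids

    def _cell(self, x, y):
        """Map a point to the integer (col, row) of its grid cell."""
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def add_segment(self, seg_id, x1, y1, x2, y2):
        """Register a segment in every cell its bounding box overlaps."""
        col_lo, row_lo = self._cell(min(x1, x2), min(y1, y2))
        col_hi, row_hi = self._cell(max(x1, x2), max(y1, y2))
        for col in range(col_lo, col_hi + 1):
            for row in range(row_lo, row_hi + 1):
                self.cells[(col, row)].add(seg_id)

    def candidates(self, x, y):
        """Segment ids in the point's cell and its eight neighbours."""
        col, row = self._cell(x, y)
        found = set()
        for dc in (-1, 0, 1):
            for dr in (-1, 0, 1):
                found |= self.cells.get((col + dc, row + dr), set())
        return found


# Hypothetical road segments, coordinates in metres.
index = GridIndex(cell_size=100.0)
index.add_segment("A", 10, 10, 250, 10)
index.add_segment("B", 900, 900, 950, 950)

near_a = index.candidates(120, 40)
nowhere = index.candidates(500, 500)
```

A map-matching step would then compute exact point-to-segment distances only for the returned candidates instead of scanning the whole network. Registering by bounding box is a simplification that over-includes cells for long diagonal segments.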

Key words: Hadoop, trajectory data, recommendation algorithm, passenger recommendation service