Prediction of probability of hitting vacant taxi and waiting time based on empirical distribution

Computer Engineering and Applications ›› 2015, Vol. 51 ›› Issue (24): 254-259.

WANG Zhaoyuan1,2, LI Tianrui1,2, CHENG Yao3, WANG Yue1,2, YI Xiuwen1,2

1. School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610031, China
2. Key Laboratory of Cloud Computing and Intelligent Technology, Sichuan Province, Chengdu 610031, China
3. School of Mathematics, Southwest Jiaotong University, Chengdu 610031, China

Online: 2015-12-15; Published: 2015-12-30


Abstract

This paper presents an approach for predicting the probability that a passenger hits a vacant taxi, and the passenger's waiting time, at a specified location and time. The map is discretized and feature points are used to repair GPS trajectories, a scheme that suits big-data problems. From the repaired trajectories, a model based on the empirical distribution computes the probability of hitting a vacant taxi and the waiting time at each feature point and time point; predictions for a user-specified location and time are then derived from this model. In addition, an incremental learning method built on the model is given. Large-scale GPS trajectory data are managed and analyzed on the Hadoop platform, which demonstrates the feasibility of the proposed solution. Simulation experiments show that the model is accurate, and its accuracy can be expected to improve as the amount of data grows. The reference table of probabilities and waiting times for the feature points and time points does not grow with the volume of GPS trajectory data, which indicates that the model scales well.

Key words: taxi trajectory, prediction of probability of hitting vacant taxi, prediction of waiting time, Hadoop
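The empirical-distribution idea summarized in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the grid size, the slot length, the `Stats` record, and the functions `key_for`, `update`, and `predict` are all hypothetical, and the abstract does not give the paper's actual formulas. The sketch assumes each observation records whether a vacant taxi passed a location during a time slot and, if so, how long the wait was. It keeps one fixed-size record per (feature point, time point), so new data only updates counters.

```python
import math
from collections import defaultdict
from dataclasses import dataclass

CELL_DEG = 0.01   # hypothetical grid cell size in degrees (assumption)
SLOT_MIN = 30     # hypothetical time-slot length in minutes (assumption)


@dataclass
class Stats:
    trials: int = 0        # observations seen for this (cell, slot)
    hits: int = 0          # observations in which a vacant taxi arrived
    wait_sum: float = 0.0  # summed waiting time in minutes over the hits


def key_for(lat, lon, minute_of_day):
    """Map a location and time to a (cell, slot) key of the reference table."""
    cell = (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))
    return cell, int(minute_of_day // SLOT_MIN)


def update(table, observations):
    """Incrementally fold new observations into the table.

    Each observation is (lat, lon, minute_of_day, wait), where wait is the
    waiting time in minutes, or None if no vacant taxi arrived.
    """
    for lat, lon, minute, wait in observations:
        stats = table[key_for(lat, lon, minute)]
        stats.trials += 1
        if wait is not None:
            stats.hits += 1
            stats.wait_sum += wait


def predict(table, lat, lon, minute_of_day):
    """Return (empirical probability, mean wait) or None if there is no data."""
    stats = table.get(key_for(lat, lon, minute_of_day))
    if stats is None or stats.trials == 0:
        return None
    prob = stats.hits / stats.trials
    mean_wait = stats.wait_sum / stats.hits if stats.hits else None
    return prob, mean_wait


table = defaultdict(Stats)
# First batch of (made-up) observations, then a later incremental batch.
update(table, [(30.675, 104.065, 485, 4.0), (30.675, 104.065, 490, None)])
update(table, [(30.676, 104.066, 500, 6.0), (30.675, 104.065, 505, 2.0)])
print(predict(table, 30.675, 104.065, 495))
```

Because the table is keyed by cell and slot rather than by individual trajectory points, its size in this sketch is bounded by the number of cells times the number of slots, which mirrors the scalability claim in the abstract. On a Hadoop cluster the same counters could plausibly be accumulated in a reduce step, though the abstract does not describe how the authors structured their jobs.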


Cite this article: WANG Zhaoyuan, LI Tianrui, CHENG Yao, WANG Yue, YI Xiuwen. Prediction of probability of hitting vacant taxi and waiting time based on empirical distribution[J]. Computer Engineering and Applications, 2015, 51(24): 254-259.


