Big Data Cleaning Method for Bus Based on Spatiotemporal Correlation
XIE Zhiying, HE Yuanrong, LI Qingquan
1. School of Computing and Information Engineering, Xiamen University of Technology, Xiamen, Fujian 361024, China
2. Shenzhen Key Laboratory of Spatial Smart Sensing and Services, Shenzhen University, Shenzhen, Guangdong 518060, China
XIE Zhiying, HE Yuanrong, LI Qingquan. Big Data Cleaning Method for Bus Based on Spatiotemporal Correlation[J]. Computer Engineering and Applications, 2022, 58(1): 113-121.