Computer Engineering and Applications ›› 2020, Vol. 56 ›› Issue (16): 241-247. DOI: 10.3778/j.issn.1002-8331.1905-0259
WANG Xi, YUAN Shaoxin
王晰,袁绍欣
Abstract:
Travel time samples for a specific road and time period can be obtained from Automatic Number Plate Recognition (ANPR) data. However, such samples often contain a considerable amount of noise travel time that does not represent usual traffic conditions. Filtering out this noise yields the valid travel time that does represent usual traffic conditions. To this end, an algorithm is presented that fits the distribution of the data samples with a log-normal mixture model. Two criteria for the optimal number K of component distributions are given, based on the right-tail characteristic of the noise travel time distribution, so that the two types of travel time are classified as well as possible, the noise travel time is identified, and the valid travel time is extracted. For the small number of samples in which the characteristics of noise travel time are weak, the algorithm takes the data between the 10th and 90th percentiles as the valid travel time. Experiments on actual ANPR data from buses and non-bus vehicles show that the algorithm identifies noise travel time well. The results show clear differences, before and after filtering, in the mean and standard deviation of travel time for non-bus vehicles and in the standard deviation of travel time for buses. This indicates that, without filtering out the noise travel time, the running state of the two traffic modes under usual traffic conditions would be misjudged.
Key words: travel time, noise travel time, mixture model, log-normal distribution, data clustering
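The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the two criteria for choosing K, so the Bayesian information criterion (BIC) is swapped in as a stand-in, and the rules "the heaviest component is the valid travel time" and "K = 1 means weak noise characteristics" are assumptions made only for illustration. All function names (`fit_lognormal_mixture`, `percentile_band`, `extract_valid`) are hypothetical. The sketch relies on the fact that a log-normal mixture over travel times is a normal mixture over their logarithms.

```python
import math
import statistics


def _densities(x, weights, mus, sigmas):
    """Weighted normal densities of one log travel time under each component."""
    return [w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
            for w, m, s in zip(weights, mus, sigmas)]


def fit_lognormal_mixture(times, k, iters=200):
    """Fit a k-component log-normal mixture by EM on log(times).

    Returns (weights, mus, sigmas, log_likelihood), all in log space.
    """
    logs = [math.log(t) for t in times]
    n, lo, hi = len(logs), min(logs), max(logs)
    width = (hi - lo) / k or 1.0
    mus = [lo + (j + 0.5) * width for j in range(k)]  # centres of k equal bins
    sigmas = [width / 2] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in logs:
            d = _densities(x, weights, mus, sigmas)
            total = sum(d) or 1e-300
            resp.append([v / total for v in d])
        # M-step: re-estimate weight, mean and spread of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-12
            weights[j] = nj / n
            mus[j] = sum(r[j] * x for r, x in zip(resp, logs)) / nj
            var = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, logs)) / nj
            sigmas[j] = max(math.sqrt(var), 1e-2)  # floor avoids collapse
    loglik = sum(math.log(sum(_densities(x, weights, mus, sigmas)) or 1e-300)
                 for x in logs)
    return weights, mus, sigmas, loglik


def percentile_band(times):
    """Fallback rule: keep samples between the 10th and 90th percentiles."""
    deciles = statistics.quantiles(times, n=10)
    return [t for t in times if deciles[0] <= t <= deciles[-1]]


def extract_valid(times, max_k=3):
    """Return the travel times judged valid (assumed rules, see lead-in)."""
    n = len(times)
    fits = {k: fit_lognormal_mixture(times, k) for k in range(1, max_k + 1)}
    # ASSUMPTION: BIC replaces the paper's two criteria for K.
    # A k-component mixture has 3k - 1 free parameters.
    best_k = min(fits, key=lambda k: -2 * fits[k][3] + (3 * k - 1) * math.log(n))
    if best_k == 1:
        # ASSUMPTION: a single component signals weak noise characteristics.
        return percentile_band(times)
    weights, mus, sigmas, _ = fits[best_k]
    main = weights.index(max(weights))  # ASSUMPTION: heaviest component is valid
    valid = []
    for t in times:
        d = _densities(math.log(t), weights, mus, sigmas)
        if d.index(max(d)) == main:
            valid.append(t)
    return valid
```

Calling `extract_valid(times)` on a list of positive travel times (for example, in seconds) returns the subset kept as valid. Fitting in log space keeps the EM updates in closed form, which is the main reason for working with logarithms here.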
摘要:
从车牌识别数据中可以得到车辆在特定道路与特定时间段的旅行时间数据样本,但样本中往往混有不反映通常交通状况的噪音数据,去除这些噪音数据后可得到能够反映通常交通状况的有价值的有效数据。为此提出算法采用对数正态分布混合模型对数据样本进行拟合,并利用噪音数据具有右向尾部的分布特点给出确定最优子分布数量的两个判据,使两类数据具有最佳的聚类效果,从而能识别和提取出有效数据。算法对噪音数据特征不明显的少量数据样本也给出了提取方法,将第10百分位和第90百分位之间的数据作为有效数据。该算法针对公交车和非公交车两类车型的车牌识别数据进行实验,对噪音数据的识别取得了良好效果。实验结果表明,有效数据提取前后,非公交车通常状况的旅行时间平均值和标准差以及公交车旅行时间标准差具有明显差异,不滤除噪音数据会对两类车通常交通状况下的运行状态产生误判。
关键词: 旅行时间, 噪音数据, 混合模型, 对数正态分布, 数据聚类
WANG Xi, YUAN Shaoxin. Algorithm for Extracting Valid Travel Time from Automatic Number Plate Recognition Data[J]. Computer Engineering and Applications, 2020, 56(16): 241-247.
王晰,袁绍欣. 从车牌识别数据中提取有效旅行时间算法研究[J]. 计算机工程与应用, 2020, 56(16): 241-247.
URL: http://cea.ceaj.org/EN/10.3778/j.issn.1002-8331.1905-0259
http://cea.ceaj.org/EN/Y2020/V56/I16/241