Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (3): 84-93. DOI: 10.3778/j.issn.1002-8331.2108-0068

• Pattern Recognition and Artificial Intelligence •

Multi-Window Change Point Detection Method Using Density Ratio Estimation

ZHANG Man (张曼), CUI Wenquan (崔文泉)

  1. Department of Statistics and Finance, School of Management, University of Science and Technology of China, Hefei 230026, China
  • Online: 2023-02-01 Published: 2023-02-01

Abstract: Time series change point detection methods based on density ratio estimation are constrained by the width of the time window and therefore recognize only a few types of change points. To address this, a multi-window change point detection method, mDRCPD (multiple window density-ratio change point detection), is proposed by drawing on and extending the dynamic multiple filtering algorithm (MFA). First, the preprocessed time series is partitioned into time windows of several widths, and change points are identified by comparing the distributions of data in adjacent windows; the difference between window distributions is measured by the relative Pearson divergence obtained from density ratio estimation. Second, a change point set is found for each fixed window width, and the resulting sets are merged following the MFA procedure. Simulation experiments and real data analysis show that, compared with single-window change point detection based on density ratio estimation, mDRCPD improves the absolute error, recall, F1 score, and other metrics when detecting mean, variance, and frequency changes in time series with multiple change points. Finally, mDRCPD is applied to the transmission process of COVID-19: piecewise modeling of the transmission rate characterizes the phases of the epidemic and is used to evaluate the effect of national policies on slowing its spread.

Key words: time series, change point detection, density ratio estimation, COVID-19, multiple windows, multiple filtering algorithm
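The pipeline summarized in the abstract (score adjacent windows with a relative Pearson divergence, collect one change point set per window width, then merge the sets) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the estimator or the merging rule, so the sketch assumes a RuLSIF-style least-squares estimate with a Gaussian kernel, a fixed bandwidth `sigma`, regularization `lam`, mixing parameter `alpha`, raw scalar samples in each window, greedy peak picking with a hand-chosen `threshold`, and a merging rule in the spirit of MFA (accept everything from the smallest width, then add a point found at a larger width only if no previously accepted point lies within that width). All function names and default values are hypothetical.

```python
import numpy as np


def rel_pearson_divergence(x, y, alpha=0.1, sigma=1.0, lam=0.1):
    """RuLSIF-style estimate of the alpha-relative Pearson divergence PE_alpha(P_x || P_y).

    Assumption: fixed kernel width and regularization instead of cross-validation.
    """
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    centers = x  # Gaussian kernel centers placed on the numerator samples

    def gram(a):
        sq_dist = ((a[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-sq_dist / (2.0 * sigma ** 2))

    kx, ky = gram(x), gram(y)
    # Regularized least-squares fit of the relative density ratio p / (alpha*p + (1-alpha)*q).
    H = alpha * kx.T @ kx / len(x) + (1.0 - alpha) * ky.T @ ky / len(y)
    h = kx.mean(axis=0)
    theta = np.linalg.solve(H + lam * np.eye(len(centers)), h)
    gx, gy = kx @ theta, ky @ theta  # fitted ratio on both samples
    return (-alpha * np.mean(gx ** 2) / 2.0
            - (1.0 - alpha) * np.mean(gy ** 2) / 2.0
            + np.mean(gx) - 0.5)


def detect_single_window(series, h, threshold, **kw):
    """Change point candidates for one window width h (hypothetical greedy peak picking)."""
    n = len(series)
    scores = np.full(n, -np.inf)
    for t in range(h, n - h + 1):
        left, right = series[t - h:t], series[t:t + h]
        # Symmetrized divergence between the two adjacent windows around t.
        scores[t] = (rel_pearson_divergence(left, right, **kw)
                     + rel_pearson_divergence(right, left, **kw))
    found = []
    while True:
        t = int(np.argmax(scores))
        if scores[t] <= threshold:
            break
        found.append(t)
        scores[max(0, t - h):t + h + 1] = -np.inf  # suppress the h-neighborhood of the peak
    return sorted(found)


def merge_mfa_style(cps_by_width):
    """Merge per-width candidate sets, smallest width first (assumed MFA-like rule)."""
    accepted = []
    for h in sorted(cps_by_width):
        earlier = list(accepted)  # only points accepted at smaller widths can block
        accepted += [c for c in cps_by_width[h]
                     if all(abs(c - a) > h for a in earlier)]
    return sorted(accepted)


def multi_window_sketch(series, widths=(20, 40), threshold=1.0):
    """Run the single-window detector for each width and merge the results."""
    series = np.asarray(series, dtype=float)
    return merge_mfa_style(
        {h: detect_single_window(series, h, threshold) for h in widths})
```

Calling `multi_window_sketch` on a one-dimensional array returns a sorted list of candidate indices. Small widths localize abrupt changes more precisely, while larger widths contribute only candidates that the smaller widths did not already cover. The threshold and widths here are illustrative and would need tuning on real data.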