Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (4): 75-88. DOI: 10.3778/j.issn.1002-8331.2306-0406

• Hot Topics and Reviews •

Survey of Concept Drift Detection and Adaptation Methods

MENG Fanxing, HAN Meng, LI Chunpeng, ZHANG Ruihua, HE Feifei   

  1. School of Computer Science and Engineering, North Minzu University, Yinchuan 750021, China
  • Online: 2024-02-15 Published: 2024-02-15

Abstract: With the rapid growth of data, the characteristics of data streams have become increasingly diverse, which places new demands on research into concept drift handling methods. Earlier detection approaches based on performance degradation cannot cope with settings in which labels are not immediately available, and this has motivated research on label-free detection methods. Research on ensemble and incremental learning has in turn advanced drift adaptation methods. First, drift detection methods are analyzed in detail for two cases, according to whether labels are immediately available, and drift adaptation methods are analyzed on the basis of model composition, with emphasis on ensemble methods. Second, the drift handling methods that follow different approaches are summarized in terms of their core ideas, comparison models, advantages, and limitations, and are then compared with one another as a whole. Finally, future research directions in this field are given, including noise discrimination, threshold setting, and the diversification of metrics.

Key words: drift detection, drift adaptation, label availability, model composition