Computer Engineering and Applications ›› 2020, Vol. 56 ›› Issue (7): 155-161. DOI: 10.3778/j.issn.1002-8331.1812-0188

• Pattern Recognition and Artificial Intelligence •


Incremental Learning Algorithm Based on Heterogeneous Classifier Ensemble

XIONG Lin, TANG Wanmei   

  1. College of Computer and Information Science, Chongqing Normal University, Chongqing 401331, China
  • Online:2020-04-01 Published:2020-03-28


Abstract:

Introducing the idea of ensemble learning into incremental learning can significantly improve learning performance. In recent years, most research on ensemble-based incremental learning has combined multiple homogeneous classifiers through weighted voting, which does not adequately address the stability-plasticity dilemma of incremental learning. To this end, an incremental learning algorithm based on a heterogeneous classifier ensemble is proposed. During training, to make the model more stable, several base classifiers are trained on the new data and added to the heterogeneous ensemble, while a Locality-Sensitive Hashing (LSH) table stores a sketch of the data for nearest-neighbor lookup of test samples. To adapt to changing data, newly acquired data are also used to update the voting weights of the base classifiers in the ensemble. When predicting the class of a test sample, the data in the LSH table similar to that sample serve as a bridge for computing each base classifier's dynamic weight with respect to the sample; the class label is then determined by combining the voting weights and dynamic weights of the base classifiers. Comparative experiments show that the proposed incremental algorithm has high stability and good generalization ability.
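The prediction mechanism described above can be sketched in a few lines: a random-hyperplane LSH table stores training points, the bucket matching a test sample supplies its neighbours, each base classifier's accuracy on those neighbours serves as its dynamic weight, and the final vote multiplies that by the classifier's voting weight. This is a minimal illustrative sketch, not the authors' implementation; all names (`LSHTable`, `dynamic_weight`, `predict_ensemble`) and the choice of random-hyperplane hashing and accuracy-based dynamic weights are assumptions for illustration.

```python
import numpy as np
from collections import defaultdict

class LSHTable:
    """Toy random-hyperplane LSH: points with the same sign pattern
    against a set of random planes land in the same bucket."""
    def __init__(self, dim, n_planes=8, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.normal(size=(n_planes, dim))
        self.buckets = defaultdict(list)  # hash key -> list of (x, label)

    def _hash(self, x):
        return tuple((self.planes @ x > 0).astype(int))

    def add(self, X, y):
        for xi, yi in zip(X, y):
            self.buckets[self._hash(xi)].append((xi, yi))

    def neighbours(self, x):
        return self.buckets.get(self._hash(x), [])

def dynamic_weight(clf, neighbours):
    """One classifier's accuracy on the test sample's LSH neighbours."""
    if not neighbours:
        return 1.0  # no local evidence: fall back to the voting weight alone
    X = np.array([n[0] for n in neighbours])
    y = np.array([n[1] for n in neighbours])
    return float((clf.predict(X) == y).mean())

def predict_ensemble(classifiers, voting_w, table, x, classes):
    """Combine each classifier's voting weight with its dynamic weight."""
    nbrs = table.neighbours(x)
    scores = {c: 0.0 for c in classes}
    for clf, vw in zip(classifiers, voting_w):
        label = clf.predict(x.reshape(1, -1))[0]
        scores[label] += vw * dynamic_weight(clf, nbrs)
    return max(scores, key=scores.get)
```

Because the dynamic weight is recomputed per test sample from its own neighbourhood, a classifier that is globally mediocre but locally accurate can still dominate the vote for that sample, which is the intended effect of the dynamic-weighting step.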

Key words: incremental learning, ensemble learning, Locality-Sensitive Hashing(LSH), heterogeneous classifier ensemble, dynamic weight