Computer Engineering and Applications, 2018, Vol. 54, Issue (3): 142-149. DOI: 10.3778/j.issn.1002-8331.1608-0396


Transfer learning based on compact coding

SHAO Hao   

  1. Shanghai University of International Business and Economics, Shanghai 200336, China
  • Online: 2018-02-01  Published: 2018-02-07

Research on a transfer learning algorithm based on compact coding

邵  浩   

  1. 上海对外经贸大学,上海 200336

Abstract: In real-world applications such as manufacturing, a new task is often related to an existing task. Transfer learning techniques build models for new tasks by extracting useful information from existing models, which reduces the high cost of acquiring labeled data for the target task. However, how to avoid negative transfer, which arises when tasks in a heterogeneous environment follow different distributions, is still an open problem. Unlike traditional methods, which measure either the similarity between tasks or the relatedness of instances but not both, a Transfer Learning method with Compact Coding (TLCC) is proposed under a two-level framework in the inductive transfer learning setting. Specifically, at the macro level, the degree of similarity is represented by the relevant code length of the class boundary of each source task with respect to the target task, obtained through minimum encoding. At the micro level, informative instances of the source tasks are adaptively selected to make the choice of the specific source task more accurate. Extensive experiments on both UCI and text data sets show the effectiveness of the algorithm in terms of classification accuracy.
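
The two-level idea described in the abstract can be illustrated with a small sketch. This is not the TLCC algorithm from the paper, whose encoding scheme is not given here; it is a toy stand-in under stated assumptions. At the macro level, each source task is assumed to be a linear boundary `(w, b)`, and its code length is taken to be the cost of listing the labeled target points it misclassifies (a naive exceptions code). At the micro level, a source instance is kept only if its label agrees with its nearest labeled target point. All names (`code_length`, `select_instances`, `source_a`, `source_b`) and all data are hypothetical.

```python
import math

def predict(w, b, x):
    """Label a point with the hyperplane w.x + b = 0 (returns +1 or -1)."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

def code_length(boundary, target):
    """Bits to encode the target labels given a source boundary.

    Assumed toy code: send the number of exceptions k (log2(n + 1) bits),
    then the index of each misclassified point (log2(n) bits each).
    A shorter code means the source boundary fits the target task better.
    """
    w, b = boundary
    n = len(target)
    k = sum(1 for x, y in target if predict(w, b, x) != y)
    return math.log2(n + 1) + k * math.log2(n)

def nearest_label(x, target):
    """Label of the labeled target point closest to x (squared Euclidean distance)."""
    closest = min(target, key=lambda t: sum((a - c) ** 2 for a, c in zip(x, t[0])))
    return closest[1]

def select_instances(source, target):
    """Micro level: keep source instances whose label matches their nearest target point."""
    return [(x, y) for x, y in source if nearest_label(x, target) == y]

# Hypothetical labeled target data: (features, label).
target = [((1.0, 1.0), 1), ((2.0, 1.5), 1), ((-1.0, -1.0), -1), ((-2.0, -0.5), -1)]

# Hypothetical source-task boundaries (w, b).
boundaries = {
    "source_a": ((1.0, 1.0), 0.0),   # separates the target points perfectly
    "source_b": ((0.0, 1.0), -1.2),  # misclassifies one target point
}

# Macro level: choose the source task with the shortest code length.
best = min(boundaries, key=lambda name: code_length(boundaries[name], target))

# Micro level: filter the instances of the chosen source task.
source_a_instances = [((1.5, 0.5), 1), ((-1.5, -1.0), -1), ((0.9, 1.2), -1)]
selected = select_instances(source_a_instances, target)

print(best, selected)
```

In this toy setup the third source instance is dropped because its label conflicts with the nearby target data, which is the kind of filtering intended to limit negative transfer.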

Key words: compact coding, classification, negative transfer, transfer learning

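Abstract (translated from the Chinese): In practical production settings, a new task is usually related in some way to existing tasks. Transfer learning methods can transfer useful information from existing data sets to the new task, reducing the considerable time and expense of rebuilding a model. However, because the distributions of the tasks differ, the problem of how to avoid negative transfer in a heterogeneous environment has not yet been effectively solved. Both the similarity between data sets and the relatedness between instances need to be measured, yet most traditional methods operate at only one of these levels. A Transfer Learning method with Compact Coding (TLCC) is proposed, with an algorithmic model built on two levels. Specifically, at the data level, the similarity between data sets is represented by the code length of a hyperplane classifier; at the instance level, valuable instances are further selected for transfer, which improves the performance of the algorithm and avoids negative transfer. Experimental results show that the proposed algorithm has a clear advantage over other algorithms and also achieves high accuracy in noisy environments.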

Key words (translated from the Chinese): compact coding, classification, negative transfer, transfer learning