Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (2): 77-86. DOI: 10.3778/j.issn.1002-8331.2303-0035

• Theory, Research and Development •

Domain Adaptation Method Combined with Discriminant Analysis and Distribution Discrepancy Constraints

QIN Jiangwei, TANG Deyu   

  1. College of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou 510006, China
  • Online:2024-01-15 Published:2024-01-15

Abstract: Feature transformations based on global distribution adaptation can lose category structure and local features during domain adaptation. To address this, a domain adaptation method that combines discriminant analysis with distribution discrepancy constraints is proposed. First, a mean distance metric over the domain data distributions is constructed for cross-domain distribution adaptation. Second, a class scatter metric is constructed to preserve the class discriminant structure. Third, different types of discrepancy weights are designed from the local distribution information of the data and used to constrain the domain distribution distance metric and the class scatter metric respectively, so that discriminant preservation and locality preservation are optimized jointly. Finally, the feature transformation that optimizes these metrics projects the source domain and target domain data into a subspace, where the classification task is carried out. During domain adaptation, the proposed method not only reduces the distribution discrepancy between domains but also preserves class discrimination and the local characteristics of the data, which effectively improves the reuse of out-of-domain data. Experimental results on 28 cross-domain classification tasks show that the proposed method outperforms existing related methods on the evaluation metric.

Key words: domain adaptation, intra-class scatter, inter-class scatter, discriminant analysis, distribution discrepancy
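
As a rough illustration of the pipeline the abstract outlines, the Python sketch below learns a linear projection from a domain mean-distance term and class scatter matrices. It is a simplified stand-in under stated assumptions, not the authors' formulation. The abstract does not give the objective or the local discrepancy weights, so the weights are omitted here. The scatter matrices are the standard discriminant-analysis ones, computed from source labels only. The function name `fit_projection` and the parameters `k`, `mu`, and `reg` are hypothetical.

```python
import numpy as np


def fit_projection(Xs, ys, Xt, k=2, mu=1.0, reg=1e-3):
    """Learn a (d, k) projection from labelled source and unlabelled target data.

    Assumed objective (not taken from the paper):
        maximize  tr(A^T Sb A)
        subject to  A^T (Sw + mu * M + reg * I) A = I
    where M encodes the squared distance between the domain means.
    """
    d = Xs.shape[1]

    # Domain mean-distance term: ||A^T (mean_s - mean_t)||^2 = tr(A^T M A).
    diff = (Xs.mean(axis=0) - Xt.mean(axis=0))[:, None]
    M = diff @ diff.T

    # Within-class (Sw) and between-class (Sb) scatter from source labels.
    overall_mean = Xs.mean(axis=0)
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in np.unique(ys):
        Xc = Xs[ys == c]
        class_mean = Xc.mean(axis=0)
        centered = Xc - class_mean
        Sw += centered.T @ centered
        gap = (class_mean - overall_mean)[:, None]
        Sb += len(Xc) * (gap @ gap.T)

    # Solve the generalized eigenproblem Sb a = lambda * B a by whitening
    # with the Cholesky factor of B, which keeps the problem symmetric.
    B = Sw + mu * M + reg * np.eye(d)
    L = np.linalg.cholesky(B)          # B = L @ L.T
    L_inv = np.linalg.inv(L)
    C = L_inv @ Sb @ L_inv.T
    eigvals, eigvecs = np.linalg.eigh((C + C.T) / 2)

    # Keep the k directions with the largest eigenvalues and map them back.
    top = eigvecs[:, np.argsort(eigvals)[::-1][:k]]
    return L_inv.T @ top
```

Source and target data would then be projected as `Xs @ A` and `Xt @ A` before training a classifier on the projected source data. Under the constraint used in this sketch, each projected component of the domain mean gap is bounded by `1 / sqrt(mu)`, so a larger `mu` trades class separation for tighter alignment of the domain means.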