Computer Engineering and Applications ›› 2025, Vol. 61 ›› Issue (7): 255-266. DOI: 10.3778/j.issn.1002-8331.2311-0328

• Graphics and Image Processing •

Fundus Image Classification Method Based on Knowledge-Integrated Multi-Source Multi-Task Learning

吴瑞琪,周毅   

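  1. School of Computer Science and Engineering, Southeast University, Nanjing 211189
  2. Key Laboratory of New Generation Artificial Intelligence and Cross Applications (Ministry of Education), Southeast University, Nanjing 211189
  • Publication date: 2025-04-01 Release date: 2025-04-01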

Multi-Source Multi-Task Learning with Knowledge Integration for Fundus Disease Classification

WU Ruiqi, ZHOU Yi   

  1. School of Computer Science and Engineering, Southeast University, Nanjing 211189, China
  2. Key Laboratory of New Generation Artificial Intelligence and Cross Applications (Ministry of Education), Southeast University, Nanjing 211189, China
  • Online: 2025-04-01 Published: 2025-04-01

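Abstract: Traditional single-task models for fundus images suffer from limited generalization, poor versatility, and poor interpretability. To address these problems, a fundus image classification model based on knowledge integration and multi-source multi-task learning is proposed. To exploit the correlations among multiple fundus diseases and biomarkers, a multi-region multi-expert classification model is proposed from the perspective of multi-task model architecture. The optic disc and macula are first identified as prior knowledge. Three expert models are then built on the whole fundus image, the optic disc, and the macula to learn multi-region features, and a feature reconciliation module is proposed to fuse these features. To alleviate multi-source label space shift and training gradient conflicts, a two-level hierarchical label space and a null-class cross-entropy function are proposed from the perspective of unifying the multi-source label space, based on prior knowledge of the hierarchical relationship between diseases and lesions. To alleviate the difficulty of multi-task optimization and gradient conflicts, a multi-source joint training algorithm is proposed. Extensive comparison, ablation, and transfer experiments verify that the proposed model achieves significant gains (4.19 to 13.28 percentage points) and has stronger versatility, generalization, and transferability.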

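Key words: multi-disease classification of fundus images, multi-task learning, multi-source learning, knowledge integration, machine learning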

Abstract: Traditional single-task fundus disease models generalize poorly and offer low interpretability. To address these problems, a fundus disease classification model based on multi-source multi-task learning with knowledge integration (MMKI) is proposed. From the perspective of multi-task architecture, a multi-region multi-expert classification model is constructed. After the optic disc and macula are recognized as prior knowledge, expert models are established from three perspectives to learn multi-region features, which are fused by a feature reconciliation module. In addition, to alleviate multi-source label space shift and gradient conflicts, a two-level hierarchical label space and a null-class cross-entropy function are introduced, based on prior knowledge of hierarchical disease relationships, from the perspective of multi-source label space unification. Moreover, a multi-source joint training strategy is proposed to alleviate training difficulties and gradient conflicts. Extensive comparison, ablation, and transferability experiments show that MMKI achieves significant gains (4.19 to 13.28 percentage points) with stronger transferability and generalization.

Key words: multi-disease diagnosis of fundus images, multi-task learning, multi-source learning, knowledge integration, machine learning
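
The abstract mentions unifying the label spaces of several source datasets so that they can be trained jointly. The abstract does not define the null-class cross-entropy function, so the sketch below does not reproduce it. Instead it illustrates a generic and widely used alternative, label masking, under stated assumptions: every source shares one unified label list, each source annotates only some of those labels, and unannotated labels are excluded from the loss. The label names and the function `masked_bce` are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical unified label space; the class names are illustrative only.
UNIFIED_LABELS = ["diabetic_retinopathy", "glaucoma", "amd"]


def sigmoid(x: float) -> float:
    """Logistic function mapping a raw score to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def masked_bce(logits, targets, annotated):
    """Mean binary cross-entropy over the labels that a source annotates.

    logits:    one raw score per unified label
    targets:   0/1 per unified label (ignored where annotated is False)
    annotated: True where the source dataset actually labels that class
    """
    total, count = 0.0, 0
    for z, y, m in zip(logits, targets, annotated):
        if not m:
            continue  # unannotated class: contributes nothing to the loss
        # Clamp the probability so the logarithms stay finite.
        p = min(max(sigmoid(z), 1e-12), 1.0 - 1e-12)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        count += 1
    return total / count if count else 0.0


# A source that only annotates the first label in UNIFIED_LABELS.
logits = [0.0, 5.0, -5.0]
targets = [1, 0, 0]
loss_single_source = masked_bce(logits, targets, [True, False, False])
# The same sample treated as if every label were annotated.
loss_all_labels = masked_bce(logits, targets, [True, True, True])
```

In this example the confident score for the second label is penalized only when that label is treated as annotated. With masking, a source that never labels a disease cannot push the model toward predicting its absence, which is one simple way to limit the conflicts that arise when label spaces differ across sources.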