Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (24): 212-218. DOI: 10.3778/j.issn.1002-8331.2007-0385

• Graphics and Image Processing •


DR Classification Model Fusing Attention Mechanism and Multi-tasking Learning

XU Changzhuan, WU Yun, LAN Lin, HUANG Zimeng   

  1. School of Computer Science and Technology, Guizhou University, Guiyang 550025, China
  • Online:2021-12-15 Published:2021-12-13


Abstract:

In diabetic patients, Diabetic Retinopathy (DR) is the leading cause of blindness. To address the problem that fundus images contain tiny pathological features, such as microaneurysms, that are extremely difficult to detect, an attention mechanism module is proposed. The module fuses the original feature information of the feature map with the channel information obtained by an attention unit, so that the network assigns greater weight to tiny features, and then applies a division operation to remove redundant information from the feature map; the resulting attention features are used as the input to the two tasks. To address the facts that the Mean Square Error (MSE) loss is difficult to optimize and that the Cross Entropy (CE) loss does not consider the cost of misclassifying DR grades, a multi-task learning module is designed that performs a weighted fusion of the regression task's MSE loss and the classification task's CE loss. Based on these two modules, the Fusion of Attention mechanism and Multi-Tasking learning network (FAMT) is proposed. Experiments on the Kaggle dataset show that the Kappa of the FAMT network on the validation set is 2% higher than that of the network using only the regression task and 4% higher than that of the network using only the classification task; on the test set, the Kappa of FAMT is 1% higher than that of the EfficientNet network and 5% higher than that of the M2CNN network.
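
The abstract describes two concrete components: a channel-attention block whose output is combined with the original feature map and then passed through a division step to suppress redundancy, and a training objective that mixes a regression MSE term with a classification CE term. The sketch below is only a minimal PyTorch reading of that description, assuming a squeeze-and-excitation-style attention unit, a placeholder division step, and a hypothetical fixed weight alpha; none of these details are taken from the paper's actual implementation.

```python
# Hedged sketch of the two modules outlined in the abstract.
# ChannelAttention, multi_task_loss, and alpha are illustrative names only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention unit (assumed form)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                        # x: (N, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))          # per-channel weights, (N, C)
        w = w.view(x.size(0), -1, 1, 1)
        attended = x * w                         # fuse original features with channel info
        # The abstract's "division operation" for removing redundancy is not
        # specified; this normalization is only a placeholder for it.
        return attended / (x.abs().mean(dim=1, keepdim=True) + 1e-6)

def multi_task_loss(reg_out, cls_out, target, alpha=0.5):
    """Weighted fusion of the regression MSE loss and the classification CE loss."""
    mse = F.mse_loss(reg_out.squeeze(1), target.float())
    ce = F.cross_entropy(cls_out, target)
    return alpha * mse + (1 - alpha) * ce

# Example usage with hypothetical shapes (batch of 4, 5 DR grades).
feats = torch.randn(4, 64, 32, 32)
out = ChannelAttention(64)(feats)                # (4, 64, 32, 32), fed to both task heads
reg_out, cls_out = torch.randn(4, 1), torch.randn(4, 5)
grades = torch.randint(0, 5, (4,))
loss = multi_task_loss(reg_out, cls_out, grades, alpha=0.5)
```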

Key words: diabetic retinopathy grading, deep learning, attention mechanism, multi-task learning, convolutional neural network