Computer Engineering and Applications (计算机工程与应用) ›› 2025, Vol. 61 ›› Issue (10): 331-340. DOI: 10.3778/j.issn.1002-8331.2402-0041

• Engineering and Applications •

CNN-Transformer Interaction Model Prediction Pathological Stratification of IgA Nephropathy

NIU Haotian (牛昊天), LIN Yuxuan (林宇轩), CAI Nian (蔡念), XIE Yiying (谢依颖), ZHANG Lei (张镭)

  1. School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China
  2. Department of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou 510515, China
  • Online: 2025-05-15 Published: 2025-05-15

Abstract: Renal ultrasound is an important non-invasive clinical tool for diagnosing IgA nephropathy (IgAN); it can spare patients unnecessary renal biopsies and is especially valuable for long-term disease management. However, a substantial knowledge gap remains between ultrasound image interpretation and biopsy-based pathological analysis, which makes it difficult in clinical practice to stratify IgAN pathology accurately from ultrasound images alone. Diagnosis therefore still depends heavily on pathological analysis. This paper fuses a convolutional neural network (CNN) with a Transformer to design a CNN-Transformer interaction model that predicts IgAN pathological stratification by automatically analyzing renal ultrasound images, assisting doctors in diagnosis. To mimic how a doctor inspects the local texture and morphology of the corticomedullary region of the kidney, a CNN stream incorporating deformable convolution extracts local renal features. To mimic how a doctor inspects the overall tissue morphology of the kidney, a Transformer stream incorporating spatial bias attention extracts global correlations among renal tissues. To mimic how a doctor alternates repeatedly between these observations to reach a comprehensive assessment, a mid-stage feature interaction strategy and a terminal adaptive fusion strategy are proposed so that the medical information contained in the ultrasound images is exploited comprehensively. Experimental results show that the proposed CNN-Transformer interaction model stratifies IgAN pathology effectively from renal ultrasound images, achieving an accuracy of 0.842, a sensitivity of 0.833, a specificity of 0.876, and an AUC of 0.931, and outperforming several existing deep learning methods for ultrasound images.

Key words: IgA nephropathy, ultrasound, deep learning, hybrid model, bidirectional interaction, adaptive fusion
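
For readers less familiar with the four figures reported in the abstract, the sketch below shows how accuracy, sensitivity, specificity, and AUC are conventionally computed for a two-class problem. It is an illustration only, not the authors' evaluation code: the binary framing, the function name `binary_metrics`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made here.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int],
                   y_score: Sequence[float],
                   threshold: float = 0.5) -> Dict[str, float]:
    """Standard binary metrics; label 1 is the positive class.

    Hypothetical helper for illustration. The threshold of 0.5 is an
    assumption, not a value taken from the paper.
    """
    # Turn scores into hard predictions at the chosen threshold.
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    n_pos, n_neg = tp + fn, tn + fp
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    # AUC as the fraction of (positive, negative) pairs ranked correctly,
    # counting ties as half (the Mann-Whitney U formulation).
    pos_scores = [s for t, s in zip(y_true, y_score) if t == 1]
    neg_scores = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos_scores for n in neg_scores)

    return {
        "accuracy": (tp + tn) / (n_pos + n_neg),
        "sensitivity": tp / n_pos,   # true positive rate
        "specificity": tn / n_neg,   # true negative rate
        "auc": wins / (n_pos * n_neg),
    }


# Toy data, invented for this example only.
metrics = binary_metrics([1, 1, 1, 0, 0, 0], [0.9, 0.8, 0.4, 0.6, 0.3, 0.1])
print(metrics)
```

Unlike the other three metrics, AUC does not depend on the decision threshold, which is why it is usually reported alongside them.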