Computer Engineering and Applications, 2025, Vol. 61, Issue (10): 320-330. DOI: 10.3778/j.issn.1002-8331.2402-0218

• Engineering and Applications •

Radiology Report Generation Integrating Multi-View Features

OU Jiale, ZAN Hongying, ZHANG Kunli, SHI Xianglong, MA Yutuan

  1. School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China
  • Online: 2025-05-15  Published: 2025-05-15

Abstract: Radiology report generation involves extracting features from multi-source images and converting them into textual descriptions. Current research faces challenges from multiple views and reports of varying length, which lead to insufficient accuracy and semantic incoherence in the generated clinical reports. To address these issues, a method that integrates multi-view features is proposed; it reduces information loss through repeated local feature extraction from the original images followed by fine-grained fusion. A global context representation is obtained with an annotation tool and embedded, allowing the model to train on more general text and thereby produce more fluent descriptions. Experiments on the IU X-Ray and MIMIC-CXR datasets show that applying the method to the R2Gen model improves generated-report quality scores by an average of 2.96 percentage points. Furthermore, experiments on a self-constructed Chinese lung CT report dataset, generating diagnostic conclusions from imaging reports, demonstrate the generality of the method.

Key words: radiology report generation, multiple views, fine-grained fusion, global context
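The abstract describes fusing local features extracted from multiple views; the paper's actual fusion module is not specified on this page, so the following is only a minimal illustrative sketch of per-dimension ("fine-grained") weighted fusion of per-view feature vectors. All names (`fuse_multiview`, the example vectors) are invented for illustration and do not come from the paper.

```python
def fuse_multiview(features, weights=None):
    """Fuse per-view feature vectors into a single vector.

    features: list of equal-length feature vectors, one per view
              (e.g. frontal and lateral chest X-ray views).
    weights:  optional per-view weights; defaults to a uniform average.
    """
    n_views = len(features)
    dim = len(features[0])
    if weights is None:
        weights = [1.0 / n_views] * n_views
    total = sum(weights)
    # Fine-grained fusion: combine the views dimension by dimension,
    # rather than concatenating or pooling whole vectors.
    fused = [0.0] * dim
    for w, feat in zip(weights, features):
        for i, x in enumerate(feat):
            fused[i] += (w / total) * x
    return fused


# Toy example: two views of the same study.
frontal = [0.2, 0.8, 0.1]
lateral = [0.4, 0.6, 0.3]
print([round(v, 6) for v in fuse_multiview([frontal, lateral])])  # → [0.3, 0.7, 0.2]
```

A learned fusion (e.g. attention over views) would replace the fixed weights with scores computed from the features themselves; this sketch only shows the fine-grained, per-dimension aspect.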