[1] YANG Y, YU J, ZHANG J, et al. Joint embedding of deep visual and semantic features for medical image report generation[J]. IEEE Transactions on Multimedia, 2023, 25: 167-178.
[2] TANG Q, YU Y B, FENG X, et al. Semantic and visual enrichment hierarchical network for medical image report generation[C]//Proceedings of the 2022 Asia Conference on Algorithms, Computing and Machine Learning. Piscataway: IEEE, 2022: 738-743.
[3] JING B Y, XIE P T, XING E. On the automatic generation of medical imaging reports[J]. arXiv:1711.08195, 2017.
[4] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates, 2017: 6000-6010.
[5] OU J L, ZAN H Y, ZHANG K L, et al. Radiology report generation integrating multi-view features[J]. Computer Engineering and Applications, 2025, 61(10): 320-330. (in Chinese)
[6] TAN L W, ZHANG S J, HAN Q, et al. Gate normalized encoder-decoder network for medical image report generation[J]. CAAI Transactions on Intelligent Systems, 2024, 19(2): 411-419. (in Chinese)
[7] ZHOU Y, HUANG L, ZHOU T, et al. Visual-textual attentive semantic consistency for medical report generation[C]//Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2022: 3965-3974.
[8] LIU F L, YIN C C, WU X, et al. Contrastive attention for automatic chest X-ray report generation[C]//Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. Stroudsburg: ACL, 2021: 269-280.
[9] ZHOU S H, NIE D, ADELI E, et al. High-resolution encoder-decoder networks for low-contrast medical image segmentation[J]. IEEE Transactions on Image Processing, 2020, 29: 461-475.
[10] WANG L T, ZHANG L, SHU X, et al. Intra-class consistency and inter-class discrimination feature learning for automatic skin lesion classification[J]. Medical Image Analysis, 2023, 85: 102746.
[11] WANG F Y, ZHOU Y Y, WANG S J, et al. Multi-granularity cross-modal alignment for generalized medical visual representation learning[C]//Proceedings of the 36th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates, 2022: 33536-33549.
[12] SONG X, ZHANG X D, JI J Z, et al. Cross-modal contrastive attention model for medical report generation[C]//Proceedings of the 29th International Conference on Computational Linguistics, 2022: 2388-2397.
[13] LIN Z H, ZHANG D H, SHI D L, et al. Contrastive pre-training and linear interaction attention-based transformer for universal medical reports generation[J]. Journal of Biomedical Informatics, 2023, 138: 104281.
[14] LI M J, LIN B Q, CHEN Z C, et al. Dynamic graph enhanced contrastive learning for chest X-ray report generation[C]//Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 3334-3343.
[15] CAO Y M, CUI L Z, YU F Q, et al. KdTNet: medical image report generation via knowledge-driven transformer[C]//Proceedings of the International Conference on Database Systems for Advanced Applications. Cham: Springer, 2022: 117-132.
[16] SHI J Y, ZHANG C, WANG Y Q, et al. Generation of structured medical reports based on knowledge assistance[J]. Computer Science, 2024, 51(6): 317-324. (in Chinese)
[17] NOORALAHZADEH F, PEREZ GONZALEZ N, FRAUENFELDER T, et al. Progressive transformer-based generation of radiology reports[C]//Findings of the Association for Computational Linguistics: EMNLP 2021. Stroudsburg: ACL, 2021: 2824-2832.
[18] CHEN Z H, SONG Y, CHANG T H, et al. Generating radiology reports via memory-driven transformer[C]//Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2020: 1439-1449.
[19] CHEN Z H, SHEN Y L, SONG Y, et al. Cross-modal memory networks for radiology report generation[C]//Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing. Stroudsburg: ACL, 2021: 5904-5914.
[20] XU L M, TANG Q, ZHENG B C, et al. CGFTrans: cross-modal global feature fusion transformer for medical report generation[J]. IEEE Journal of Biomedical and Health Informatics, 2024, 28(9): 5600-5612.
[21] ZHANG J S, CHENG M, CHENG Q Q, et al. Hierarchical medical image report adversarial generation with hybrid discriminator[J]. Artificial Intelligence in Medicine, 2024, 151: 102846.
[22] TANG Y H, HAN K, GUO J Y, et al. An image patch is a wave: phase-aware vision MLP[C]//Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 10925-10934.
[23] IRVIN J, RAJPURKAR P, KO M, et al. CheXpert: a large chest radiograph dataset with uncertainty labels and expert comparison[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2019: 590-597.
[24] JOHNSON J, DOUZE M, JÉGOU H. Billion-scale similarity search with GPUs[J]. IEEE Transactions on Big Data, 2021, 7(3): 535-547.
[25] DEMNER-FUSHMAN D, KOHLI M D, ROSENMAN M B, et al. Preparing a collection of radiology examinations for distribution and retrieval[J]. Journal of the American Medical Informatics Association, 2016, 23(2): 304-310.
[26] JOHNSON A E W, POLLARD T J, GREENBAUM N R, et al. MIMIC-CXR-JPG, a large publicly available database of labeled chest radiographs[J]. arXiv:1901.07042, 2019.
[27] PAPINENI K, ROUKOS S, WARD T, et al. BLEU: a method for automatic evaluation of machine translation[C]//Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2002: 311-318.
[28] BANERJEE S, LAVIE A. METEOR: an automatic metric for MT evaluation with improved correlation with human judgments[C]//Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, 2005: 65-72.
[29] LIN C Y. ROUGE: a package for automatic evaluation of summaries[C]//Proceedings of the Workshop on Text Summarization Branches Out. Stroudsburg: ACL, 2004: 74-81.
[30] WANG R Z, WANG X T, XU Z H, et al. MvCo-DoT: multi-view contrastive domain transfer network for medical report generation[C]//Proceedings of the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2023: 1-5.
[31] ZENG X H, LIAO T X, XU L M, et al. AERMNet: attention-enhanced relational memory network for medical image report generation[J]. Computer Methods and Programs in Biomedicine, 2024, 244: 107979.
[32] TAO Y T, MA L Y, YU J, et al. Memory-based cross-modal semantic alignment network for radiology report generation[J]. IEEE Journal of Biomedical and Health Informatics, 2024, 28(7): 4145-4156.
[33] CHEN W T, SHEN L L, LIN J Y, et al. Fine-grained image-text alignment in medical imaging enables explainable cyclic image-report generation[C]//Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2024: 9494-9509.