Computer Engineering and Applications, 2023, Vol. 59, Issue (8): 41-55. DOI: 10.3778/j.issn.1002-8331.2206-0022
SHI Lei, JI Qingyu, CHEN Qingwei, ZHAO Hengyi, ZHANG Junxing
Online: 2023-04-15
Published: 2023-04-15
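Abstract: Transformers, deep networks built on self-attention, are naturally well suited to modeling the global features and long-range dependencies of their input, a strength that complements the inductive biases of convolutional neural networks (CNNs). Inspired by their great success in natural language processing, Transformers have been widely introduced into computer vision tasks, particularly medical image analysis, where they have achieved strong performance. This paper first introduces representative work that applies Transformers to natural images. It then surveys vision Transformer research in medical image segmentation, medical image classification, and medical image registration, organizing the related work by lesion type and anatomical site and analyzing in detail the implementation ideas behind several representative studies. Finally, it discusses existing research and outlines future directions, with the aim of providing a reference for further in-depth research in this field.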
SHI Lei, JI Qingyu, CHEN Qingwei, ZHAO Hengyi, ZHANG Junxing. Review of Research on Application of Vision Transformer in Medical Image Analysis[J]. Computer Engineering and Applications, 2023, 59(8): 41-55.