Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (17): 1-16. DOI: 10.3778/j.issn.1002-8331.2401-0136
SUN Xing, CAI Xiaohong, LI Ming, ZHANG Shuai, MA Jingang
Online: 2024-09-01
Published: 2024-08-30
Abstract: With the continued development of large-model techniques, visual foundation models, represented by the Segment Anything Model (SAM), have achieved important breakthroughs in image segmentation. SAM performs a range of downstream segmentation tasks through prompt-driven interaction, aiming to solve all image segmentation problems in a unified way. Applying SAM to medical image segmentation is therefore of considerable significance: its generalization ability allows it to adapt to many kinds of medical images and to provide physicians with more comprehensive information about anatomical structures and lesions. This paper introduces the datasets commonly used for image segmentation; describes SAM's network architecture and generalization ability in detail; reviews and analyzes applications of SAM to five major categories of medical images (whole-slide imaging, magnetic resonance imaging, computed tomography, ultrasound, and multimodal images), summarizing their strengths, weaknesses, and corresponding improvements; and, in light of open problems in medical image segmentation, discusses and anticipates future directions for SAM.
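The "prompt-driven" interaction the abstract describes means the user supplies a hint (a point, box, or mask) and the model returns a segmentation conditioned on it. The toy sketch below is not SAM's actual architecture; it is a minimal, self-contained illustration of the point-prompt interaction pattern, with intensity-based region growing standing in for the model, and the function name and tolerance parameter are illustrative assumptions.

```python
import numpy as np
from collections import deque

def segment_from_point(image, seed, tol=10):
    """Flood-fill segmentation from a single point prompt.

    A toy stand-in for SAM's point-prompt interface: the user "clicks"
    a pixel, and a mask of the surrounding region is returned. Here the
    "model" is simply intensity-based region growing around the seed.
    """
    h, w = image.shape
    mask = np.zeros((h, w), dtype=bool)
    seed_val = int(image[seed])
    queue = deque([seed])
    mask[seed] = True
    while queue:
        y, x = queue.popleft()
        # Visit 4-connected neighbors whose intensity is close to the seed.
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and not mask[ny, nx]:
                if abs(int(image[ny, nx]) - seed_val) <= tol:
                    mask[ny, nx] = True
                    queue.append((ny, nx))
    return mask

# Synthetic "lesion": a bright 12x12 square on a dark background.
img = np.zeros((32, 32), dtype=np.uint8)
img[8:20, 8:20] = 200
m = segment_from_point(img, (10, 10))
print(m.sum())  # → 144 (the 12x12 bright region)
```

In the real SAM pipeline the point prompt is consumed by a learned prompt encoder and combined with image-encoder features in a mask decoder; this sketch only mimics the input/output contract, which is what makes the model adaptable to diverse medical images without task-specific retraining.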
SUN Xing, CAI Xiaohong, LI Ming, ZHANG Shuai, MA Jingang. Review of Application of Visual Foundation Model SAM in Medical Image Segmentation[J]. Computer Engineering and Applications, 2024, 60(17): 1-16.