Computer Engineering and Applications, 2023, Vol. 59, Issue (3): 218-225. DOI: 10.3778/j.issn.1002-8331.2205-0046

• Graphics and Image Processing •

CoT-TransUNet: Lightweight Context Transformer Medical Image Segmentation Network

YANG He, BAI Zhengyao

  1. School of Information, Yunnan University, Kunming 650500, China
  • Online: 2023-02-01 Published: 2023-02-01

Abstract: To address the small receptive field of convolution in earlier medical image segmentation networks and the feature loss of Transformers, an end-to-end lightweight context Transformer medical image segmentation network, CoT-TransUNet, is proposed. The network consists of three parts: an encoder, a decoder, and skip connections. For an input image, the encoder uses a hybrid CoTNet-Transformer module: CoTNet serves as the feature extractor that generates feature maps, and the Transformer blocks encode those feature maps as input sequences. The decoder then upsamples the encoded features through a cascaded upsampler, which chains multiple upsampling blocks, each using the CARAFE upsampling operator. Skip connections aggregate encoder and decoder features at different resolutions. By adopting CoTNet, which combines global and local context information, in the feature extraction stage, and the CARAFE operator, which has a larger receptive field, in the upsampling stage, CoT-TransUNet produces better input feature maps and performs content-based upsampling while remaining lightweight. In experiments on multi-organ segmentation tasks, CoT-TransUNet outperforms the other networks compared.

Key words: medical image segmentation, context Transformer network, cascaded upsampler, lightweight
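
The content-based upsampling mentioned in the abstract can be illustrated with a minimal NumPy sketch of the reassembly step of a CARAFE-style operator. This is not the authors' implementation. It assumes an upsampling factor `scale=2` and a reassembly kernel size `k_up=5` (neither is stated in the abstract), and it replaces the learned kernel-prediction branch with externally supplied logits (`kernel_logits`, a hypothetical input). Each output pixel is a softmax-weighted sum over a `k_up × k_up` neighborhood of its source pixel, with one kernel per output location shared across channels.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def carafe_reassemble(feat, kernel_logits, scale=2, k_up=5):
    """Content-aware reassembly (illustrative sketch only).

    feat:          (C, H, W) input feature map.
    kernel_logits: (scale*H, scale*W, k_up*k_up) unnormalized kernel scores.
                   In CARAFE these come from a learned predictor; here they
                   are simply passed in (an assumption of this sketch).
    Returns a (C, scale*H, scale*W) upsampled feature map.
    """
    c, h, w = feat.shape
    assert kernel_logits.shape == (scale * h, scale * w, k_up * k_up)

    # Normalize every per-location kernel so that its weights sum to 1.
    kernels = softmax(kernel_logits, axis=-1)

    # Zero-pad spatially so border pixels also have a full neighborhood.
    r = k_up // 2
    padded = np.pad(feat, ((0, 0), (r, r), (r, r)))

    out = np.empty((c, scale * h, scale * w), dtype=feat.dtype)
    for i in range(scale * h):
        for j in range(scale * w):
            si, sj = i // scale, j // scale  # source pixel of this output pixel
            # k_up x k_up window centered on (si, sj) in the original map.
            patch = padded[:, si:si + k_up, sj:sj + k_up]
            kern = kernels[i, j].reshape(k_up, k_up)
            # The same kernel is applied to every channel.
            out[:, i, j] = (patch * kern).sum(axis=(1, 2))
    return out


rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 8, 8))
logits = rng.standard_normal((16, 16, 25))
up = carafe_reassemble(feat, logits)
print(up.shape)  # (4, 16, 16)
```

When every kernel puts nearly all of its weight on the center position, this step reduces to nearest-neighbor upsampling. Kernels that depend on the input features are what make the operator content-based, and the `k_up × k_up` window is the source of the larger receptive field.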