Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (16): 219-225. DOI: 10.3778/j.issn.1002-8331.2101-0032

• Graphics and Image Processing •

Improved Mask R-CNN Method for Thyroid Nodules Segmentation in Ultrasound Images

LIU Mingkun (刘明坤), ZHANG Junhua (张俊华), LI Zonggui (李宗桂)

  1. School of Information, Yunnan University, Kunming 650500, China
  • Online: 2022-08-15 Published: 2022-08-15

Abstract: Ultrasound images of thyroid nodules have low contrast and severe speckle noise, and nodule morphology varies greatly between patients, which makes it extremely difficult for doctors to segment nodules accurately. To segment thyroid nodules accurately from ultrasound images, this paper improves the backbone network of Mask R-CNN (mask region-convolutional neural network). An attention mechanism module is added to the residual network layers of the original backbone to improve model convergence, and a bottom-up branch is added to the feature pyramid network. The feature maps output by this branch are fused and then fed to the region proposal network and the region of interest align (RoI Align) layer, which integrates multi-scale features while balancing the differences in information between feature maps. In tests on 600 thyroid nodule ultrasound images, the improved Mask R-CNN achieves an average Dice coefficient of 0.9148, an average precision of 0.9322, an average recall of 0.9034, and an average F1 score of 0.9176. Its Dice coefficient is 0.0806 higher than that of the original Mask R-CNN. The improved algorithm can be applied to the automatic segmentation of thyroid nodule ultrasound images in clinical practice.
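The four evaluation measures named in the abstract can be illustrated with a minimal sketch that computes them pixel-wise for one pair of binary masks. The function name `segmentation_metrics`, the `eps` smoothing term, and the toy masks are assumptions made for illustration; the abstract does not state how the authors computed or averaged these values over the 600 test images.

```python
import numpy as np

def segmentation_metrics(pred, truth, eps=1e-8):
    """Pixel-wise Dice, precision, recall and F1 for two binary masks.

    Hypothetical helper, not the authors' evaluation code.
    eps avoids division by zero when a mask is empty.
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)

    tp = np.logical_and(pred, truth).sum()    # nodule pixels correctly predicted
    fp = np.logical_and(pred, ~truth).sum()   # background predicted as nodule
    fn = np.logical_and(~pred, truth).sum()   # nodule pixels that were missed

    dice = 2 * tp / (2 * tp + fp + fn + eps)
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    return {"dice": dice, "precision": precision, "recall": recall, "f1": f1}

# Toy example: a 2x2 ground-truth nodule and a slightly too wide prediction.
truth = np.zeros((4, 4), dtype=bool)
truth[1:3, 1:3] = True          # 4 nodule pixels
pred = np.zeros((4, 4), dtype=bool)
pred[1:3, 1:4] = True           # 6 predicted pixels, 4 of them correct

metrics = segmentation_metrics(pred, truth)
print(metrics)
```

For a single pair of binary masks, pixel-wise F1 and Dice are algebraically the same quantity, so the two values printed above coincide. As a consistency check on the reported averages, 2 × 0.9322 × 0.9034 / (0.9322 + 0.9034) ≈ 0.9176, which matches the average F1 score given in the abstract.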

Key words: thyroid nodule ultrasound image, Mask R-CNN, backbone network, image segmentation
