Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (24): 205-211.DOI: 10.3778/j.issn.1002-8331.2008-0420

• Pattern Recognition and Artificial Intelligence •

Application of Pathological Image Texture Analysis in MSI Prediction of Gastric Cancer

AN Weichao, YAN Ting, ZHANG Nan, ZHANG Shan, XIANG Jie, CAO Rui, WANG Bin   

  1. College of Information and Computer, Taiyuan University of Technology, Jinzhong, Shanxi 030600, China
  2. Translational Medicine Research Center, Shanxi Medical University, Jinzhong, Shanxi 030600, China
  • Online: 2021-12-15 Published: 2021-12-13





Microsatellites are short tandem repeat sequences scattered throughout the human genome. Microsatellite instability (MSI) is the change in microsatellite length caused by the insertion or deletion of repeat units in tumor tissue. MSI-type gastric cancer often has distinctive molecular phenotypes and clinicopathological characteristics, and MSI status determines whether a gastric cancer patient responds well to immunotherapy. Preoperative detection of MSI status is therefore of great significance for treatment planning. Traditional MSI detection requires immunohistochemistry and genetic analysis, which add cost and are difficult to extend to every patient in clinical practice. In this paper, image feature extraction and machine learning algorithms are applied to the quantitative analysis of high-resolution histopathological images to predict the MSI status of gastric cancer patients. Original data for 279 cases are obtained from the TCGA database. After pre-processing and up-sampling, 442 samples are obtained, and 445 quantitative image features are extracted from the histopathological images of each sample, including first-order statistics, texture features and wavelet features. Lasso regression is used to select features and construct a predictive signature (Risk-score) of gastric cancer MSI status, and the classification performance of this signature is verified with a logistic classification model. Multivariate analysis is then carried out in combination with the clinical characteristics of each patient, and a personalized nomogram is constructed for MSI status prediction. The experimental results show that the Risk-score based on histopathological image texture features achieves an AUC of 0.74, compared with 0.73 for the existing MSI prediction model based on histopathological image texture features. On all samples, the MSI prediction model combining clinical features and the Risk-score achieves an AUC of 0.802, whereas the existing MSI prediction model combining clinical features and image features achieves only 0.752. Compared with existing methods, the proposed MSI prediction model has better prediction performance and can provide more valuable reference information for clinical decision-making for gastric cancer patients.
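As a rough illustration of the Lasso step in the abstract, the sketch below selects features, builds a linear risk score from the surviving coefficients, and measures its AUC. It is not the authors' implementation. It assumes a squared-loss Lasso fitted by coordinate descent on 0/1 labels, a hand-picked penalty `lam`, and purely synthetic data; the function names (`lasso_fit`, `risk_score`, `auc`) and the six toy features are hypothetical, and the abstract does not state the loss, penalty strength or software actually used.

```python
import random

def soft_threshold(z, t):
    """Shrink z toward zero by t; this is what sets weak coefficients exactly to 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_fit(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)*||y - b - Xc*w||^2 + lam*||w||_1 (Xc = centred X)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    b = sum(y) / n                         # with centred features the intercept is mean(y)
    w = [0.0] * p
    resid = [yi - b for yi in y]           # residual for the all-zero start
    for _ in range(n_iter):
        for j in range(p):
            col = [Xc[i][j] for i in range(n)]
            norm = sum(v * v for v in col) / n
            if norm == 0.0:
                continue                   # constant feature carries no information
            # covariance of feature j with the residual that excludes feature j
            rho = sum(col[i] * (resid[i] + w[j] * col[i]) for i in range(n)) / n
            new_w = soft_threshold(rho, lam) / norm
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * col[i]
                w[j] = new_w
    return w, b, means

def risk_score(row, w, b, means):
    """Linear combination of features; zero-weight features drop out of the score."""
    return b + sum(wj * (xj - mj) for wj, xj, mj in zip(w, row, means))

def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count one half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if a > c else 0.5 if a == c else 0.0 for a in pos for c in neg)
    return wins / (len(pos) * len(neg))

# Synthetic stand-in for image features (assumption: only features 0 and 1 carry signal).
random.seed(0)
n, p = 300, 6
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [1 if 2.0 * r[0] - 2.0 * r[1] + random.gauss(0, 0.5) > 0 else 0 for r in X]

w, b, means = lasso_fit(X, y, lam=0.1)
scores = [risk_score(r, w, b, means) for r in X]
selected = [j for j, wj in enumerate(w) if wj != 0.0]
print("selected features:", selected)
print("in-sample AUC:", round(auc(scores, y), 3))
```

The AUC printed here is computed on the training data, so it is optimistic. An evaluation comparable to the reported 0.74 would need held-out samples or cross-validation, and in practice the penalty would be tuned rather than fixed.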

Key words: machine learning, histopathological images, MSI prediction, texture features, gastric cancer

