Computer Engineering and Applications ›› 2020, Vol. 56 ›› Issue (17): 136-141. DOI: 10.3778/j.issn.1002-8331.1908-0337


XLC-Stacking Method for Disease Diagnosis Based on XGBoost Feature Selection

YUE Peng, HOU Lingyan, YANG Dali, TONG Qiang   

  1. Open Computer System Laboratory, Beijing Information Science and Technology University, Beijing 100101, China
  • Online: 2020-09-01 Published: 2020-08-31





To address feature redundancy in medical disease data, the XGBoost feature selection method is used to measure feature importance, remove redundant features, and select the best features for classification. To address low recognition accuracy, the Stacking method is used to integrate XGBoost, LightGBM, and other heterogeneous classifiers, and the better-performing CatBoost classifier is added to the set of heterogeneous classifiers to improve the classification accuracy of the ensemble. To avoid overfitting, the class probabilities output by the base classifiers are used as the input to the high-level classifier. Experimental results show that the XLC-Stacking method based on XGBoost feature selection substantially outperforms current mainstream classification algorithms, the single XGBoost algorithm, and the Stacking method. Its recognition accuracy and F1-Score reach 97.73% and 98.21%, respectively, making it better suited to disease diagnosis.
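The pipeline outlined in the abstract (importance-based feature selection, then stacking of heterogeneous classifiers whose class probabilities feed a high-level classifier) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes scikit-learn is available and uses scikit-learn tree ensembles as stand-ins for XGBoost, LightGBM, and CatBoost; the median importance threshold, the logistic-regression meta-classifier, the 5-fold setting, and the public breast-cancer dataset are all illustrative assumptions rather than details taken from the paper.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import (
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Public demo dataset (assumption): not the medical data used in the paper.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Step 1: score features with a boosted-tree model and keep those whose
# importance is at or above the median (threshold chosen for illustration).
selector = SelectFromModel(
    GradientBoostingClassifier(random_state=0), threshold="median"
)

# Step 2: heterogeneous base classifiers. These scikit-learn ensembles are
# stand-ins (assumption) for the XGBoost, LightGBM, and CatBoost classifiers.
base_learners = [
    ("gb", GradientBoostingClassifier(random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("et", ExtraTreesClassifier(n_estimators=100, random_state=0)),
]

# Step 3: the high-level classifier is trained on out-of-fold class
# probabilities from the base classifiers rather than on hard labels.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    stack_method="predict_proba",
    cv=5,
)

model = make_pipeline(selector, stack)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
n_selected = int(model.named_steps["selectfrommodel"].get_support().sum())
print(f"kept {n_selected} of {X.shape[1]} features")
print(f"accuracy={accuracy:.4f}  F1={f1:.4f}")
```

To move closer to the configuration named in the abstract, the stand-ins could be replaced by the scikit-learn-compatible wrappers that the respective libraries provide (`XGBClassifier`, `LGBMClassifier`, `CatBoostClassifier`). The scores printed by this sketch depend on the demo dataset and are not comparable to the figures reported in the paper.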

Key words: disease diagnosis, feature selection, XGBoost, CatBoost, Stacking


