Computer Engineering and Applications ›› 2018, Vol. 54 ›› Issue (13): 241-245. DOI: 10.3778/j.issn.1002-8331.1702-0014


Prediction for fasting blood glucose level of health records based on KPCA-LSSVM

JIANG Yan1, SHUAI Renjun1, ZHANG Shu2, ZHA Daifeng3   

  1. College of Computer Science and Technology, Nanjing Technology University, Nanjing 211816, China
  2. College of Electrical Engineering and Control Science, Nanjing Technology University, Nanjing 211816, China
  3. College of Science, Jiujiang University, Jiujiang, Jiangxi 332000, China
  • Online: 2018-07-01 Published: 2018-07-17


江  燕1,帅仁俊1,张  姝2,查代奉3   

  1. 南京工业大学 计算机科学与技术学院,南京 211816
  2. 南京工业大学 电气工程与控制科学学院,南京 211816
  3. 九江学院 理学院,江西 九江 332000

Abstract: Diabetes is a preventable and controllable chronic disease, but it causes many complications that are harmful to health. Early diagnosis and lifestyle intervention are therefore essential for preventing chronic diabetic complications. This paper uses data from electronic health records to predict the fasting blood glucose level, which is an important basis for early diagnosis and intervention. Health record data, however, are high-dimensional, noisy, strongly coupled and nonlinear. To address this, the paper proposes a modeling method that combines KPCA with LSSVM, and compares three models: LSSVM, PCA-LSSVM and KPCA-LSSVM. The results show that KPCA-LSSVM is substantially more accurate than LSSVM and PCA-LSSVM, and that the area under its ROC curve is close to 1, which indicates that KPCA-LSSVM is suitable for predicting the fasting blood glucose level. The method also provides a new reference approach for medical data mining.
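The KPCA-LSSVM combination named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a binary classification target (consistent with the ROC evaluation mentioned above), RBF kernels for both stages, and the standard LSSVM classifier formulation solved as one linear system. The synthetic ring data, the function names, and all parameter values (`gamma`, `C`, `n_components`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kpca_fit(X, n_components, gamma):
    """Kernel PCA: eigendecompose the centered training kernel matrix."""
    K = rbf_kernel(X, X, gamma)
    col_mean, total_mean = K.mean(axis=0), K.mean()
    Kc = K - col_mean[None, :] - K.mean(axis=1)[:, None] + total_mean
    vals, vecs = np.linalg.eigh(Kc)                    # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:n_components]        # indices of the largest ones
    # Scale eigenvectors so the feature-space components have unit norm.
    alphas = vecs[:, top] / np.sqrt(np.maximum(vals[top], 1e-12))
    return {"X": X, "gamma": gamma, "col_mean": col_mean,
            "total_mean": total_mean, "alphas": alphas}

def kpca_transform(model, Z):
    """Project new samples onto the retained kernel principal components."""
    K = rbf_kernel(Z, model["X"], model["gamma"])
    Kc = K - model["col_mean"][None, :] - K.mean(axis=1)[:, None] + model["total_mean"]
    return Kc @ model["alphas"]

def lssvm_fit(X, y, C, gamma):
    """LSSVM classifier: solve [[0, y^T], [y, Omega + I/C]] [b; alpha] = [0; 1]."""
    n = len(y)
    omega = np.outer(y, y) * rbf_kernel(X, X, gamma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = omega + np.eye(n) / C
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return {"X": X, "y": y, "gamma": gamma, "b": sol[0], "alpha": sol[1:]}

def lssvm_decision(model, Z):
    """Decision value sum_i alpha_i * y_i * k(z, x_i) + b for each row z of Z."""
    K = rbf_kernel(Z, model["X"], model["gamma"])
    return K @ (model["alpha"] * model["y"]) + model["b"]

def make_rings(n, rng):
    """Hypothetical stand-in data: two noisy rings plus two pure-noise features."""
    y = np.where(rng.random(n) < 0.5, -1.0, 1.0)
    radius = np.where(y > 0, 3.0, 1.0) + 0.2 * rng.standard_normal(n)
    angle = rng.uniform(0.0, 2.0 * np.pi, n)
    signal = np.column_stack([radius * np.cos(angle), radius * np.sin(angle)])
    noise = 0.3 * rng.standard_normal((n, 2))
    return np.hstack([signal, noise]), y

def run_demo(seed=0):
    """Fit KPCA on the training split, train LSSVM on the projections, return test accuracy."""
    rng = np.random.default_rng(seed)
    X, y = make_rings(150, rng)
    X_tr, y_tr, X_te, y_te = X[:100], y[:100], X[100:], y[100:]
    kpca = kpca_fit(X_tr, n_components=8, gamma=0.5)
    F_tr, F_te = kpca_transform(kpca, X_tr), kpca_transform(kpca, X_te)
    clf = lssvm_fit(F_tr, y_tr, C=10.0, gamma=1.0)
    pred = np.sign(lssvm_decision(clf, F_te))
    return float((pred == y_te).mean())

if __name__ == "__main__":
    print("test accuracy:", run_demo())
```

Replacing `kpca_fit` and `kpca_transform` with a linear PCA, or skipping the projection step entirely, would give rough analogues of the PCA-LSSVM and plain LSSVM baselines that the abstract compares against. Because LSSVM replaces the inequality constraints of a standard SVM with equality constraints, training reduces to the single `np.linalg.solve` call shown here.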

Key words: fasting blood glucose, health record, Principal Component Analysis (PCA), Kernel Principal Component Analysis (KPCA), Least Squares Support Vector Machine (LSSVM)

摘要: 糖尿病是一种可防可控的慢性疾病,会产生很多并发症,对人体危害很大,因此早期诊断糖尿病并干预生活方式对预防糖尿病慢性并发症十分必要。利用健康档案中数据来预测空腹血糖水平,因为空腹血糖水平的高低是早期诊断和干预的一个重要依据,但是健康档案中数据存在维度广、噪声多、强耦合、非线性等特点,为此提出了基于KPCA和LSSVM结合的方法进行建模,并将LSSVM、PCA-LSSVM、KPCA-LSSVM这3种模型进行比较,结果表明KPCA-LSSVM准确性比LSSVM、PCA-LSSVM大幅提高,ROC曲线的积分面积也接近于1,说明KPCA-LSSVM能够运用于空腹血糖的预测,也为医疗数据挖掘提供一种新的参考办法。

关键词: 空腹血糖, 健康档案, 主成分分析(PCA), 核主成分分析(KPCA), 最小二乘支持向量机(LSSVM)