Computer Engineering and Applications ›› 2018, Vol. 54 ›› Issue (3): 250-254. DOI: 10.3778/j.issn.1002-8331.1608-0441

• Engineering and Applications •

Brief modeling study of OSAHS patients screening from snoring persons based on ROSE and C5.0 algorithm

DU Guodong¹, LV Yunhui², MA Lei¹, XIANG Yan¹, SHAO Dangguo¹, LEI Qiang¹, HU Rong¹

  1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
  2. Department of Respiratory Medicine, First People’s Hospital of Yunnan Province, Kunming 650032, China
  • Online: 2018-02-01 Published: 2018-02-07

Abstract: When data from medical information systems are used to predict and analyze Obstructive Sleep Apnea-Hypopnea Syndrome (OSAHS), the data are imbalanced. To address this problem, and building on existing clinical research, a preliminary screening model based on Random Over-Sampling Examples (ROSE) and the C5.0 decision tree algorithm is proposed. First, the collected anthropometric data are preprocessed: outliers are removed and missing values are filled in. Next, the ROSE algorithm is applied to balance the preprocessed data. A C5.0 classifier is then trained on the balanced data to build the screening model. Finally, the screening performance of the model is evaluated by 10-fold cross-validation. The experimental results show that using this model to screen snoring patients for OSAHS can effectively improve screening efficiency.

Key words: imbalanced data, screening model, Random Over Sampling Examples (ROSE), C5.0 decision tree
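
The balancing and validation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `rose_like_balance` and `stratified_folds`, the two toy features, the class sizes, and the normal-reference bandwidth rule are all assumptions made for illustration. C5.0 itself is not implemented here; any classifier could be trained on nine folds and tested on the tenth.

```python
import random
import statistics
from collections import defaultdict


def rose_like_balance(X, y, n_out, rng):
    """Smoothed-bootstrap oversampling in the spirit of ROSE (sketch only).

    Each synthetic row is made by (1) picking a class with equal probability,
    (2) picking a real row of that class at random, and (3) adding Gaussian
    noise per feature, scaled by an assumed normal-reference bandwidth.
    """
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    labels = sorted(by_class)
    d = len(X[0])

    # Per-class, per-feature bandwidth: constant * feature standard deviation.
    bandwidth = {}
    for label, rows in by_class.items():
        c = (4.0 / ((d + 2) * len(rows))) ** (1.0 / (d + 4))
        bandwidth[label] = [
            c * statistics.stdev([r[q] for r in rows]) for q in range(d)
        ]

    X_new, y_new = [], []
    for _ in range(n_out):
        label = rng.choice(labels)
        seed_row = rng.choice(by_class[label])
        X_new.append(
            [rng.gauss(v, h) for v, h in zip(seed_row, bandwidth[label])]
        )
        y_new.append(label)
    return X_new, y_new


def stratified_folds(y, k, rng):
    """Assign every index to one of k folds, spreading each class evenly."""
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        idx = by_class[label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


# Toy imbalanced data: 180 rows of class 0 and 20 rows of class 1, with two
# hypothetical anthropometric features (values are made up for illustration).
rng = random.Random(0)
X = [[rng.gauss(24, 3), rng.gauss(36, 2)] for _ in range(180)]
X += [[rng.gauss(29, 3), rng.gauss(40, 2)] for _ in range(20)]
y = [0] * 180 + [1] * 20

X_bal, y_bal = rose_like_balance(X, y, n_out=400, rng=rng)
folds = stratified_folds(y_bal, k=10, rng=rng)
print("class 1 share after balancing:", sum(y_bal) / len(y_bal))
print("fold sizes:", [len(f) for f in folds])
```

The sketch balances first and splits into folds afterwards, which is the order the abstract describes.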