Computer Engineering and Applications, 2018, Vol. 54, Issue (8): 260-264. DOI: 10.3778/j.issn.1002-8331.1709-0336

• Engineering and Applications •

Research on deep semantic features for disease-association relation extraction

KANG Xuqin1, WU Ou2, WANG Lei1, ZHANG Yin1, YANG Shuai1

  1. Institute of Health Service and Medical Information, Academy of Military Medical Sciences, Beijing 100850, China
  2. Center for Applied Mathematics & School of Mathematics, Tianjin University, Tianjin 300072, China
  • Online: 2018-04-15 Published: 2018-05-02

Abstract: Identifying, from the large body of biomedical literature, the beneficial and harmful factors that affect a disease provides an important reference for research on disease prevention and treatment. However, when traditional machine learning methods are applied to the binary classification problem of identifying disease-influencing factors, accuracy reaches a bottleneck at a certain level and is difficult to improve further. To improve the classification performance of binary classification models in the biomedical domain, a method that combines a convolutional neural network with a Support Vector Machine (SVM) is applied to data on the two kinds of factors, those beneficial and those harmful to disease. The method ultimately outperforms traditional machine learning, raising classification accuracy from 90.44% for the best SVM to 94.38%, and thereby better identifies the factors that affect disease.

Key words: relation extraction, classification problem, deep learning, machine learning, convolutional neural network, Support Vector Machine (SVM)