Computer Engineering and Applications ›› 2016, Vol. 52 ›› Issue (10): 135-140.

Deep web data source selection for entity association retrieval in the area of medicine

DENG Song1,2, CHEN Hui1   

  1. School of Software & Communication Engineering, Jiangxi University of Finance and Economics, Nanchang 330013, China
  2. Jiangxi Key Laboratory of Data and Knowledge Engineering, Jiangxi University of Finance and Economics, Nanchang 330013, China
  • Online: 2016-05-15  Published: 2016-05-16

面向医学领域实体关联检索的深网数据源选择

邓  松1,2,陈  辉1   

  1. 江西财经大学 软件与通信工程学院,南昌 330013
  2. 江西财经大学 数据与知识工程江西省高校重点实验室,南昌 330013

Abstract: Every field contains a large number of deep web data sources, and retrieving all of them to obtain the required information imposes a very heavy workload. Data source selection techniques were introduced to address this problem. Entities in the medical domain are richly interrelated, and integrating this association information effectively can help people live healthier lives. To improve the efficiency of information integration for entity associations in the medical domain, this paper proposes a data source selection method based on the characteristics of associated entities. First, a matrix summary of entity associations is constructed from the entity weights and link information in the entity association graph. Second, a method for calculating data source relevance is proposed based on the intent of the entity association query. Extensive experiments on a domain data set show that the proposed method achieves relatively high accuracy and recall, and can therefore provide effective support for information integration in the medical field.

Key words: data source selection, summary, biomedical literature, entity association

摘要: 每个领域下的深网数据源众多,如果检索领域内所有深网以获取所需的集成信息,那么工作量将十分巨大,因而数据源选择技术应运而生。医学领域实体间存在着丰富的关联关系,把相关关联信息进行有效集成可以促进人们健康生活。为提升医学领域实体关联的信息集成效率,提出了一种基于实体关联特征的数据源选择方法。基于实体关联图中的实体权重以及链接信息,构建了实体关联矩阵摘要;基于实体关联查询意图提出了数据源相关性计算方法。利用领域数据集进行了大量的实验,结果表明所提出方法准确率和召回率较高,可以为医学领域信息集成提供有效支撑。

关键词: 数据源选择, 摘要, 医学, 实体关联