Research on entity extraction method of Deep Web data integration

Computer Engineering and Applications, 2012, Vol. 48, Issue 36: 160-163.

ZHAO Haixia1, LI Daoshen1, LIU Yong1, ZHAO Jiacheng2

1. Electronic & Information Engineering College, Henan University of Science and Technology, Luoyang, Henan 471003, China
2. Software College, Changchun University of Science and Technology, Changchun 130000, China

Online: 2012-12-21 Published: 2012-12-21

Original Chinese title: 一种Deep Web查询结果的实体抽取方法 (An entity extraction method for Deep Web query results)

Abstract

Abstract: The Deep Web holds a wealth of high-quality information, and the result pages that contain it can be retrieved through an integrated Deep Web query interface. Extracting entity data from those result pages effectively is therefore the key to Deep Web data integration. This paper proposes a method that combines an index with edit similarity to extract data from Deep Web result pages. Extensive experimental results show that the approach is feasible and improves both the precision and the recall of Deep Web data extraction.

Key words: Deep Web, data extraction, Document Object Model (DOM) tree, index, similarity
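
The abstract names edit similarity as one ingredient of the method but does not define it here. As a minimal sketch, assuming the common definition of edit similarity as one minus the Levenshtein distance divided by the longer sequence length, the idea can be illustrated on tag sequences taken from DOM subtrees. The function names and the sample tag sequences below are hypothetical and are not taken from the paper; how the paper combines this measure with its index is not shown.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists of tags)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def edit_similarity(a, b):
    """Normalized similarity in [0, 1]; 1.0 means identical sequences."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


# Hypothetical tag sequences of three sibling subtrees in a result page.
record_1 = ["tr", "td", "a", "td", "span"]
record_2 = ["tr", "td", "a", "td", "b"]
banner = ["div", "img"]

print(edit_similarity(record_1, record_2))  # structurally close subtrees
print(edit_similarity(record_1, banner))    # structurally unrelated subtrees
```

Under this assumption, subtrees whose tag sequences score above some chosen threshold would be treated as candidate data records of the same entity type, while low-scoring subtrees such as banners or navigation blocks would be discarded.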

ZHAO Haixia, LI Daoshen, LIU Yong, ZHAO Jiacheng. Research on entity extraction method of Deep Web data integration[J]. Computer Engineering and Applications, 2012, 48(36): 160-163.
