Method of text information extraction based on dependency parsing and HMM

Computer Engineering and Applications, 2012, Vol. 48, Issue 9: 138-140.

• Database, Signal and Information Processing •

YUAN Lu, MENG Zuqiang, XU Ke

College of Computer and Electronic Information, Guangxi University, Nanning 530004, China

Received: 1900-01-01 Revised: 1900-01-01 Online: 2012-03-21 Published: 2012-04-11

Abstract

Abstract: Information extraction is an important part of text information processing, yet most current research on it targets semi-structured text. For free text, this paper proposes a text information extraction algorithm that combines dependency parsing with a hidden Markov model (HMM). The algorithm first applies dependency parsing for shallow syntactic analysis of each sentence, then formulates rules on the parse results to form the input sequence of the HMM. It thereby extracts information from free text while retaining the advantages of the HMM: it is easy to build, adapts well, and achieves high extraction accuracy. Experimental results show that the new algorithm performs well on recall, precision, and accuracy, which demonstrates its effectiveness.

Key words: information extraction, free text, hidden Markov model (HMM), dependency parsing
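The abstract does not give the model's states, observation symbols, or probabilities, so the following is only a minimal sketch of the decoding step such a pipeline might use: Viterbi decoding over a sequence of dependency-relation labels. The state names (`TARGET`, `OTHER`), the labels (`SBV`, `VOB`, `ADV`), and every probability are assumptions chosen for illustration, not values from the paper.

```python
import math

# Hypothetical model: states, symbols, and probabilities are illustrative only.
STATES = ["TARGET", "OTHER"]
START = {"TARGET": 0.3, "OTHER": 0.7}
TRANS = {
    "TARGET": {"TARGET": 0.6, "OTHER": 0.4},
    "OTHER": {"TARGET": 0.3, "OTHER": 0.7},
}
# Emission probabilities over assumed dependency-relation labels.
EMIT = {
    "TARGET": {"SBV": 0.5, "VOB": 0.4, "ADV": 0.1},
    "OTHER": {"SBV": 0.1, "VOB": 0.1, "ADV": 0.8},
}


def viterbi(observations):
    """Return the most likely state sequence for the observed labels."""
    if not observations:
        return []
    # Work in log space to avoid underflow on long sequences.
    scores = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in STATES
    }
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for landing in state s.
            best_prev = max(
                STATES, key=lambda p: scores[p] + math.log(TRANS[p][s])
            )
            new_scores[s] = (
                scores[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][obs])
            )
            pointers[s] = best_prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    print(viterbi(["ADV", "SBV", "VOB"]))
```

In this toy setup, positions decoded as `TARGET` would mark the tokens to extract. In a real system the probabilities would be estimated from annotated training text rather than written by hand.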

YUAN Lu, MENG Zuqiang, XU Ke. Method of text information extraction based on dependency parsing and HMM[J]. Computer Engineering and Applications, 2012, 48(9): 138-140.
