Computer Engineering and Applications ›› 2012, Vol. 48 ›› Issue (26): 243-248.

Application of hidden Markov model in information extraction of environmental protection files

PAN Peng1,2, ZHU Yunqiang2, ZHU Qi3, ZHAO Xiaohong4   

  1. School of Resources and Environment Science, Wuhan University, Wuhan 430079, China
  2. State Key Lab of Resources and Environmental Information System, Institute of Geographical Sciences and Resources Research, Chinese Academy of Sciences, Beijing 100101, China
  3. Information Center, Ministry of Environmental Protection of the PRC, Beijing 100029, China
  4. Environmental Engineering Assessment Center, Ministry of Environmental Protection of the PRC, Beijing 100012, China
  • Online: 2012-09-11 Published: 2012-09-21

隐马尔可夫模型在环保档案信息抽取中的应用

潘  鹏1,2,诸云强2,朱  琦3,赵晓宏4   

  1. 武汉大学 资源与环境科学学院,武汉 430079
  2. 中国科学院 地理科学与资源研究所 资源与环境信息系统国家重点实验室,北京 100101
  3. 环境保护部 信息中心,北京 100029
  4. 环境保护部 环境工程评估中心,北京 100012

Abstract: As environmental problems grow more serious, effective methods are urgently needed to extract useful information from environmental protection files in support of macro-level decision-making. Taking environmental impact reports as an example, this paper studies how to use the Hidden Markov Model (HMM) to extract environmental impact assessment information for construction projects. The principles and applications of the HMM are described. After analyzing the characteristics of environmental impact reports, the paper sets out the basic idea of applying the HMM to extract information from the report text, and then gives the method and specific steps for building and applying the model. The method is verified on an example, and the results indicate that HMM-based extraction of environmental protection information achieves high recall and precision.

Key words: Hidden Markov Model (HMM), environmental protection file, environmental impact report, Information Extraction (IE), decision support

摘要: 面对突出的环境问题,亟需有效的方法从环境保护档案中抽取有用的信息用于支持环境保护等宏观决策。以建设项目环境影响报告书为例,研究如何利用隐马尔可夫模型来抽取建设项目的环境影响评价信息。阐明隐马尔可夫模型的原理与应用情况,分析报告书特点并明确应用模型进行报告书文本信息抽取的基本思想,并给出模型建立和应用的方法及具体步骤。通过实例验证得出,利用隐马尔可夫模型抽取环境保护信息能够获得较高的召回率和精确度,整体效果较好。

关键词: 隐马尔可夫模型, 环境保护档案, 环境影响报告书, 信息抽取, 决策支持
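
To illustrate the general idea of HMM-based information extraction described in the abstract, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not say which decoding algorithm, state set, or training data the paper uses. The sketch assumes Viterbi decoding (a standard choice for HMMs), two hypothetical states (`OTHER` for background text, `TARGET` for a field to extract), and invented tokens and probabilities. The `floor` parameter is an assumed crude smoothing for unseen tokens.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p, floor=1e-6):
    """Return the most likely hidden-state sequence for the tokens in obs."""

    def lp(p):
        # Log-probability with a small floor so unseen events do not give log(0).
        return math.log(p if p > 0 else floor)

    # best[t][s] = (log-probability of the best path ending in s at step t,
    #               previous state on that path)
    best = [{s: (lp(start_p[s]) + lp(emit_p[s].get(obs[0], 0.0)), None)
             for s in states}]
    for t in range(1, len(obs)):
        row = {}
        for s in states:
            prev, score = max(
                ((r, best[t - 1][r][0] + lp(trans_p[r].get(s, 0.0)))
                 for r in states),
                key=lambda pair: pair[1],
            )
            row[s] = (score + lp(emit_p[s].get(obs[t], 0.0)), prev)
        best.append(row)

    # Backtrack from the best final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        last = best[t][last][1]
        path.append(last)
    path.reverse()
    return path


# Hypothetical toy model: all numbers below are invented for illustration.
states = ["OTHER", "TARGET"]
start_p = {"OTHER": 0.8, "TARGET": 0.2}
trans_p = {
    "OTHER": {"OTHER": 0.6, "TARGET": 0.4},
    "TARGET": {"OTHER": 0.3, "TARGET": 0.7},
}
emit_p = {
    "OTHER": {"project": 0.4, "name": 0.4, "wastewater": 0.1, "plant": 0.1},
    "TARGET": {"project": 0.05, "name": 0.05, "wastewater": 0.45, "plant": 0.45},
}

tokens = ["project", "name", "wastewater", "plant"]
labels = viterbi(tokens, states, start_p, trans_p, emit_p)
extracted = [tok for tok, lab in zip(tokens, labels) if lab == "TARGET"]
print(list(zip(tokens, labels)))
print(extracted)
```

In this toy model the decoder labels the first two tokens `OTHER` and the last two `TARGET`, so the extracted field is `["wastewater", "plant"]`. In practice the transition and emission probabilities would be estimated from labeled report text rather than written by hand.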