Computer Engineering and Applications ›› 2023, Vol. 59 ›› Issue (2): 153-160. DOI: 10.3778/j.issn.1002-8331.2107-0157

• Pattern Recognition and Artificial Intelligence •

Chinese Event Extraction Using Question Answering

LIU Zeyi, YU Wenhua, HONG Zhiyong, KE Guanzhou, TAN Rongjie   

  1. Faculty of Intelligent Manufacturing, Wuyi University, Jiangmen, Guangdong 529020, China
  • Online:2023-01-15 Published:2023-01-15


Abstract: Event extraction is a fundamental task in natural language processing. Framing event extraction as question answering addresses a limitation of traditional methods, which fail to capture the semantic information shared by similar argument roles across different event types. Existing English event extraction methods built on this approach are limited by language barriers, and their question templates perform poorly on Chinese text. To address this, a set of rules is designed for generating question templates suited to Chinese event extraction. The BERT pre-trained model is chosen as the base model, the question answering approach is applied to the Chinese event extraction task, and the method is tested on the ACE2005 Chinese dataset. The F1 scores reach 77.7%, 68.5%, 51.5%, and 48.0% on trigger identification, trigger classification, argument identification, and argument classification, respectively. To a certain extent, these results verify the effectiveness of the designed template generation rules and show that the question answering approach achieves good extraction performance on the Chinese event extraction task.

Key words: event extraction, question answering, natural language processing
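
As an illustration of the question answering formulation described in the abstract, the following minimal Python sketch shows how a question might be built from a template, paired with a sentence in the BERT-style "[CLS] question [SEP] context [SEP]" layout, and scored with F1. The template wording, the function names, and the example event type and role are assumptions made here for illustration; the abstract does not state the paper's actual template generation rules, and a real pipeline would rely on a tokenizer to insert the special tokens.

```python
from typing import Optional

# Hypothetical templates (assumption): the abstract does not give the
# paper's actual question template rules.
TRIGGER_QUESTION = "句子中的触发词是什么"
ARGUMENT_TEMPLATE = "{event_type}事件中的{role}是什么"
ARGUMENT_TEMPLATE_WITH_TRIGGER = "由{trigger}触发的{event_type}事件中的{role}是什么"


def build_argument_question(event_type: str, role: str,
                            trigger: Optional[str] = None) -> str:
    """Fill a question template for one argument role of one event type."""
    if trigger:
        return ARGUMENT_TEMPLATE_WITH_TRIGGER.format(
            trigger=trigger, event_type=event_type, role=role)
    return ARGUMENT_TEMPLATE.format(event_type=event_type, role=role)


def build_qa_input(question: str, sentence: str) -> str:
    """Show the question/context pair in BERT's two-segment layout.

    Written as a plain string for readability; in practice a tokenizer
    adds the special tokens and segment ids.
    """
    return f"[CLS]{question}[SEP]{sentence}[SEP]"


def f1_score(n_correct: int, n_predicted: int, n_gold: int) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    if n_correct == 0 or n_predicted == 0 or n_gold == 0:
        return 0.0
    precision = n_correct / n_predicted
    recall = n_correct / n_gold
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Example event type, role, trigger, and sentence are made up.
    question = build_argument_question("攻击", "攻击者", trigger="袭击")
    print(build_qa_input(question, "武装分子袭击了一个检查站"))
    print(round(f1_score(n_correct=3, n_predicted=4, n_gold=6), 3))
```

Including the event type in the question is one way to let the same role name, asked about under different event types, yield different questions, which is the kind of role semantics the abstract says traditional methods miss.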
