Computer Engineering and Applications ›› 2022, Vol. 58 ›› Issue (6): 157-163. DOI: 10.3778/j.issn.1002-8331.2009-0494

• Pattern Recognition and Artificial Intelligence •

Chinese Event Argument Extraction Based on GLSTM and Attention

CAO Yukun, SUN Tao   

  1. School of Computer Science & Technology, Shanghai University of Electric Power, Shanghai 200090, China
  • Online: 2022-03-15 Published: 2022-03-15

Abstract: Event extraction is an information extraction (IE) task that aims to identify the triggers and arguments of events. Because it is sensitive to data sparsity, argument extraction is a difficult point in Chinese event extraction, and research has focused on feature engineering. Chinese grammar is considerably more complex than English grammar, so methods designed to capture features of English text are not very effective on Chinese tasks, while the neural network models in common use consider only contextual information and cannot also account for lexical and syntactic features. To address the lexical and syntactic characteristics of Chinese, this paper proposes AGCEE (attention and GLSTM based Chinese event extraction), a Chinese event argument extraction method that combines a grouped long-short term memory (GLSTM) network with attention. The model fuses word features and sentence features through the attention mechanism, captures the contextual information of the sentence with GLSTM, and extracts event arguments with conditional random fields (CRF). Experiments on a public data set are conducted to verify the effectiveness of the model.

Key words: event argument extraction, attention mechanism, fusion features, grouped long-short term memory(GLSTM)
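
The abstract says that word features and sentence features are fused by attention before being passed to the GLSTM, but it does not give the formulation. The following is a minimal sketch of one possible fusion step, not the authors' implementation: the function name `attention_fuse`, the scaled dot-product scoring, and the concatenation of each word vector with an attention-weighted copy of the sentence vector are all assumptions made for illustration. The GLSTM encoder and the CRF layer are omitted.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    exp = np.exp(shifted)
    return exp / exp.sum()


def attention_fuse(word_feats, sent_feat):
    """Fuse per-token word features with one sentence-level feature vector.

    Hypothetical formulation (not taken from the paper):
      word_feats: array of shape (T, d), one row per token
      sent_feat:  array of shape (d,), one vector for the whole sentence
    Returns the fused features of shape (T, 2 * d) and the attention weights.
    """
    d = word_feats.shape[1]
    # Score each token against the sentence vector (scaled dot product).
    scores = word_feats @ sent_feat / np.sqrt(d)        # shape (T,)
    weights = softmax(scores)                           # non-negative, sums to 1
    # Scale the sentence vector by each token's attention weight.
    weighted_sent = weights[:, None] * sent_feat[None, :]  # shape (T, d)
    # Concatenate word-level and weighted sentence-level features per token.
    fused = np.concatenate([word_feats, weighted_sent], axis=1)
    return fused, weights


# Toy input: 5 tokens with 8-dimensional features (random placeholder values).
rng = np.random.default_rng(0)
word_feats = rng.normal(size=(5, 8))
sent_feat = rng.normal(size=8)
fused, weights = attention_fuse(word_feats, sent_feat)
```

In a full pipeline of the kind the abstract describes, `fused` would be the input sequence to the context encoder, and the encoder outputs would be decoded into argument labels by the CRF.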