Computer Engineering and Applications, 2024, Vol. 60, Issue (15): 133-142. DOI: 10.3778/j.issn.1002-8331.2304-0387

• Pattern Recognition and Artificial Intelligence •

General Framework for Event Extraction Based on Task Transformation

LI Jian, HU Ruijuan, ZHANG Keliang, LIU Haiyan

  1. Luoyang Campus, PLA Strategic Support Force Information Engineering University, Luoyang, Henan 471003, China
  2. PLA Strategic Support Force Information Engineering University, Zhengzhou 450001, China
  • Online: 2024-08-01 Published: 2024-07-30

Abstract: Scattered arguments, multiple events, and overlapping arguments are long-standing challenges in event extraction, and many event extraction tasks also lack trigger-word or position annotations. To address these problems, a general framework for event extraction based on task transformation is proposed, comprising three modules: task transformation, relation extraction, and event prediction. Annotated events with and without trigger words are transformed into different forms of relational triples. These triples, together with the original text, serve as training data for the relation extraction model. During event prediction, relational triples are first extracted from the input text and then restored to the target events. For training corpora that lack position annotations, an argument localization algorithm based on shortest distance is designed. The framework achieves an average F1 score of 81.6% on the ChFinAnn dataset and an F1 score of 72.04% on the DuEE-Fin dataset (online submission), both reaching the current state of the art (SOTA). The experimental results show that the framework not only significantly improves event extraction performance but also adapts to a wide range of tasks.

Key words: event extraction, general framework, task transformation, argument positioning
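
The abstract describes the pipeline only in outline, so the following is a minimal sketch of the idea under stated assumptions, not the authors' implementation. It assumes a triple has the form (subject, "event_type:role", argument), that the trigger word is the subject when one is annotated, and that the first listed argument acts as a pivot subject when no trigger exists. It also assumes that shortest-distance localization means choosing the occurrence of an argument string closest to a known anchor position. All function names (`event_to_triples`, `triples_to_events`, `locate_argument`) are hypothetical.

```python
from collections import defaultdict


def event_to_triples(event):
    """Task transformation: one annotated event -> relation triples.

    Assumed triple format: (subject, "event_type:role", argument).
    """
    items = list(event["arguments"].items())
    if not items:
        return []
    trigger = event.get("trigger")
    # Assumption: with a trigger, the trigger is the subject; without one,
    # the first listed argument serves as a pivot subject.
    subject = trigger if trigger is not None else items[0][1]
    return [(subject, f"{event['type']}:{role}", value) for role, value in items]


def triples_to_events(triples):
    """Event prediction: group triples sharing a subject and event type."""
    grouped = defaultdict(dict)
    for subject, relation, obj in triples:
        event_type, role = relation.split(":", 1)
        grouped[(subject, event_type)][role] = obj
    return [
        {"type": event_type, "anchor": subject, "arguments": roles}
        for (subject, event_type), roles in grouped.items()
    ]


def find_all(text, value):
    """Return the start offset of every occurrence of value in text."""
    positions, start = [], text.find(value)
    while start != -1:
        positions.append(start)
        start = text.find(value, start + 1)
    return positions


def locate_argument(text, value, anchor_pos):
    """Pick the occurrence of value nearest to anchor_pos (assumed criterion)."""
    candidates = find_all(text, value)
    if not candidates:
        return None
    return min(candidates, key=lambda pos: abs(pos - anchor_pos))


event = {
    "type": "Pledge",
    "trigger": "pledged",
    "arguments": {"pledger": "Acme", "shares": "100"},
}
triples = event_to_triples(event)
restored = triples_to_events(triples)
```

In this sketch, grouping by subject and event type is what lets several events in one text stay separate, and one argument string may appear as the object of several triples, which is how overlapping arguments would be represented. The real framework extracts the triples with a trained relation extraction model rather than reading them from annotations.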