Computer Engineering and Applications, 2021, Vol. 57, Issue (6): 207-211. DOI: 10.3778/j.issn.1002-8331.2004-0146

• Engineering and Applications •

Research on Semantic Role Labeling of University Policy Based on BILSTM-CRF

XU Jianguo, LIU Yonghui, LIU Mengfan

  1. College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, Shandong 266590, China
  • Online: 2021-03-15 Published: 2021-03-12

Abstract:

This paper combines a bidirectional long short-term memory network augmented with a self-attention mechanism (SelfAtt-BILSTM) and a conditional random field (CRF) into a SelfAtt-BILSTM-CRF model, which performs semantic role labeling on policy texts in order to extract their main content. The experimental data set consists of policy documents from one university. The BILSTM layer automatically learns the contextual features of each sentence as a sequence, the self-attention mechanism increases the weight of important feature elements, and the CRF layer uses these features for sequence labeling, extracting the semantic roles that capture the main content of each policy document. In comparative experiments, the model extracts policy text content effectively, reaching an F1 score of 78.99% on the annotated data set. The results also show that the self-attention mechanism effectively improves the semantic role labeling performance of the neural network model.

Key words: bidirectional long short-term memory, conditional random field, self-attention mechanism, semantic role labeling, deep learning
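
The sequence-labeling step performed by the CRF layer can be illustrated with a minimal Viterbi decoding sketch. Everything concrete here is an assumption made for illustration: the tag set `TAGS`, the emission scores (standing in for the per-token output of the SelfAtt-BILSTM layers), and the transition scores are hypothetical and are not taken from the paper, whose role inventory and trained parameters are not given in the abstract.

```python
from typing import List, Tuple

# Hypothetical tag set for illustration only; not the paper's role inventory.
TAGS: List[str] = ["O", "B-ARG", "I-ARG", "B-PRED"]


def viterbi_decode(
    emissions: List[List[float]],
    transitions: List[List[float]],
) -> Tuple[List[int], float]:
    """Return the highest-scoring tag index sequence and its total score.

    emissions[t][j]   : score of tag j at position t (assumed to come from
                        the attention-weighted BILSTM features)
    transitions[i][j] : score of moving from tag i to tag j (CRF parameters)
    """
    if not emissions:
        return [], 0.0

    n_tags = len(transitions)
    # best[j] = score of the best path that ends in tag j at the current position
    best = list(emissions[0])
    backpointers: List[List[int]] = []

    for t in range(1, len(emissions)):
        new_best: List[float] = []
        pointers: List[int] = []
        for j in range(n_tags):
            # Choose the previous tag that maximizes the path score into tag j.
            prev = max(range(n_tags), key=lambda i: best[i] + transitions[i][j])
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)

    # Trace the best path backwards from the best final tag.
    last = max(range(n_tags), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]


NEG = -10.0  # assumed penalty for transitions that are invalid under BIO tagging

# Rows are the "from" tag, columns the "to" tag, in the order of TAGS.
TRANSITIONS: List[List[float]] = [
    [0.0, 0.0, NEG, 0.0],  # O      -> I-ARG is penalized
    [0.0, 0.0, 0.0, 0.0],  # B-ARG
    [0.0, 0.0, 0.0, 0.0],  # I-ARG
    [0.0, 0.0, NEG, 0.0],  # B-PRED -> I-ARG is penalized
]

# Made-up emission scores for a four-token sentence.
EMISSIONS: List[List[float]] = [
    [0.2, 1.0, 0.1, 0.0],
    [0.1, 0.0, 1.2, 0.0],
    [0.3, 0.0, 0.0, 1.5],
    [0.9, 0.2, 1.0, 0.0],  # per-token argmax would pick I-ARG here
]

if __name__ == "__main__":
    path, score = viterbi_decode(EMISSIONS, TRANSITIONS)
    print([TAGS[i] for i in path])
```

With these made-up scores, choosing the best tag for each token independently would label the last token `I-ARG` directly after `B-PRED`, which is not a valid sequence. Decoding over the whole sequence with the transition scores yields `B-ARG, I-ARG, B-PRED, O` instead, which shows why a CRF layer is commonly placed on top of per-token features.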