Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (18): 255-262. DOI: 10.3778/j.issn.1002-8331.2005-0412

Frame Disambiguation of FrameNet Based on SVM and CRF Two-Stage Model

QIN Boyu, HAO Xiaoyan, LIU Yongfang   

  1. College of Information and Computer, Taiyuan University of Technology, Taiyuan 030600, China
  • Online: 2021-09-15 Published: 2021-09-13

基于SVM和CRF双层模型的FrameNet框架消歧

秦博宇,郝晓燕,刘永芳   

  1. 太原理工大学 信息与计算机学院,太原 030600

Abstract:

Frame disambiguation is the task of automatically identifying the frame evoked by an ambiguous target word from its context in a given sentence. Traditional FrameNet frame disambiguation methods use a single classification model that ignores the relationships between target words, which makes hidden features difficult to extract and leaves the results heavily dependent on the classifier's performance and parameter settings. To address these problems, a frame disambiguation method based on a two-stage SVM and CRF model is proposed. The method uses a divide-and-conquer approach to recast frame disambiguation as classification of target words followed by sequence labeling. In the first stage, an SVM model coarsely classifies the input corpus and produces a classification label sequence. In the second stage, a CRF model takes the text sequence and the SVM's classification label sequence as input and adds the classification labels to its feature template for further sequence labeling. The experimental data consist of 18 lexical units from FrameNet that can evoke multiple frames, together with 2 614 example sentences. Experimental results show that the two-stage SVM and CRF model achieves higher accuracy in frame disambiguation than traditional single-model methods, indicating that it is well suited to FrameNet frame disambiguation.

Key words: FrameNet, frame disambiguation, support vector machine, conditional random field, two-stage model
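
As a rough illustration of the second stage described in the abstract, the sketch below shows one way first-stage classification labels could be added to a CRF-style feature template. It is not the authors' implementation: the function names, the feature keys, the window size, the `<PAD>` marker, and the placeholder labels `FRAME_A` and `O` are all assumptions made for this example, and the first-stage SVM output is simply given as a list.

```python
from typing import Dict, List


def crf_features(tokens: List[str], stage1_labels: List[str], i: int,
                 window: int = 1) -> Dict[str, str]:
    """Build a feature dict for position i (hypothetical template).

    Each position sees its own word and first-stage label, plus the
    words and first-stage labels of its neighbours within `window`.
    """
    if len(tokens) != len(stage1_labels):
        raise ValueError("tokens and stage1_labels must have the same length")

    feats = {
        "bias": "1",
        "word": tokens[i].lower(),
        # The first-stage (SVM) label becomes an ordinary CRF feature.
        "svm_label": stage1_labels[i],
    }
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        key = f"{offset:+d}"  # "-1", "+1", ...
        if 0 <= j < len(tokens):
            feats[f"word[{key}]"] = tokens[j].lower()
            feats[f"svm_label[{key}]"] = stage1_labels[j]
        else:
            # Mark positions that fall outside the sentence.
            feats[f"word[{key}]"] = "<PAD>"
    return feats


def sentence_features(tokens: List[str], stage1_labels: List[str],
                      window: int = 1) -> List[Dict[str, str]]:
    """Feature dicts for every position in one sentence."""
    return [crf_features(tokens, stage1_labels, i, window)
            for i in range(len(tokens))]


# Placeholder data: "FRAME_A" stands for a frame label predicted by the
# first-stage classifier; "O" marks tokens that are not target words.
tokens = ["She", "runs", "the", "company"]
stage1_labels = ["O", "FRAME_A", "O", "O"]
features = sentence_features(tokens, stage1_labels)
```

A list of feature dicts like `features`, one per sentence, is the kind of input that common CRF toolkits accept for training, so the second-stage model can learn from both the context words and the coarse labels of neighbouring positions.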

摘要:

框架消歧指的是在给定的句子中根据目标词的上下文语境,自动识别出有歧义的目标词所属的框架。针对传统FrameNet框架消歧方法使用单一分类模型时没有考虑到目标词之间的联系而导致隐性特征难以被提取,以及分类结果比较依赖分类模型的性能及参数的设置的问题,提出了一种基于SVM和CRF双层模型的FrameNet框架消歧方法。该方法利用分治思想将框架消歧问题转化为对目标词的分类及序列标注。第一层SVM模型对输入的语料进行粗分类,得到分类标签序列;第二层CRF模型将文本序列和SVM模型的分类标签序列作为输入,将分类标签加入特征模板进一步进行序列标注。实验选取了FrameNet语义知识库中能够激起多个框架的18个词元,2 614条例句作为实验数据。实验结果显示,与传统方法相比,基于SVM和CRF的双层模型有较高的准确率,证明了该方法是一种较为适用的FrameNet框架消歧方法。

关键词: FrameNet, 框架消歧, 支持向量机, 条件随机场, 双层模型