Computer Engineering and Applications ›› 2021, Vol. 57 ›› Issue (20): 104-108. DOI: 10.3778/j.issn.1002-8331.2010-0061

• Theory, Research and Development •

SVM Algorithm for N1+N2 Structure Syntax Relation Determination

YANG Quan   

  1. College of Chinese Language and Culture, Beijing Normal University, Beijing 100875, China
  • Online: 2021-10-15 Published: 2021-10-21

Abstract:

Determining the grammatical relation of a phrase structure is a key problem in Natural Language Processing (NLP). When a Support Vector Machine (SVM) is used for this classification, the core issue is how to convert Chinese phrase structures into numerical vectors suitable for the SVM. Based on a self-built corpus of N1+N2 structures, the two nouns in each structure are semantically encoded with Cilin (Tongyici Cilin, 《同义词词林》), the codes are converted into numerical vectors, and an SVM is used to determine the grammatical relation of the structure. The method is tested by random cross-validation with a training-to-test ratio of 9:1, and the average accuracy is 86.2%. The experimental results show the effectiveness of the proposed algorithm and indicate the necessity of applying artificial intelligence methods to problems in natural language processing.

Key words: natural language processing, artificial intelligence, support vector machine, phrase level, grammatical relation, ontology
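
The encoding and evaluation steps outlined in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the abstract does not say how Cilin codes are mapped to numbers, so the sketch assumes the eight-character, five-level code layout of the extended Cilin (for example `Aa01A01=`) and a simple ordinal mapping per level. The function names `encode_cilin`, `encode_pair`, and `random_split`, and the sample codes, are hypothetical.

```python
import random


def encode_cilin(code):
    """Map one 8-character Cilin-style code (assumed layout, e.g. 'Aa01A01=')
    to five numbers, one per semantic level. The trailing marker is ignored."""
    if len(code) != 8:
        raise ValueError("expected an 8-character code, got %r" % code)
    return [
        ord(code[0]) - ord("A") + 1,  # level 1: uppercase letter
        ord(code[1]) - ord("a") + 1,  # level 2: lowercase letter
        int(code[2:4]),               # level 3: two digits
        ord(code[4]) - ord("A") + 1,  # level 4: uppercase letter
        int(code[5:7]),               # level 5: two digits
    ]


def encode_pair(code_n1, code_n2):
    """Concatenate the encodings of N1 and N2 into one 10-dimensional vector."""
    return encode_cilin(code_n1) + encode_cilin(code_n2)


def random_split(samples, test_fraction=0.1, seed=0):
    """Shuffle the samples and hold out test_fraction of them (9:1 by default).
    Returns (training samples, test samples)."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    # Hypothetical codes for the two nouns of one N1+N2 phrase.
    print(encode_pair("Bp31B02=", "Dk03A01="))
    train, test = random_split(range(20), seed=1)
    print(len(train), len(test))
```

The resulting vectors, paired with grammatical-relation labels, could be passed to any SVM implementation. Repeating `random_split` with different seeds and averaging the test accuracy corresponds to the random cross-validation described above. An ordinal mapping imposes an artificial order on the categories, so a one-hot encoding per level is a plausible alternative.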