Computer Engineering and Applications ›› 2014, Vol. 50 ›› Issue (3): 97-102.

• Database, Data Mining, Machine Learning •

Sentence similarity computation based on HNC theory and dependency syntax

吴佐衍 (WU Zuoyan), 王宇 (WANG Yu)

  1. School of Management Science and Engineering, Dalian University of Technology, Dalian, Liaoning 116024
  • Publication date: 2014-02-01 Release date: 2014-01-26

A new measure of sentence similarity based on Hierarchical Network of Concepts theory and dependency parsing

WU Zuoyan, WANG Yu   

  1. School of Management Science and Engineering, Dalian University of Technology, Dalian, Liaoning 116024, China
  • Online: 2014-02-01 Published: 2014-01-26

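摘要 (Abstract): Sentence similarity computation is an important research topic in natural language processing. Drawing on the Hierarchical Network of Concepts (HNC) theory and dependency syntax theory, this paper proposes a method for computing sentence similarity. The method holds that the similarity of two sentences is jointly determined by the semantic similarity of their words and the similarity of their syntactic structures. It uses the concept representation system for lexical-level association in HNC theory to compute the similarity between words, and uses dependency syntax theory to obtain the collocations and compositional features of the words in a sentence. The method is compared with existing typical sentence similarity algorithms and with human judgment. Experimental results show that the method reflects the semantic differences between sentences well and is a feasible and effective approach.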

关键词 (Keywords): Hierarchical Network of Concepts (HNC), dependency syntax, sentence similarity, natural language processing

Abstract: Sentence similarity computation is an important task in natural language processing. A new measure based on Hierarchical Network of Concepts (HNC) theory and dependency parsing is proposed to compute the semantic similarity of sentences. The meaning of a sentence is determined by the meanings of its individual words and by the structural way in which the words are combined. Semantic information is obtained from HNC theory, and syntactic information is obtained through a deep parsing process. The method is compared with current popular methods and with subjective human judgment. Experiments show that the method performs well and distinguishes the differences between sentences more accurately.

Key words: Hierarchical Network of Concepts (HNC), dependency parsing, sentence similarity, natural language processing
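
The abstract states that sentence similarity is determined jointly by word-level semantic similarity and syntactic-structure similarity. The Python sketch below illustrates one way such a combination could look; it is not the authors' formula. Everything in it is an assumption made for illustration: the exact-match `word_similarity` stands in for the HNC-based word score (the HNC lexicon is not reproduced here), dependency arcs are supplied by hand as `(head, dependent, relation)` triples instead of coming from a parser, the relation labels are arbitrary, and the weight `alpha` and the best-match averaging scheme are hypothetical choices.

```python
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")
# A dependency arc: (head word, dependent word, relation label).
Arc = Tuple[str, str, str]


def word_similarity(a: str, b: str) -> float:
    """Placeholder word score: 1.0 for identical words, otherwise 0.0.

    Assumption: this stands in for an HNC-based word similarity, which
    would return graded values for related but different words.
    """
    return 1.0 if a == b else 0.0


def best_match_average(items_a: Sequence[T], items_b: Sequence[T],
                       score: Callable[[T, T], float]) -> float:
    """Average each item's best match in the other list, in both directions."""
    if not items_a or not items_b:
        return 0.0

    def directed(src: Sequence[T], dst: Sequence[T]) -> float:
        return sum(max(score(x, y) for y in dst) for x in src) / len(src)

    return 0.5 * (directed(items_a, items_b) + directed(items_b, items_a))


def arc_similarity(x: Arc, y: Arc,
                   sim: Callable[[str, str], float] = word_similarity) -> float:
    """Arcs with different relation labels score 0; otherwise average the
    similarity of the two heads and of the two dependents."""
    if x[2] != y[2]:
        return 0.0
    return 0.5 * (sim(x[0], y[0]) + sim(x[1], y[1]))


def sentence_similarity(words_a: Sequence[str], arcs_a: Sequence[Arc],
                        words_b: Sequence[str], arcs_b: Sequence[Arc],
                        alpha: float = 0.5) -> float:
    """Weighted sum of lexical and structural similarity (alpha is hypothetical)."""
    lexical = best_match_average(words_a, words_b, word_similarity)
    structural = best_match_average(arcs_a, arcs_b, arc_similarity)
    return alpha * lexical + (1.0 - alpha) * structural


# Two sentences with the same words but swapped syntactic roles.
words_1 = ["dog", "bites", "man"]
arcs_1 = [("bites", "dog", "nsubj"), ("bites", "man", "obj")]
words_2 = ["man", "bites", "dog"]
arcs_2 = [("bites", "man", "nsubj"), ("bites", "dog", "obj")]

print(sentence_similarity(words_1, arcs_1, words_1, arcs_1))  # 1.0
print(sentence_similarity(words_1, arcs_1, words_2, arcs_2))  # 0.75
```

In this toy example the two sentences share every word, so a purely lexical measure would rate them identical; the structural term lowers the score because the subject and object roles are swapped, which is the kind of difference the abstract says the combined measure is meant to capture.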