Computer Engineering and Applications, 2010, Vol. 46, Issue (26): 136-137. DOI: 10.3778/j.issn.1002-8331.2010.26.042

• Database, Signal and Information Processing •

Model for sentence similarity computing based on multi-features combination

ZHANG Pei-ying (张培颖)

  1. College of Computer & Communication Engineering, China University of Petroleum (East China), Dongying, Shandong 257061, China
  • Received: 2009-02-23 Revised: 2009-04-03 Online: 2010-09-11 Published: 2010-09-11
  • Contact: ZHANG Pei-ying

Abstract: Sentence similarity computation plays an important role across many areas of natural language processing. This paper proposes a sentence similarity model based on multi-feature fusion. The method takes into account six feature similarities of a sentence pair: word form, word order, structure, length, distance, and semantics. Each feature is assigned its own weight, which adjusts that feature's contribution to the overall sentence similarity so that the combined result can be tuned toward the best outcome. Experimental results show that, compared with other methods, this approach describes sentence information more comprehensively and achieves higher accuracy in computing sentence similarity.

Key words: natural language processing, sentence similarity, multi-features combination, structural similarity, semantic similarity
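
The weighting idea in the abstract can be illustrated with a minimal sketch. This record does not give the paper's feature formulas or weight values, so everything below is an assumption made for illustration: the linear weighted sum, the requirement that weights sum to 1, the two toy feature functions (`word_form_similarity`, `length_similarity`), and the example weights are all hypothetical. The remaining four features (word order, structure, distance, semantics) are omitted because their definitions are not available here.

```python
from typing import Dict, List


def word_form_similarity(a: List[str], b: List[str]) -> float:
    """Hypothetical word-form feature: Dice coefficient over distinct words."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return 2.0 * len(sa & sb) / (len(sa) + len(sb))


def length_similarity(a: List[str], b: List[str]) -> float:
    """Hypothetical length feature: 1 - |len(a) - len(b)| / (len(a) + len(b))."""
    total = len(a) + len(b)
    if total == 0:
        return 1.0
    return 1.0 - abs(len(a) - len(b)) / total


def combine(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Assumed fusion rule: weighted sum of per-feature similarities.

    Weights must be non-negative, cover the same features as the scores,
    and sum to 1 so the result stays in [0, 1] when each score does.
    """
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same features")
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * scores[name] for name in scores)


# Usage with two of the six features; the weights are arbitrary examples.
s1 = "the cat sat on the mat".split()
s2 = "the cat lay on the mat".split()
scores = {
    "word_form": word_form_similarity(s1, s2),
    "length": length_similarity(s1, s2),
}
weights = {"word_form": 0.7, "length": 0.3}
print(combine(scores, weights))
```

Raising the weight of one feature makes the overall score follow that feature more closely, which is the tuning mechanism the abstract describes. How the paper actually selects its weights is not stated in this record.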

CLC Number: