Computer Engineering and Applications ›› 2014, Vol. 50 ›› Issue (15): 75-81.

• Network, Communication, and Security •

Verifying truthfulness of Chinese fact statements based on Web
(基于Web的中文陈述句正误验证)

SONG Xin (宋鑫), ZHANG Yu (张瑜), HU Yi (胡轶)

  1. College of Mathematics & Computer Science, Hebei University, Baoding, Hebei 071002, China
  • Online: 2014-08-01 Published: 2014-08-04

Abstract: Web pages contain a considerable amount of untruthful information. To address this problem, the paper proposes a two-step method for determining whether a given Chinese fact statement is truthful. In the first step, the statement is classified according to its doubt unit (the uncertain unit of the statement) and expanded into alternative statements that share its topic. In the second step, each alternative statement, together with the given statement, is submitted as a query to an existing search engine, and the most likely candidate is determined. Both steps rely on features extracted from the results returned by mainstream search engines. Experimental results show that the method achieves a precision of about 85% (above 85% according to the Chinese abstract). With further improvement, the technique could be used to evaluate the credibility of Web pages.

Key words: fact statement, truthfulness, verifying, Web page, reliability
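
The two-step procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not list the features actually extracted from the search results, so the sketch assumes a single stand-in feature, a hit count per query. It also assumes that the doubt unit and the candidate replacement values are supplied by the caller, whereas the paper derives alternatives from search results. All names (`build_alternatives`, `rank_alternatives`, `verify`, `hit_count`, `TOY_HITS`) are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple


def build_alternatives(statement: str, doubt_unit: str,
                       candidates: Iterable[str]) -> List[str]:
    """Step 1 (simplified): swap the doubt unit for each candidate value.

    The given statement is always the first entry of the returned list.
    """
    if doubt_unit not in statement:
        raise ValueError("doubt unit must occur in the statement")
    alternatives = [statement]
    for value in candidates:
        alternative = statement.replace(doubt_unit, value, 1)
        if alternative not in alternatives:
            alternatives.append(alternative)
    return alternatives


def rank_alternatives(alternatives: Iterable[str],
                      hit_count: Callable[[str], int]) -> List[Tuple[str, int]]:
    """Step 2 (simplified): score every alternative and sort by score.

    `hit_count` is an assumed stand-in for the paper's search-result features.
    """
    scored = [(alt, hit_count(alt)) for alt in alternatives]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def verify(statement: str, doubt_unit: str, candidates: Iterable[str],
           hit_count: Callable[[str], int]) -> bool:
    """Judge the statement truthful if it outranks all of its alternatives."""
    ranked = rank_alternatives(
        build_alternatives(statement, doubt_unit, candidates), hit_count)
    return ranked[0][0] == statement


# Toy stand-in for a search engine: hand-written counts, not real data.
TOY_HITS = {
    "中国的首都是北京": 900,  # "The capital of China is Beijing"
    "中国的首都是上海": 40,   # "The capital of China is Shanghai"
    "中国的首都是南京": 60,   # "The capital of China is Nanjing"
}


def toy_hit_count(query: str) -> int:
    """Look up the made-up hit count for a query; unknown queries score 0."""
    return TOY_HITS.get(query, 0)
```

With the toy counts, `verify("中国的首都是上海", "上海", ["北京", "南京"], toy_hit_count)` returns `False`, because the alternative naming 北京 scores highest. Passing the search function in as a parameter keeps the sketch independent of any particular search engine API.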