Computer Engineering and Applications ›› 2009, Vol. 45 ›› Issue (16): 134-137. DOI: 10.3778/j.issn.1002-8331.2009.16.039

• Database and Information Processing •

Method for evaluating usability of large-scale corpus

LI Yan-hong (李艳红)1,2, ZHENG Jia-heng (郑家恒)1,2

  1. Department of Computer & Information Technology, Shanxi University, Taiyuan 030006, China
  2. Key Laboratory of Ministry of Education for Computation Intelligence and Chinese Information Processing, Taiyuan 030006, China
  • Received: 2008-04-21 Revised: 2008-07-11 Online: 2009-06-01 Published: 2009-06-01
  • Contact: LI Yan-hong

Abstract: A quantitative method for evaluating the usability of large-scale corpora is proposed. By analyzing the lifecycle of corpus engineering, an index system for evaluating the usability of large-scale corpora is constructed. The analytic hierarchy process combined with fuzzy comprehensive evaluation is then used to compute corpus usability quantitatively and to assign the corpus a usability level. The evaluation results are analyzed to identify the bottleneck factors that limit corpus usability, and targeted improvement measures are proposed. Finally, the method is illustrated with a case study on a corpus.

Key words: large-scale corpus, usability evaluation, analytic hierarchy process, fuzzy comprehensive evaluation
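
The combination of analytic hierarchy process weighting and fuzzy comprehensive evaluation named in the abstract can be sketched in a few lines of Python. This is only a generic illustration of the two techniques, not the authors' implementation: the three criteria, the pairwise judgments, the membership values, and the level names are all hypothetical placeholders, and the row geometric-mean method is assumed here as one common way to approximate the weights.

```python
# Illustrative sketch only: every number below is a hypothetical placeholder,
# not a value taken from the paper.


def ahp_weights(pairwise):
    """Approximate AHP priority weights using the row geometric-mean method."""
    n = len(pairwise)
    geo_means = []
    for row in pairwise:
        product = 1.0
        for value in row:
            product *= value
        geo_means.append(product ** (1.0 / n))
    total = sum(geo_means)
    return [g / total for g in geo_means]


def fuzzy_evaluate(weights, membership):
    """Weighted-average fuzzy composition B = W . R (one score per level)."""
    n_levels = len(membership[0])
    return [
        sum(w * row[j] for w, row in zip(weights, membership))
        for j in range(n_levels)
    ]


# Hypothetical pairwise comparison of three criteria (Saaty 1-9 scale).
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]

# Hypothetical membership of each criterion in each usability level.
levels = ["high", "medium", "low"]
membership = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.3, 0.6],
]

weights = ahp_weights(pairwise)
scores = fuzzy_evaluate(weights, membership)

# Maximum-membership principle: report the level with the largest score.
best_level = levels[scores.index(max(scores))]
print(weights, scores, best_level)
```

In this sketch the usability level is chosen by the maximum-membership principle. A fuller treatment would also check the consistency ratio of the pairwise comparison matrix before trusting the weights.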