Computer Engineering and Applications (计算机工程与应用), 2020, Vol. 56, Issue (24): 274-278. DOI: 10.3778/j.issn.1002-8331.1910-0084

• Engineering and Applications •

Research on a Chinese Text Proofreading Method for Legal Documents (面向法律文书的中文文本校对方法研究)

LIU Mingjie (刘明洁), LIANG Yi (梁毅), AI Zhongliang (艾中良), JIA Gaofeng (贾高峰)

  1. School of Computer, Beijing University of Technology, Beijing 100124, China
  2. China Judicial Big Data Research Institute, Beijing 100043, China
  • Online: 2020-12-15  Published: 2020-12-15

Abstract:

Based on a study of the linguistic characteristics of writing errors in legal documents, text errors in legal documents are divided into two types: direct errors made in narrative statements and implicit errors made in written composition. A set of regular-expression matching rules and character and word recognition rules is constructed to identify erroneous characters and words. Drawing on the linguistic features of legal documents, this paper proposes a method that combines rules with probability statistics to proofread the text of legal documents. Experimental results show that the recall and accuracy of the method both reach 80%, indicating good application prospects.

Key words: legal document, text proofreading, regular expression matching, error correction model
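
The abstract describes combining hand-written rules with probability statistics but does not give the rules or the statistical model. The following is only a minimal sketch of that general idea, not the authors' method. Every name in it is hypothetical: the date rule `DATE_RE`, the character-bigram counter, the `min_count` threshold, and the two-sentence toy corpus are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical rule layer: a Chinese-format date (YYYY年M月D日) whose month
# or day is out of range is treated as a directly detectable error.
DATE_RE = re.compile(r"(\d{4})年(\d{1,2})月(\d{1,2})日")


def rule_errors(text):
    """Return date strings that match the pattern but have an implausible month or day."""
    bad = []
    for m in DATE_RE.finditer(text):
        month, day = int(m.group(2)), int(m.group(3))
        if not (1 <= month <= 12 and 1 <= day <= 31):
            bad.append(m.group(0))
    return bad


def train_bigrams(corpus):
    """Count adjacent character pairs over a list of reference sentences."""
    counts = Counter()
    for sent in corpus:
        counts.update(zip(sent, sent[1:]))
    return counts


def statistical_errors(text, counts, min_count=1):
    """Return character bigrams seen fewer than min_count times in the reference corpus."""
    return [a + b for a, b in zip(text, text[1:]) if counts[(a, b)] < min_count]


def proofread(text, counts):
    """Run both layers and report their findings separately."""
    return {
        "rule": rule_errors(text),
        "statistical": statistical_errors(text, counts),
    }


# Toy reference corpus (assumed, illustrative only).
corpus = ["被告人于2019年3月5日被逮捕", "原告向法院提起诉讼"]
counts = train_bigrams(corpus)

# "诉松" is a deliberate typo for "诉讼" (litigation).
report = proofread("原告向法院提起诉松", counts)
print(report)
```

On this toy input the rule layer finds nothing and the statistical layer flags only the unseen bigram containing the typo. A realistic system would need a large reference corpus, smoothing, and word segmentation, none of which are specified in the abstract.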