Computer Engineering and Applications ›› 2024, Vol. 60 ›› Issue (4): 113-121. DOI: 10.3778/j.issn.1002-8331.2209-0142

• Pattern Recognition and Artificial Intelligence •

Method for Generating Summary of Judgment Documents Based on Trial Logic Steps

YU Shuai (余帅), SONG Yumei (宋玉梅), QIN Yongbin (秦永彬), HUANG Ruizhang (黄瑞章), CHEN Yanping (陈艳平)

  1. State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
  2. College of Computer Science and Technology, Guizhou University, Guiyang 550025, China
  • Online: 2024-02-15 Published: 2024-02-15

Abstract: Judicial summarization of judgment documents is a key technology for improving the analysis of such documents. As the record of trial activities, a judgment document accurately presents the trial logic of a case. However, current summarization methods for judgment documents attend only to their sequential information, ignore their logical structure, and cannot effectively handle overly long texts and redundant information. This paper proposes a summary generation method for judgment documents based on trial logic steps, combining extraction with generation. In the extraction stage, a multi-label classification method extracts four sentence sets, “type, claim, fact, and result”, following the logical steps by which a people's court tries a case. In the generation stage, a fine-tuned T5-PEGASUS model produces the summary. In addition, the input text of the “fact” part is denoised with a maximum similarity matching algorithm based on internal knowledge, which further improves summary quality. Experimental results show that, compared with the mainstream pointer-generator network model, the proposed method improves the F1 scores of ROUGE-1, ROUGE-2, and ROUGE-L by 17.99, 21.24, and 21.86 percentage points, respectively, indicating that introducing logical structure improves performance on the judicial summarization task.

Key words: judgment document, trial logic steps, multi-label classification, internal knowledge, abstractive summary
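
The extraction stage described in the abstract can be illustrated with a small sketch. The abstract gives no details of the classifier or of the maximum similarity matching algorithm, and it does not define "internal knowledge", so everything below is an assumption made for illustration: the per-sentence labels are taken as already predicted, the internal knowledge is assumed to be reference sentences drawn from the same document (for example its "claim" and "result" sets), and the similarity measure is a stand-in (Jaccard overlap of character bigrams). The function names `group_by_label` and `denoise_facts` are hypothetical and are not from the paper.

```python
LABELS = ("type", "claim", "fact", "result")


def group_by_label(sentences, predicted_labels):
    """Collect sentences into one list per label.

    A sentence may carry several labels (multi-label classification),
    so it can appear in more than one list.
    """
    groups = {label: [] for label in LABELS}
    for sentence, labels in zip(sentences, predicted_labels):
        for label in labels:
            if label in groups:
                groups[label].append(sentence)
    return groups


def char_bigrams(text):
    """Set of character bigrams; works for unsegmented Chinese text too."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def jaccard(a, b):
    """Stand-in similarity (an assumption, not the paper's measure)."""
    set_a, set_b = char_bigrams(a), char_bigrams(b)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def denoise_facts(fact_sentences, reference_sentences, keep):
    """Keep the `keep` fact sentences with the highest maximum similarity
    to any reference sentence, preserving their original order."""
    scored = []
    for index, sentence in enumerate(fact_sentences):
        best = max((jaccard(sentence, ref) for ref in reference_sentences),
                   default=0.0)
        scored.append((best, index))
    # Highest score first; ties broken by earlier position in the document.
    top = sorted(scored, key=lambda item: (-item[0], item[1]))[:keep]
    kept_indices = sorted(index for _, index in top)
    return [fact_sentences[i] for i in kept_indices]
```

Under these assumptions, the four lists returned by `group_by_label`, with the "fact" list first passed through `denoise_facts`, would form the shortened input handed to the fine-tuned generation model.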
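
The reported gains are differences in ROUGE F1 scores expressed in percentage points, that is, absolute differences between two scores on a 0 to 100 scale. As a reminder of what is being measured, the following minimal sketch computes ROUGE-N F1 from clipped n-gram overlap and ROUGE-L F1 from the longest common subsequence, with precision and recall weighted equally. It assumes the inputs are already tokenized (for Chinese text, a list of characters is one common choice) and it is not the official ROUGE toolkit, whose preprocessing options may give slightly different numbers.

```python
from collections import Counter


def _f1(overlap, candidate_total, reference_total):
    """Harmonic mean of precision and recall; 0.0 when there is no overlap."""
    if overlap == 0 or candidate_total == 0 or reference_total == 0:
        return 0.0
    precision = overlap / candidate_total
    recall = overlap / reference_total
    return 2 * precision * recall / (precision + recall)


def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(candidate, reference, n):
    """ROUGE-N F1: n-gram overlap, each n-gram clipped to its reference count."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum((cand & ref).values())  # multiset intersection clips counts
    return _f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    previous = [0] * (len(b) + 1)
    for x in a:
        current = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    return _f1(lcs_length(candidate, reference), len(candidate), len(reference))
```

With these definitions, a system scoring 0.40 and a baseline scoring 0.25 on ROUGE-1 F1 would differ by 15 percentage points; these two values are made up for the arithmetic and are not results from the paper.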