Computer Engineering and Applications (计算机工程与应用) ›› 2022, Vol. 58 ›› Issue (2): 153-160. DOI: 10.3778/j.issn.1002-8331.2007-0407

• Pattern Recognition and Artificial Intelligence •

结合网络表示学习和文本卷积网络的类案发现

梁鸿翔,张步烨,李炜卓,程茜雅   

  1. 中国航天科工集团第二研究院,北京 100854
  2. 东南大学 网络空间安全学院,南京 211189
  3. 东南大学 计算机科学与工程学院,南京 211189
  • 出版日期:2022-01-15 发布日期:2022-01-18

Combining Network Representation Learning and Text Convolutional Neural Network for Similar Case Discovery

LIANG Hongxiang, ZHANG Buye, LI Weizhuo, CHENG Xiya   

  1. The Second Academy of China Aerospace Science and Industry Corporation, Beijing 100854, China
  2. School of Cyber Science and Engineering, Southeast University, Nanjing 211189, China
  3. School of Computer Science and Engineering, Southeast University, Nanjing 211189, China
  • Online: 2022-01-15 Published: 2022-01-18

摘要: 作为“智慧法院”的核心应用之一,相似裁判文书的发现有助于解决司法过程中裁判尺度不统一、类案不同、量刑不规范等问题。目前,一部分方法侧重于从裁判文书中总结领域特征,并将这些特征融入到语言模型中来提升相似文书发现的效果。另一部分工作将其转化为分类任务,利用有监督学习模型来进行建模与预测。然而,已有的方法没有考虑将语言模型与分类模型各自的优势进行结合。为此,提出一种基于网络表示学习(network representation learning)和文本卷积网络(convolutional neural network for texts)的类案发现方法。方法分别从无监督学习与有监督学习的视角来建模裁判文书中的信息,并根据法律知识体系对原有模型的负采样方法(negative sampling)进行改进。最终,方法设计了一种较为合理的投票机制将两类模型的结果进行融合。实验结果表明,提出的联合方法较已有方法能在类案发现任务中取得更高的推送准确率。

关键词: 类案发现, 网络表示学习, 卷积神经网络, 投票机制

Abstract: As one of the core applications of the "smart court" initiative, the discovery of similar judgment documents helps address inconsistent judgment standards, divergent rulings on similar cases, and irregular sentencing in the judicial process. Some existing methods summarize domain features from judgment documents and incorporate them into language models to improve similar-document discovery. Other works cast the task as a classification problem and use supervised learning models for modeling and prediction. However, existing methods do not combine the respective strengths of language models and classification models. To fill this gap, a similar case discovery method based on network representation learning and a convolutional neural network for texts is proposed. The method models the information in judgment documents from both unsupervised and supervised learning perspectives, and improves the negative sampling strategy of the original models according to the legal knowledge system. Finally, a voting mechanism is designed to merge the outputs of the two kinds of models. Experimental results indicate that the proposed joint method achieves higher recommendation accuracy than existing methods on the similar case discovery task.

Key words: similar case discovery, network representation learning, convolutional neural network, voting mechanism
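
The abstract states that the outputs of the two models are merged by a voting mechanism but does not give the rule itself. As a rough illustration only, the sketch below fuses two candidate rankings with a Borda count, which is a stand-in chosen here and not the authors' mechanism. The function names, the score dictionaries, and the document identifiers are all hypothetical.

```python
from typing import Dict, List


def borda_points(scores: Dict[str, float]) -> Dict[str, int]:
    """Turn similarity scores into Borda points: best gets n-1, worst gets 0."""
    # Sort by descending score; break ties by document id for determinism.
    ranked = sorted(scores, key=lambda doc: (-scores[doc], doc))
    n = len(ranked)
    return {doc: n - 1 - i for i, doc in enumerate(ranked)}


def fuse_by_voting(
    embedding_scores: Dict[str, float],
    classifier_scores: Dict[str, float],
    top_k: int = 3,
) -> List[str]:
    """Merge two models' candidate scores by summing their Borda points.

    Assumption: embedding_scores come from an unsupervised representation
    model and classifier_scores from a supervised text classifier; only
    candidates scored by both models are considered.
    """
    candidates = set(embedding_scores) & set(classifier_scores)
    emb = borda_points({d: embedding_scores[d] for d in candidates})
    clf = borda_points({d: classifier_scores[d] for d in candidates})
    total = {d: emb[d] + clf[d] for d in candidates}
    return sorted(total, key=lambda d: (-total[d], d))[:top_k]


# Hypothetical scores for three candidate judgment documents.
embedding_scores = {"A": 0.9, "B": 0.8, "C": 0.1}
classifier_scores = {"A": 0.2, "B": 0.7, "C": 0.6}
print(fuse_by_voting(embedding_scores, classifier_scores, top_k=2))  # → ['B', 'A']
```

Rank-based voting is used in this sketch because the two models' scores need not share a scale; a weighted or score-based vote would be an equally plausible reading of the abstract.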