Research on Prediction of Crime Based on Self-Supervised Learning Language Model

doi:10.3778/j.issn.1002-8331.2108-0351

Abstract

Abstract: To address crime prediction, a subtask of legal judgment prediction, and to capture the contextual semantic information in case fact descriptions more efficiently, this paper proposes ALBT, a Chinese crime prediction model that combines ALBERT (A Lite BERT) with a convolutional neural network (TextCNN). The model first uses ALBERT to transform the fact description of a legal text into a vector representation and to extract its key features. The extracted features are then fed into the TextCNN model for classification, which yields the predicted crime for the fact description. On the dataset of the 2018 “China Law Research Cup” Judicial Artificial Intelligence Challenge, the model reaches an accuracy of 88.1%. The experimental results show that the model achieves better prediction performance in Chinese crime prediction.

Key words: ALBERT, TextCNN, feature extraction, text categorization, crime prediction
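
The two-stage pipeline described in the abstract (encoder output, then TextCNN classification) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: `fake_encoder` is a hypothetical random stand-in for ALBERT, and the hidden size, kernel widths, filter count, and number of classes are assumed values chosen only for illustration. The weights are random and untrained, so the output shows the data flow rather than a meaningful prediction.

```python
import math
import random


def fake_encoder(text, hidden_size, seed=0):
    """Hypothetical stand-in for ALBERT: one random vector per character."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(hidden_size)] for _ in text]


def conv_relu_max_pool(tokens, kernel, bias):
    """Slide one filter over the token vectors, apply ReLU, keep the max."""
    width = len(kernel)
    best = 0.0  # starting at 0 folds the ReLU into the running maximum
    for start in range(len(tokens) - width + 1):
        total = bias
        for offset in range(width):
            row = tokens[start + offset]
            total += sum(w * x for w, x in zip(kernel[offset], row))
        best = max(best, total)
    return best


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


class TextCNNHead:
    """TextCNN-style head: parallel filters, max pooling, linear layer."""

    def __init__(self, hidden_size, num_classes,
                 kernel_widths=(2, 3, 4), filters_per_width=2, seed=0):
        rng = random.Random(seed)

        def rand_matrix(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        # Each filter is a (width x hidden_size) weight matrix plus a bias.
        self.filters = [(rand_matrix(width, hidden_size), 0.0)
                        for width in kernel_widths
                        for _ in range(filters_per_width)]
        self.out_weights = rand_matrix(num_classes, len(self.filters))
        self.out_bias = [0.0] * num_classes

    def predict_proba(self, tokens):
        # One pooled feature per filter, then a linear layer and softmax.
        pooled = [conv_relu_max_pool(tokens, kernel, bias)
                  for kernel, bias in self.filters]
        logits = [sum(w * p for w, p in zip(row, pooled)) + b
                  for row, b in zip(self.out_weights, self.out_bias)]
        return softmax(logits)


# Usage: encode a fact description, then classify it into one of 3 classes.
tokens = fake_encoder("defendant took property from the victim", hidden_size=8)
head = TextCNNHead(hidden_size=8, num_classes=3)
probs = head.predict_proba(tokens)
predicted = max(range(len(probs)), key=probs.__getitem__)
print(predicted, probs)
```

In a real system the stand-in encoder would be replaced by a pretrained ALBERT model, and the head would be trained on labeled fact descriptions.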

TIAN Jiewen, YANG Liang, ZHANG Li, MAO Guoqing, LIN Hongfei. Research on Prediction of Crime Based on Self-Supervised Learning Language Model[J]. Computer Engineering and Applications, 2023, 59(3): 276-281.

