Application of natural language processing in cyber crime analysis

LI Jing1, LUO Wenhua2, LIN Hongfei1   

  1. School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
  2. Department of Computer Criminal Detection, China Criminal Police University, Shenyang 110854, China
李 静1,罗文华2,林鸿飞1   


Abstract: With the rapid development of Internet technology, a growing amount of criminal activity takes place online. The resulting data offer case handlers many clues but also pose great challenges. This paper designs and implements a cybercrime analysis system that uses natural language processing to recognize entities such as network names (online screen names). Two methods are adopted according to file type: entity recognition based on structural analysis, and entity recognition that combines rules with statistics and is assisted by a user corpus. Applied in the cybercrime analysis system, the approach was found to help case handlers track down criminals.

Key words: natural language processing, named entity recognition, cybercrime case analysis

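Abstract (translated from the Chinese): With the rapid development of Internet technology, a large volume of information related to cybercrime cases exists on the Internet, which gives case handlers useful leads but also poses great challenges. A cybercrime case analysis system is designed and implemented that uses natural language processing to identify network names, URLs, and similar information in massive collections of case files and to build a relationship network among them. Depending on the file type, the system applies either structural analysis or a network-name recognition technique based mainly on a combination of rules and statistics, supplemented by a user-assisted knowledge base. Experiments show that applying the method in a cybercrime case analysis system helps case handlers solve cases quickly.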

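
As an illustration of the rule-based and user-assisted parts of such a pipeline, the following Python sketch extracts network names and URLs from chat-log style lines and counts co-occurrences to form a simple relationship network. It is not the system described in this paper: the regular expressions, the `speaker: message` line format, and the names `extract_entities`, `build_network`, and `known_names` are all assumptions made for the example, and the statistical component is omitted.

```python
import re
from collections import Counter
from itertools import combinations

# Assumed rule: a URL starts with http(s):// or www. and runs to the next whitespace.
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
# Assumed rule: in a chat-log line, the network name is the text before the first ": ".
SPEAKER_RE = re.compile(r"^\s*([^:\s][^:]*?)\s*:\s")


def extract_entities(line, known_names=()):
    """Return (network_names, urls) found in one line of text."""
    urls = URL_RE.findall(line)
    names = set()

    # Rule-based step: take the speaker prefix as a network name.
    match = SPEAKER_RE.match(line)
    if match:
        names.add(match.group(1))

    # User-assisted step: look up names supplied by the investigator.
    # Naive substring matching, used here only to keep the sketch short.
    body = URL_RE.sub(" ", line)
    for name in known_names:
        if name in body:
            names.add(name)

    return names, urls


def build_network(lines, known_names=()):
    """Count how often two entities (names or URLs) appear on the same line."""
    edges = Counter()
    for line in lines:
        names, urls = extract_entities(line, known_names)
        entities = sorted(names | set(urls))
        for a, b in combinations(entities, 2):
            edges[(a, b)] += 1
    return edges


if __name__ == "__main__":
    log = [
        "alice: see http://example.com/a",
        "bob: alice sent me http://example.com/a",
    ]
    for pair, count in build_network(log, known_names=("alice",)).items():
        print(pair, count)
```

Edge weights here are plain same-line co-occurrence counts; a real system would likely need a more careful definition of what links two entities.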