Computer Engineering and Applications ›› 2017, Vol. 53 ›› Issue (20): 249-253. DOI: 10.3778/j.issn.1002-8331.1708-0046

• Engineering and Applications •

Research and Implementation of a Solr-Based Judicial Big Data Retrieval Model

JIA He1, AI Zhongliang1,2, JIA Gaofeng2, LIU Zhonglin1,2, CHEN Boxiong2

  1. North China Institute of Computing Technology, Beijing 100083, China
  2. China Justice Big Data Institute Co., Ltd., Beijing 100083, China
  • Online: 2017-10-15 Published: 2017-10-31

Abstract: Judicial information elements are high-dimensional and tightly coupled. To address the retrieval-efficiency problems that this tight coupling among high-dimensional information elements causes in judicial data retrieval, a judicial big data retrieval model based on Solr is studied and implemented. The model uses a Solr supercluster as the index repository and an HBase cluster as the data store. By separating data from index and introducing a Redis cache, dynamic parameter adjustment, and dynamic cache release, it achieves efficient, reliable, and scalable retrieval of judicial big data.

Key words: judicial big data, information retrieval, Solr, HBase
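
The separation of data and index described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the index, the store, and the cache are in-memory stand-ins for Solr, HBase, and Redis, and every class and method name (`InMemoryIndex`, `InMemoryStore`, `RetrievalService`, `ingest`, `search`) is hypothetical. The sketch assumes the index returns only row keys, the full records live in the store, and a small LRU cache sits in front of both.

```python
from collections import OrderedDict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InMemoryIndex:
    """Hypothetical stand-in for the Solr index: term -> row keys."""
    postings: Dict[str, List[str]] = field(default_factory=dict)

    def add(self, row_key: str, text: str) -> None:
        # Index each distinct lowercase term of the text once per document.
        for term in set(text.lower().split()):
            self.postings.setdefault(term, []).append(row_key)

    def search(self, term: str) -> List[str]:
        return list(self.postings.get(term.lower(), []))


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for the HBase table: row key -> full record."""
    rows: Dict[str, dict] = field(default_factory=dict)

    def put(self, row_key: str, record: dict) -> None:
        self.rows[row_key] = record

    def get_many(self, row_keys: List[str]) -> List[dict]:
        return [self.rows[k] for k in row_keys if k in self.rows]


class RetrievalService:
    """Cache -> index -> store lookup path (an assumed design, for illustration)."""

    def __init__(self, index: InMemoryIndex, store: InMemoryStore,
                 cache_capacity: int = 128) -> None:
        self.index = index
        self.store = store
        self.cache: "OrderedDict[str, List[dict]]" = OrderedDict()  # stand-in for Redis
        self.cache_capacity = cache_capacity
        self.store_reads = 0  # counts batched reads against the store

    def ingest(self, row_key: str, record: dict, text: str) -> None:
        # Write the record to the store, index only its searchable text,
        # and drop cached results because they may now be stale.
        self.store.put(row_key, record)
        self.index.add(row_key, text)
        self.cache.clear()

    def search(self, term: str) -> List[dict]:
        key = term.lower()
        if key in self.cache:
            self.cache.move_to_end(key)  # mark as most recently used
            return self.cache[key]
        row_keys = self.index.search(key)  # the index yields row keys only
        self.store_reads += 1
        records = self.store.get_many(row_keys)  # full records come from the store
        self.cache[key] = records
        if len(self.cache) > self.cache_capacity:
            self.cache.popitem(last=False)  # evict the least recently used entry
        return records
```

In this sketch the index stays small because it holds only terms and row keys, while repeated queries are answered from the cache without touching the index or the store. Clearing the whole cache on every write is the simplest possible invalidation policy and is chosen here only for brevity.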