Computer Engineering and Applications ›› 2016, Vol. 52 ›› Issue (20): 14-19.


Research on parallel crypt inverted index

SHU Xiaowei, YANG Geng, NA Haiyang   

  1. School of Computer Science & Technology, School of Software, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
  • Online:2016-10-15 Published:2016-10-14

并行密文倒排索引研究

束晓伟,杨  庚,那海洋   

  1. 南京邮电大学 计算机学院、软件学院,南京 210003

Abstract: Encrypting data protects users' privacy, especially in open systems, but it raises the problem of how to query the encrypted data. To address the performance shortcomings of the existing SSE-1 scheme, this paper applies different encryption strategies to design a crypt inverted index (Crypt-Lucene) based on Lucene. In addition, a scheme for building Crypt-Lucene in parallel with MapReduce is proposed. The performance of the scheme is analyzed theoretically, and experiments are conducted to demonstrate its efficiency. The experimental results show that building an index with Crypt-Lucene takes about 60% less time than with SSE-1, while also achieving good space performance. For large document collections, building 8 Crypt-Lucene indexes in parallel with MapReduce on a four-node Hadoop cluster reduces the build time by 83.4%.

Key words: searchable encryption, crypt inverted index, Lucene, MapReduce, parallel index
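
The abstract does not spell out how Crypt-Lucene encrypts its index or how the MapReduce job is organized. The Python sketch below is therefore only an illustration of the general idea of a keyword-searchable inverted index built in map and reduce steps, not the authors' construction. The names `keyword_token`, `map_phase`, `reduce_phase`, `build_index`, and `search` are hypothetical, and the use of HMAC-SHA256 to hide index terms is an assumption made for this example.

```python
import hashlib
import hmac
from collections import defaultdict


def keyword_token(key: bytes, word: str) -> str:
    """Keyed, deterministic token standing in for an encrypted index term (assumption)."""
    return hmac.new(key, word.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def map_phase(key: bytes, doc_id: int, text: str) -> list:
    """Emit one (token, doc_id) pair per distinct word in a document."""
    return [(keyword_token(key, w), doc_id) for w in sorted(set(text.lower().split()))]


def reduce_phase(pairs: list) -> dict:
    """Group document ids by token to form the posting lists."""
    postings = defaultdict(list)
    for token, doc_id in pairs:
        postings[token].append(doc_id)
    return {token: sorted(ids) for token, ids in postings.items()}


def build_index(key: bytes, docs: dict) -> dict:
    """Run the map step over every document, then a single reduce step."""
    pairs = []
    for doc_id, text in docs.items():
        pairs.extend(map_phase(key, doc_id, text))
    return reduce_phase(pairs)


def search(key: bytes, index: dict, word: str) -> list:
    """Look up a word by recomputing its token with the same secret key."""
    return index.get(keyword_token(key, word), [])
```

In this sketch only the index terms are hidden and the document ids in each posting list remain in the clear; a real searchable-encryption scheme would protect the posting lists as well. Because each `map_phase` call depends on a single document, the calls could be distributed across worker nodes, which is the property a MapReduce build relies on.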

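Abstract (translated from the Chinese): Encrypting data is one way to protect user privacy, and the need is especially pressing for data processing in open systems, but it requires solving the problem of searching over ciphertext. To address some performance shortcomings of the SSE-1 ciphertext search scheme, different encryption strategies are applied on top of the Lucene inverted index to design a crypt inverted index, Crypt-Lucene. Taking the characteristics of cloud computing into account, a scheme for building Crypt-Lucene in parallel is also designed. The performance of the scheme is analyzed theoretically, and experiments confirm the effectiveness of the method. The experimental results show that, compared with SSE-1, Crypt-Lucene reduces index construction time by about 60% while offering good space performance. For large document collections, building 8 Crypt-Lucene indexes in parallel with MapReduce on a Hadoop cluster of 4 nodes reduces the time by 83.4%.

Key words (translated from the Chinese): searchable encryption, crypt inverted index, Lucene, MapReduce, parallel index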