计算机工程与应用 ›› 2022, Vol. 58 ›› Issue (22): 30-40.DOI: 10.3778/j.issn.1002-8331.2112-0066
孙鉴,武晓晓,谢开斌,高锦涛,刘凇佐,巫思敏
出版日期:
2022-11-15
发布日期:
2022-11-15
SUN Jian, WU Xiaoxiao, XIE Kaibin, GAO Jintao, LIU Songzuo, WU Simin
Online:
2022-11-15
Published:
2022-11-15
摘要: 随着闪存技术的不断成熟,基于闪存的固态硬盘(solid state drive,SSD)迅速发展。然而,SSD具有不同于磁盘的特性,使得传统基于磁盘设计的索引不适用于闪存环境,因此面向闪存索引机制的研究与优化迅速开展。通过对目前闪存索引的广泛调研,从索引更新策略的角度,分析了它们的优缺点,旨在为SSD算法设计和索引开发提供系统的、有价值的参考。最后讨论了该领域未来的发展趋势和新的研究方向。
孙鉴, 武晓晓, 谢开斌, 高锦涛, 刘凇佐, 巫思敏. 面向闪存的树型索引综述[J]. 计算机工程与应用, 2022, 58(22): 30-40.
SUN Jian, WU Xiaoxiao, XIE Kaibin, GAO Jintao, LIU Songzuo, WU Simin. Review of Tree Index Based on Flash Memory[J]. Computer Engineering and Applications, 2022, 58(22): 30-40.
[1] 孙翠锋,王祎晨.闪存技术发展趋势及在数据中心的应用[J].互联网天地,2020(12):57-60. SUN C F,WANG Y C.Development trend and application of flash memory technology in data center[J].Internet World,2020(12):57-60. [2] 宋俊辉.面向高级命令的闪存转换层算法研究[D].武汉:华中科技大学,2017. SONG J H.Research on flash translation layer algorithm for advanced commands[D].Wuhan:Huazhong University of Science and Technology,2017. [3] 李红艳.针对固态盘的I/O优化技术研究[D].武汉:华中科技大学,2016. LI H Y.Research on I/O optimization technology based on solid state disk[D].Wuhan:Huazhong University of Science and Technology,2016. [4] 徐海娟.保留时间感知的闪存转换层设计与实现[D].武汉:华中科技大学,2017. XU H J.Design and implementation of retention-time aware flash translation layer[D].Wuhan:Huazhong University of Science and Technology,2017. [5] 李楚.固态盘缓存系统基于闪存特性的优化研究[D].武汉:华中科技大学,2017. LI C.Exploring inherent features of flash memory for SSD-based cache optimization[D].Wuhan:Huazhong University of Science and Technology,2017. [6] 周斯忠,陈耀武.一种应用于闪存数据库的高效B+tree索引机制[J].计算机工程,2013,39(9):1-5. ZHOU S Z,CHEN Y W.An efficient B+tree index mechanism for flash-based database[J].Computer Engineering,2013,39(9):1-5. [7] 曾思望.基于固态盘的分布式块存储系统缓存技术研究[D].武汉:华中科技大学,2015. ZENG S W.The research of SSD-based cache techniques for distributed block storage system[D].Wuhan:Huazhong University of Science and Technology,2015. [8] 罗蜜.基于数据生存期的固态盘性能优化策略研究[D].武汉:华中科技大学,2017. LUO M.Research on performance optimization for solid-state drive based on data lifetime[D].Wuhan:Huazhong University of Science and Technology,2017. [9] LEE Y G,JUNG D,KANG D,et al.μ-FTL:a memory-efficient flash translation layer supporting multiple mapping granularities[C]//Proceedings of the 8th ACM International Conference on Embedded Software,2008:21-30. [10] 崔凯.混合结构闪存索引研究[D].合肥:中国科学技术大学,2010. CUI K.Research on hybrid index structure for flash disks[D].Hefei:University of Science and Technology of China,2010. [11] 林根.基于数据类型语义的闪存转换层的设计与实现[D].武汉:华中科技大学,2017. LIN G.The design and implementation of flash translation layer based on semantic of data type[D].Wuhan:Huazhong University of Science and Technology,2017. [12] 曾祥伟.基于闪存特性的固态盘缓存算法研究[D].广州:暨南大学,2020. ZENG X W.Research on solid state disk cache algorithm based on flash memory characteristics[D].Guangzhou:Jinan University,2020. [13] GALAKATOS A,MARKOVITCH M,BINNIG C,et al.Fiting-tree:a data-aware index structure[C]//2019 International Conference on Management of Data,2019:1189-1206. [14] LEVANDOSKI J J,LOMET D B,SENGUPTA S.The BW-tree:a B-tree for new hardware platforms[C]//2013 IEEE 29th International Conference on Data Engineering,2013:302-313. [15] ROH H,KIM S,LEE D,et al.AS B-tree:a study of an efficient B-tree for SSDs[J].Journal of Information Science and Engineering,2014,30(1):85-106. [16] WU C H,CHANG L P,KUO T W.An efficient B-tree layer for flash-memory storage systems[C]//9th International Conference on Real-Time and Embedded Computer Systems and Applications.Berlin,Heidelberg:Springer,2003:409-430. [17] ROH H,PARK S,KIM S,et al.B+-tree index optimization by exploiting internal parallelism of flash-based solid state drives[J].arXiv:1201.0227,2011. [18] ROH H,PARK S,SHIN M,et al.MPSearch:Multi-path search for tree-based indexes to exploit internal parallelism of flash SSDs[J].IEEE Data Engineering Bulletin,2014,37(2):3-11. [19] 梁新伟.基于NAND闪存的数据库索引机制研究与改进[D].沈阳:东北大学,2013. LIANG X W.Research and improvement of database index based on NAND flash[D].Shenyang:Northeastern University,2013. [20] JIN P,YANG C,JENSEN C S,et al.Read/write-optimized tree indexing for solid-state drives[J].The VLDB Journal,2016,25(5):695-717. [21] YIN S,PUCHERAL P,MENG X.PBfilter:indexing flash-resident data through partitioned summaries[C]//17th ACM Conference on Information and Knowledge Management,2008:1333-1334. [22] BYUN S,HUH M,HWANG H.An index rewriting scheme using compression for flash memory database systems[J].Journal of Information Science,2007,33(4):398-415. [23] XIE B,LI Q,WANG Y,et al.EPLA-DSTree:extending piecewise linear approximation on a dynamic segmentation tree index in sensor-cloud systems[J].IEEE Access,2020,9:123742-123753. [24] ATHANASSOULIS M,AILAMAKI A.BF-tree:approximate tree indexing[C]//40th International Conference on Very Large Databases,2014. [25] 杨程程.基于闪存的索引机制研究[D].合肥:中国科学技术大学,2017. YANG C C.Research on index mechanism for flash based devices[D].Hefei:University of Science and Technology of China,2017. [26] 郝行军.物联网大数据存储与管理技术研究[D].合肥:中国科学技术大学,2017. HAO X J.Reseach on technology of storage and management of IoT data[D].Hefei:University of Science and Technology of China,2017. [27] JIN R,CHO H J,CHUNG T S.LS-LRU:a lazy-split LRU buffer replacement policy for flash-based B+-tree index[J].Journal of Information Science and Engineering,2015,31(3):1113-1132. [28] LEE H S,LEE D H.An efficient index buffer management scheme for implementing a B-tree on NAND flash memory[J].Data & Knowledge Engineering,2010,69(9):901-916. [29] LEE H S,PARK S,SONG H J,et al.An efficient buffer management scheme for implementing a B-tree on NAND flash memory[C]//3rd International Conference on Embedded Software and Systems.Berlin,Heidelberg:Springer,2007:181-192. [30] XIANG X,YUE L,LIU Z,et al.A reliable B-tree implementation over flash memory[C]//2008 ACM Symposium on Applied Computing,2008:1487-1491. [31] 刘颖杰,林子雨,赖永炫.CF-HNLBI:一种新的闪存数据库B-树索引[J].厦门大学学报(自然科学版),2015,54(2):247-256. LIU Y J,LIN Z Y,LAI Y X.CF-HNLBI:a new B-tree indexing for flash-based databases[J].Journal of Xiamen University(Natural Science),2015,54(2):247-256. [32] FEVGAS A,AKRITIDIS L,BOZANIS P,et al.Indexing in flash storage devices:a survey on challenges,current approaches,and future trends[J].The VLDB Journal,2020,29(1):273-311. [33] ON S T,HU H,LI Y,et al.Lazy-update B+-tree for flash devices[C]//2009 10th International Conference on Mobile Data Management:Systems,Services and Middleware,2009:323-328. [34] ROH H,KIM W C,KIM S,et al.A B-tree index extension to enhance response time and the life cycle of flash memory[J].Information Sciences,2009,179(18):3136-3161. [35] 周大,梁智超,孟小峰.HF-Tree:一种闪存数据库的高更新性能索引结构[J].计算机研究与发展,2010,47(5):832-840. ZHOU D,LIANG Z C,MENG X F.HF-Tree:an update-efficient index for flash memory[J].Journal of Computer Research and Development,2010,47(5):832-840. [36] 付佳.基于LSM树的NoSQL数据库索引研究[D].北京:北京理工大学,2016. FU J.Research on NoSQL database index based on LSM tree[D].Beijing:Beijing Institute of Technology,2016. [37] 马文龙,朱妤晴,蒋德钧,等.Key-Value型NoSQL本地存储系统研究[J].计算机学报,2018,41(8):1722-1751. MA W L,ZHU Y Q,JIANG D Y,et al.A survey on local key-value store of NoSQL system[J].Chinese Journal of Computers,2018,41(8):1722-1751. [38] ZHOU X,SHOU L,CHEN K,et al.DPTree:differential indexing for persistent memory[J].Proceedings of the VLDB Endowment,2019,13(4):421-434. [39] LI Y,HE B,LUO Q,et al.Tree indexing on flash disks[C]//2009 IEEE 25th International Conference on Data Engineering,2009:1303-1306. [40] LI Y,HE B,YANG R J,et al.Tree indexing on solid state drives[J].Proceedings of the VLDB Endowment,2010,3(1/2):1195-1206. [41] 马伟.基于FD-tree的闪存数据库索引技术研究[D].杭州:浙江大学,2011. MA W.The FD-tree reexamined:a case for indexing flash database[D].Hangzhou:Zhejiang University,2011. [42] CHOI W G,SHIN M,LEE D,et al.Optimization of a multiversion index on SSDs to improve system performance[C]//2016 IEEE International Conference on Systems,Man,and Cybernetics,2016:1620-1625. [43] 吴桐.基于闪存的数据库索引技术研究[D].北京:北京邮电大学,2014. WU T.Flash-based database indexing technology research[D].Beijing:Beijing University of Posts and Telecommunications,2014. [44] 房俊华,王翰虎,陈梅,等.一种具有自适应机制的闪存数据库索引结构[J].计算机应用,2013,33(2):563-566. FANG J H,WANG H H,CHEN M,et al.Index structure with self-adaptive mechanism in flash-based database system[J].Journal of Computer Applications,2013,33(2):563-566. [45] NA G J,LEE S W,MOON B.Dynamic in-page logging for B+-tree index[J].IEEE Transactions on Knowledge and Data Engineering,2011,24(7):1231-1243. [46] NATH S,KANSAL A.FlashDB:dynamic self-tuning database for NAND flash[C]//6th International Conference on Information Processing in Sensor Networks,2007:410-419. [47] AGRAWAL D,GANESAN D,SITARAMAN R,et al.Lazy-adaptive tree:an optimized index structure for flash devices[J].Proceedings of the VLDB Endowment,2009,2(1):361-372. [48] 房俊华,王翰虎,陈梅,等.DB-tree:一种高性能的闪存数据库索引结构[J].计算机应用与软件,2013,30(11):243-246. FANG J H,WANG H H,CHEN M,et al.DB-tree:an efficient index structure for flash-based database[J].Computer Applications and Software,2013,30(11):243-246. [49] 叶锋.基于闪存的混合存储系统缓存算法研究[D].武汉:华中科技大学,2015. YE F.Research of cache replacement algorithms for flash-based hybrid storage system[D].Wuhan:Huazhong University of Science and Technology,2015. [50] TANG C,DONG Z,WANG M,et al.Learned indexes for dynamic workloads[J].arXiv:1902.00655,2019. [51] ZHU Z,MUN J H,RAMAN A,et al.Reducing Bloom Filter CPU Overhead in LSM-Trees on Modern Storage Devices[C]//17th International Workshop on Data Management on New Hardware,2021:1-10. [52] THONANGI R,BABU S,YANG J.A practical concurrent index for solid-state drives[C]//21st ACM International Conference on Information and Knowledge Management,2012:1332-1341. [53] LIU J,CHEN S,WANG L.LB+trees:optimizing persistent index performance on 3DXPoint memory[J].Proceedings of the VLDB Endowment,2020,13(7):1078-1090. [54] 杨国平,乔少杰,屈露露,等.学习型数据库索引推荐技术综述[J].重庆理工大学学报(自然科学),2022,36(6):189-199. YANG G P,QIAO S J,QU L L,et al.A survey of learning based database index advisor technology[J].Journal of Chongqing University of Technology(Natural Science),2022,36(6):189-199. [55] YAN Y,YAO S,WANG H,et al.Index selection for NoSQL database with deep reinforcement learning[J].Information Sciences,2021,561:20-30. [56] 陈国栋.固态硬盘I/O性能抖动优化算法研究[D].合肥:安徽大学,2019. CHEN G D.Research on I/O performance jitter optimization algorithm for solid state disk[D].Hefei:Anhui University,2019. [57] 杨伟健.固态盘存储系统的性能优化研究[D].厦门:厦门大学,2017. YANG W J.Research on performance optimization for SSD-based storage systems[D].Xiamen:Xiamen University,2017. [58] TOSHNIWAL A,TANEJA S,SHUKLA A,et al.Storm@ twitter[C]//2014 ACM SIGMOD International Conference on Management of Data,2014:147-156. [59] CARBONE P,KATSIFODIMOS A,EWEN S,et al.Apache Flink:stream and batch processing in a single engine[J].IEEE Data Engineering Bulletin,2015,38(1):28-38. [60] 何龙,陈晋川,杜小勇.一种面向HDFS的多层索引技术[J].软件学报,2017,28(3):502-513. HE L,CHEN J C,DU X Y.Multi-layered index for HDFS-based systems[J].Journal of Software,2017,28(3):502-513. [61] 李斌,郭景维,彭骞.面向大数据存储的HBase二级索引设计[J].计算技术与自动化,2019,38(2):124-129. LI B,GUO J W,PENG Q.Design of HBase secondary indexes for big data storage[J].Computing Technology and Automation,2019,38(2):124-129. [62] KONDYLAKIS H,DAYAN N,ZOUMPATIANOS K,et al.Coconut:sortable summarizations for scalable indexes over static and streaming data series[J].The VLDB Journal,2019,28(6):847-869. [63] WANG L,CAI R,FU T Z J,et al.Waterwheel:realtime indexing and temporal range query processing over massive data streams[C]//2018 IEEE 34th International Conference on Data Engineering,2018:269-280. [64] 祝磊.NAND固态盘缓存区管理及磨损均衡算法研究[D].长沙:湖南大学,2018. ZHU L.A study on buffer management and wear leveling algorithm for NAND solid state drives[D].Changsha:Hunan University,2018. [65] 孙杰贤.华数传媒:创新,引领与傲腾[J].中国信息化,2019(1):42-43. SUN J X.Washu Media:innovation,leading and Optane[J].China Informatization,2019(1):42-43. [66] 小平.用好Intel傲腾内存[J].电脑知识与技术,2019(2):81-82. XIAO P.Make good use of Intel Optane memory[J].Computer Knowledge and Technology,2019(2):81-82. |
[1] | 贾晓千,陈刚,李白冰. 边缘计算在视频侦查中的应用[J]. 计算机工程与应用, 2020, 56(17): 86-92. |
[2] | 王 聪1,2,徐 琪1,2,程耀东1,陈 刚1. 高能物理事例级数据管理与传输系统的研究[J]. 计算机工程与应用, 2018, 54(23): 230-237. |
[3] | 刘小龙,陈 岚,李 莹,肖 京. 基于REST的园区数据管理系统的研究[J]. 计算机工程与应用, 2017, 53(18): 208-212. |
[4] | 祝永志,续士强,禹继国. 基于OpenMP/MPI并行编程模型的N体问题的优化实现[J]. 计算机工程与应用, 2016, 52(5): 16-21. |
[5] | 罗桂娥,康 霞. 固态硬盘性能优化研究与实现[J]. 计算机工程与应用, 2015, 51(1): 43-48. |
[6] | 徐永士1,2,臧冬松1,2,孙功星1. 分布式文件元数据管理系统设计[J]. 计算机工程与应用, 2012, 48(7): 1-4. |
[7] | 刘正涛1,2,王建东1. Web数据空间技术研究[J]. 计算机工程与应用, 2012, 48(7): 12-19. |
[8] | 徐永士1,2,霍 菁1,2,孙功星1. 一种数据本地化存储与处理系统[J]. 计算机工程与应用, 2012, 48(5): 7-11. |
[9] | 黄华林1,2,钟 诚1. 多核机群上数据密集型应用并行程序性能优化[J]. 计算机工程与应用, 2012, 48(30): 73-77. |
[10] | 韦兴柳1,钟 诚1,李 智1,2,蔡德霞1,陈清媛1. 大数据文件和混合文件的多线程并行下载[J]. 计算机工程与应用, 2012, 48(14): 84-89. |
[11] | 韩 超1,2,那文武1,2,刘振军1,许 鲁1. BW-RAID系统的重构技术及优化研究[J]. 计算机工程与应用, 2012, 48(1): 60-63. |
[12] | 方水良,付 伟. PDM组件开发及其在多种应用软件中的集成[J]. 计算机工程与应用, 2011, 47(6): 58-61. |
[13] | 张泽宝,张健沛,杨 静. 改进聚类的索引建立方法研究[J]. 计算机工程与应用, 2010, 46(2): 106-108. |
[14] | 孙长乐,郭东明,高 航,邹灵浩. 智能数据批量处理控制模型 [J]. 计算机工程与应用, 2010, 46(13): 224-228. |
[15] | 邱 华,黄少珉,张 萌. 提高Nand Flash性能的方法[J]. 计算机工程与应用, 2009, 45(8): 84-86. |
阅读次数 | ||||||
全文 |
|
|||||
摘要 |
|
|||||