Computer Engineering and Applications ›› 2014, Vol. 50 ›› Issue (10): 1-6.

• Doctoral Forum •

Research on data distribution policy for mass storage system

HUANG Qiulan, WU Jie, CHENG Yaodong, CHEN Gang   

  1. Computing Center, Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, China
  • Online: 2014-05-15 Published: 2014-05-14

Abstract: To address the scalability and flexibility problems of data distribution in mass storage systems, this paper proposes an efficient data distribution algorithm. Building on the idea of consistent hashing, the algorithm maps data onto physical storage nodes by bisection, which removes the need for each node to maintain a routing table as in the Chord algorithm and allows direct routing in O(1) time. The algorithm also applies a "differential approximation" idea to achieve a uniform data distribution. Experimental results show that the TTD algorithm exhibits data-distribution independence, and that the closer the number of physical nodes is to 2^N (N > 0), the more evenly the data is distributed. When the number of nodes is not close to such a value, virtual nodes can be introduced to ensure a uniform distribution. The algorithm improves the uniformity of data distribution in mass storage systems and effectively optimizes overall system performance.

Key words: mass storage system, consistent hashing, data distribution, Chord algorithm
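
The abstract's two central claims (direct O(1) routing without per-node routing tables, and more even placement when the node count approaches a power of two or when virtual nodes are used) can be illustrated with a minimal sketch. This is not the paper's TTD algorithm, whose details the abstract does not give. It assumes a hash space split by repeated bisection into 2^bits equal slots, with slots assigned to nodes round-robin; the names `BisectionRing`, `key_hash`, and `slot_counts`, and the choice of MD5, are all hypothetical.

```python
import hashlib
from collections import Counter


def key_hash(key: str) -> int:
    """Map a key to a 32-bit integer (MD5 is an arbitrary choice for this sketch)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class BisectionRing:
    """Hash space bisected into 2**bits equal slots; each slot is owned by one node."""

    def __init__(self, nodes, bits):
        if not nodes:
            raise ValueError("at least one node is required")
        if not 0 <= bits <= 32:
            raise ValueError("bits must be between 0 and 32")
        if (1 << bits) < len(nodes):
            raise ValueError("need at least one slot per node")
        self.bits = bits
        # Assumed round-robin assignment: when 2**bits is much larger than
        # len(nodes), each node owns many small slots (virtual nodes).
        self.slots = [nodes[i % len(nodes)] for i in range(1 << bits)]

    def lookup(self, key: str) -> str:
        # The top `bits` bits of the hash index the slot table directly,
        # so routing is a single array access rather than a multi-hop search.
        return self.slots[key_hash(key) >> (32 - self.bits)]


def slot_counts(ring: BisectionRing) -> Counter:
    """Count how many slots (equal shares of the hash space) each node owns."""
    return Counter(ring.slots)


if __name__ == "__main__":
    # 3 nodes on 4 slots: one node owns twice the hash space of the others.
    print(slot_counts(BisectionRing(["a", "b", "c"], bits=2)))
    # 4 nodes on 4 slots: a power-of-two node count divides the space evenly.
    print(slot_counts(BisectionRing(["a", "b", "c", "d"], bits=2)))
    # 3 nodes on 1024 slots: many virtual slots make the shares nearly equal.
    print(slot_counts(BisectionRing(["a", "b", "c"], bits=10)))
```

In this sketch the imbalance for three nodes falls from a 2:1 ratio with four slots to a one-slot difference with 1024 slots, which mirrors the role the abstract gives to virtual nodes. Round-robin assignment does not minimize data movement when nodes join or leave, so a real system would need a different slot-to-node mapping for that property.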