Computer Engineering and Applications (计算机工程与应用) ›› 2017, Vol. 53 ›› Issue (24): 86-93. DOI: 10.3778/j.issn.1002-8331.1705-0213

• Big Data and Cloud Computing •

多因素自适应心跳检测算法研究

易  俗1,殷慧文1,王  闯2,张一川2   

  1. 辽宁大学 创新创业学院,沈阳 110036
  2. 东北大学 软件学院,沈阳 110819
  • 出版日期:2017-12-15 发布日期:2018-01-09

Research on a multi-factor adaptive heartbeat detection algorithm

YI Su1, YIN Huiwen1, WANG Chuang2, ZHANG Yichuan2   

  1. College of Innovation and Entrepreneurship, Liaoning University, Shenyang 110036, China
  2. College of Software, Northeastern University, Shenyang 110819, China
  • Online: 2017-12-15 Published: 2018-01-09

摘要: 分布式系统中心跳检测是节点故障检测机制的关键技术之一,心跳频率设定的合理性将影响到故障检测的准确性和完整性。针对大数据环境下,分布式系统产生故障受到网络、节点、作业多方面影响,为了提高心跳频率在多方面因素影响下的合理性设定,提出一种多因素心跳检测综合指标评价模型。在该模型下同时考虑网络负载情况和节点CPU工作状态及节点作业的大小对心跳检测过程的影响。在此基础上,提出了基于多因素评价模型的自适应心跳检测算法。该算法可以随网络环境、节点CPU占用率、作业任务大小自适应地改变心跳频率,综合各因素给出心跳频率设定的最优方案。最后通过实验验证了多因素对心跳频率自适应调整的影响。

关键词: 分布式系统, 心跳检测, 多因素, 心跳频率

Abstract: In distributed systems, heartbeat detection is one of the key techniques for detecting node failures, and how reasonably the heartbeat frequency is set affects the accuracy and completeness of failure detection. In big data environments, failures in a distributed system are influenced by the network, the nodes, and the jobs they run. To set the heartbeat frequency more reasonably under these combined influences, a comprehensive multi-factor index evaluation model for heartbeat detection is proposed. The model simultaneously considers the effect of network load, node CPU state, and job size on the heartbeat detection process. On this basis, an adaptive heartbeat detection algorithm based on the multi-factor evaluation model is proposed. The algorithm adaptively changes the heartbeat frequency according to the network environment, node CPU occupancy, and job size, and combines these factors to give an optimal heartbeat frequency setting. Finally, experiments verify the influence of the multiple factors on the adaptive adjustment of the heartbeat frequency.

Key words: distributed systems, heartbeat detection, multi-factor, heartbeat frequency
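
The abstract names three inputs (network load, node CPU occupancy, job size) but does not give the evaluation model itself. The following is only a minimal sketch of how such an adaptive scheme might look, not the authors' algorithm. It assumes that each factor is normalized to [0, 1], that the factors are combined by a weighted sum, and that a heavier combined load lengthens the heartbeat interval. The names `Factors`, `composite_score`, and `heartbeat_interval`, the weights, and the interval bounds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Factors:
    """Hypothetical normalized inputs, each expected to lie in [0, 1]."""
    network_load: float
    cpu_usage: float
    job_size: float


def composite_score(f: Factors, weights=(0.4, 0.3, 0.3)) -> float:
    """Combine the three factors into one score in [0, 1].

    A weighted sum is an assumption made for illustration; the weights
    are placeholders, not values taken from the paper.
    """
    values = (f.network_load, f.cpu_usage, f.job_size)
    for v in values:
        if not 0.0 <= v <= 1.0:
            raise ValueError("factors must be normalized to [0, 1]")
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


def heartbeat_interval(f: Factors, min_s: float = 1.0, max_s: float = 10.0,
                       weights=(0.4, 0.3, 0.3)) -> float:
    """Map the composite score linearly onto [min_s, max_s] seconds.

    Assumed direction: a higher score (busier system) gives a longer
    interval, i.e. a lower heartbeat frequency.
    """
    score = composite_score(f, weights)
    return min_s + score * (max_s - min_s)
```

With these placeholder settings, `heartbeat_interval(Factors(0.2, 0.5, 0.1))` gives an interval of about 3.3 seconds, while an idle node stays at the 1-second minimum. A real deployment would need the direction of adjustment, the weights, and the bounds to come from the paper's evaluation model or from measurement.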