Computer Engineering and Applications ›› 2013, Vol. 49 ›› Issue (19): 39-43.

• Theoretical Research, R&D Design •

Enhanced adaptive MapReduce scheduling algorithm in heterogeneous environment

YANG Lishen (杨立身), YU Liping (余丽萍)

  1. College of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, Henan 454000, China
  • Online: 2013-10-01 Published: 2015-04-20

Abstract: To address the shortcomings of Hadoop's default scheduling algorithm and of the LATE scheduling algorithm in heterogeneous environments, this paper proposes an enhanced adaptive MapReduce scheduling algorithm based on the SAMR scheduling algorithm. The algorithm records historical information for each node and uses the K-means clustering algorithm to dynamically adjust the stage progress values, so that it finds the slow tasks that genuinely need a backup copy. Experimental results show that the enhanced adaptive MapReduce scheduling algorithm is effective, to a certain extent, in reducing the estimation error of task execution time and in accurately identifying slow tasks.

Key words: MapReduce, speculative execution, heterogeneous environment, K-means algorithm
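
The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the two-stage weight layout, the `slow_cap` threshold, and the sample numbers are all assumptions made for illustration. The sketch clusters a node's recorded first-stage weights with a plain one-dimensional K-means, picks the cluster centre closest to the weight observed so far, and feeds that weight into a LATE-style remaining-time estimate (remaining = (1 - score) / (score / elapsed)).

```python
from statistics import mean


def kmeans_1d(values, k, max_iters=50):
    """Lloyd's k-means on scalars; returns the centroids in ascending order."""
    vs = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [vs[i * (len(vs) - 1) // max(k - 1, 1)] for i in range(k)]
    for _ in range(max_iters):
        clusters = [[] for _ in range(k)]
        for v in vs:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        updated = [mean(c) if c else centroids[j] for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def pick_first_stage_weight(history, observed, k=2):
    """Return the historical cluster centre closest to the observed weight.

    `history` is a non-empty list of first-stage weights recorded on one node
    (hypothetical data layout, assumed for this sketch).
    """
    centroids = kmeans_1d(history, min(k, len(history)))
    return min(centroids, key=lambda c: abs(c - observed))


def time_to_end(elapsed, stage, stage_progress, weights):
    """LATE-style estimate of remaining time from a weighted progress score."""
    score = sum(weights[:stage]) + weights[stage] * stage_progress
    if score <= 0:
        return float("inf")  # no measurable progress yet
    rate = score / elapsed  # progress per second so far
    return (1.0 - score) / rate


def slow_tasks(rates, slow_cap=0.2):
    """IDs of tasks whose progress rate is below (1 - slow_cap) * average rate.

    `slow_cap` is an assumed tuning parameter, not a value from the paper.
    """
    avg = mean(rates.values())
    return [tid for tid, r in rates.items() if r < (1.0 - slow_cap) * avg]
```

As a usage example under these assumptions, a history of `[0.58, 0.60, 0.62, 0.88, 0.90, 0.92]` with an observed weight of `0.85` selects a first-stage weight of about 0.9. A task that is halfway through its first stage after 90 s is then estimated to need about 110 s more, whereas a weight of 0.6 would give about 210 s. The choice of stage weight therefore directly affects which tasks look slow enough to back up.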