Title: | A Job-Oriented Load-Distribution Scheme for Cost-Effective NameNode Service in HDFS |
Authors: | Lin, Chi-Yi;Liao, Jhih-Kai |
Contributor: | Department of Computer Science and Information Engineering, Tamkang University |
Keywords: | big data;cloud computing;Hadoop Distributed File System;HDFS;NameNode service;load distribution;fault tolerance;MapReduce;job-oriented scheme |
Date: | 2014-09-01 |
Uploaded: | 2014-10-20 15:48:35 (UTC+8) |
Publisher: | Olney: Inderscience Publishers |
Abstract: | Apache Hadoop has been widely used in big data processing and distributed computation. In the Hadoop ecosystem, data are stored and managed by the Hadoop Distributed File System (HDFS), in which the NameNode machine is a single point of failure. Although HDFS Federation and HDFS High Availability solve the problem, they do so at the significant cost of extra server hardware. Therefore, we aim to improve the availability of the NameNode service in a more cost-effective way. The primary innovation is the joint consideration of MapReduce jobs and the resulting HDFS operations. Specifically, we dynamically allocate a SubNameNode for each job in one of the existing TaskTrackers to provide the NameNode service. Since the load of the single NameNode is naturally distributed to the SubNameNodes, the failure rate of the NameNode machine can be reduced. Moreover, with SubNameNodes more local to the participating TaskTrackers, TaskTrackers can access the NameNode service more efficiently. |
Relation: | International Journal of Web and Grid Services 10(4), pp.319-337 |
DOI: | 10.1504/IJWGS.2014.064933 |
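Appears in Collections: | [Department and Graduate Institute of Computer Science and Information Engineering] Journal Articles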
|