Tamkang University Institutional Repository (淡江大學機構典藏): Item 987654321/94394
    Please use this identifier to cite or link to this item: https://tkuir.lib.tku.edu.tw/dspace/handle/987654321/94394


    Title: HDFS分散式檔案系統容錯管理架構
    Other Titles: Fault-tolerant management framework for Hadoop Distributed File System
    Authors: 廖治凱;Liao, Jhih-Kai
    Contributors: 淡江大學資訊工程學系碩士班 (Master's Program, Department of Computer Science and Information Engineering, Tamkang University)
    林其誼;Lin, Chi-Yi
    Keywords: HDFS;Sub_NameNode;Centroid Point;Routing hops
    Date: 2013
    Issue Date: 2014-01-23 14:33:13 (UTC+8)
    Abstract: With the rapid development of the Internet, a large number of applications have shifted from running on a single machine to running on multiple machines over the network, which has also driven the development of cloud computing technologies, such as the Yahoo-backed Hadoop with its MapReduce implementation, and GFS and BigTable developed by Google. The Hadoop Distributed File System (HDFS) used by Hadoop adopts a master/slave configuration: a single NameNode manages the entire system, while multiple DataNodes store the system's data. Under this configuration a single node holds a large amount of critical metadata; if that node fails and its data files are corrupted, the whole system can no longer operate normally, i.e., a Single Point of Failure (SPOF) occurs, and an SPOF causes enormous losses for the entire system. Moreover, in the traditional master/slave HDFS all requests and responses must pass through the master node, so a large volume of traffic converges on the NameNode, slowing the network down and making round-trip data transfers time-consuming, which degrades overall system performance.
    Therefore, this study manages the system on a per-job basis: each job is dynamically assigned a Sub_NameNode responsible for managing that job, which relieves network congestion and speeds up communication between the master and the slaves, while distributing the metadata over different nodes also spreads the risk of data corruption. The nodes at which a Single Point of Failure can occur are thus effectively divided into two kinds, the NameNode and the Sub_NameNode, and for each kind of SPOF node an effective remedy is proposed to reduce its impact.
    Due to the rapid development of the modern Internet, the mode of operation of a large number of applications has changed from a single machine to a cluster of machines connected over the network. This trend has also contributed to the development of cloud computing technology: Google invented the MapReduce framework, the Google File System (GFS), and BigTable, and Yahoo invested in the open-source Hadoop project to implement the technologies proposed by Google. The Hadoop Distributed File System (HDFS) is based on the master/slave model to manage the entire file system. Specifically, a single NameNode acting as the master manages a large number of slaves called DataNodes. Since the NameNode is responsible for maintaining a large amount of important metadata, a NameNode crash can render the entire file system unusable. That is, the NameNode forms a Single Point of Failure (SPOF). In addition, in the master/slave model all requests and responses have to go through the master. It is obvious that, without load sharing, the NameNode also forms a performance bottleneck.
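    The bottleneck and SPOF described above can be illustrated with a minimal, hypothetical sketch in Java (the language HDFS itself is written in). The class and method names below are invented for illustration and do not correspond to the real org.apache.hadoop.hdfs API; the point is simply that every metadata lookup funnels through one in-memory namespace held by a single process.

        import java.util.HashMap;
        import java.util.List;
        import java.util.Map;

        // Hypothetical sketch, not the real org.apache.hadoop.hdfs API: a single
        // NameNode holds the entire file-to-block-location mapping in memory, so
        // every lookup funnels through it (bottleneck) and losing it makes the
        // whole file system unusable (single point of failure).
        final class SingleNameNode {
            // file path -> addresses of the DataNodes holding its blocks
            private final Map<String, List<String>> blockLocations = new HashMap<>();

            void registerFile(String path, List<String> dataNodeAddresses) {
                blockLocations.put(path, dataNodeAddresses);
            }

            // Every client read request must eventually be answered here.
            synchronized List<String> getBlockLocations(String path) {
                List<String> locations = blockLocations.get(path);
                if (locations == null) {
                    throw new IllegalStateException("no metadata for " + path);
                }
                return locations;
            }
        }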
    Therefore, in this research we propose to allocate a Sub_NameNode dynamically for each MapReduce job, in order to relieve network congestion and accelerate communication between the master and the slaves. Our approach also reduces the risk of data loss by replicating the metadata to the Sub_NameNodes. Once the NameNode fails, its state can be reconstructed from the Sub_NameNodes. The simulation results show significant reductions in both the number of communication hops and the communication time.
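    The following is a minimal sketch, under the assumptions stated in the comments, of the per-job Sub_NameNode idea summarized in the abstract: the metadata relevant to a job is replicated to a dynamically allocated Sub_NameNode, and the NameNode's namespace can be rebuilt from the Sub_NameNodes after a crash. All names are hypothetical and are not taken from the thesis implementation or the Hadoop codebase.

        import java.util.ArrayList;
        import java.util.HashMap;
        import java.util.List;
        import java.util.Map;

        // Hypothetical sketch of the per-job Sub_NameNode idea; names are invented
        // for illustration and do not come from the thesis or from Hadoop itself.
        final class SubNameNode {
            final String jobId;
            // per-job slice of the namespace: file path -> DataNode addresses
            final Map<String, List<String>> jobMetadata = new HashMap<>();

            SubNameNode(String jobId) {
                this.jobId = jobId;
            }

            // Lookups for this job's input files are served locally,
            // without another round trip to the central NameNode.
            List<String> getBlockLocations(String path) {
                return jobMetadata.get(path);
            }
        }

        final class NameNodeWithSubs {
            private final Map<String, List<String>> namespace = new HashMap<>();
            private final Map<String, SubNameNode> subNameNodes = new HashMap<>();

            // Dynamically allocate a Sub_NameNode for a job and copy to it the
            // metadata of the files that job will read.
            SubNameNode allocateForJob(String jobId, List<String> inputPaths) {
                SubNameNode sub = new SubNameNode(jobId);
                for (String path : inputPaths) {
                    List<String> locations = namespace.get(path);
                    if (locations != null) {
                        sub.jobMetadata.put(path, new ArrayList<>(locations));
                    }
                }
                subNameNodes.put(jobId, sub);
                return sub;
            }

            // If the central NameNode's metadata is lost, the replicated per-job
            // copies held by the Sub_NameNodes can be merged to rebuild it.
            void recoverFromSubNameNodes() {
                namespace.clear();
                for (SubNameNode sub : subNameNodes.values()) {
                    namespace.putAll(sub.jobMetadata);
                }
            }
        }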
    Appears in Collections:[Graduate Institute & Department of Computer Science and Information Engineering] Thesis

    Files in This Item:

    File          Size    Format
    index.html    0 Kb    HTML

    All items in 機構典藏 are protected by copyright, with all rights reserved.

