[A] Hadoop: HDFS introduction and basic concepts

  When a data set grows beyond the storage capacity of a single physical machine, it becomes necessary to partition the data and store it across several machines. File systems that manage storage across a network of machines are collectively known as distributed file systems.

  Because of its cross-machine nature, a distributed file system depends on network transmission and is inevitably more complex than an ordinary local file system. For example, making the file system tolerate node failures without losing any data is a very big challenge.


  This article is essentially a set of study notes on "Hadoop: The Definitive Guide".

(A) HDFS Introduction and basic concepts

  HDFS (Hadoop Distributed File System) is a central part of the Hadoop ecosystem: it is Hadoop's storage component. Its position in Hadoop is foundational, because it is responsible for data storage; computation models such as MapReduce depend on data stored in HDFS. HDFS is a distributed file system that stores very large files with a streaming data access pattern, splitting data into blocks stored across the machines of a commodity-hardware cluster.

  Several concepts here deserve closer attention:

  • Very large files. Current Hadoop clusters can store hundreds of terabytes or even petabytes of data.
  • Streaming data access. HDFS is built around a write-once, read-many-times access pattern; what matters most is the total time to read an entire data set, not the latency of any individual read.
  • Commodity hardware. An HDFS cluster does not require expensive, specialized equipment; ordinary commodity machines suffice. Precisely because of this, the chance of a node failing is fairly high, so HDFS must have mechanisms to handle such failures and keep data reliable.
  • No low-latency data access. HDFS is tuned for high data throughput, so it is not suitable for applications that need low-latency access to data.
  • Single writer, append-only. HDFS data is mostly read; a file supports only one writer at a time, and writes always append to the end of the file. Modifying a file at an arbitrary position is not supported.
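  To make the write-once/append-only model concrete, here is a minimal sketch using the Java FileSystem API. The cluster URI and path are hypothetical, and the sketch assumes the cluster supports append:

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendOnlyDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical cluster address.
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);
        Path p = new Path("/tmp/append-demo.log");

        // Initial write: creates the file (write once).
        try (FSDataOutputStream out = fs.create(p)) {
            out.writeBytes("first record\n");
        }

        // Later writes can only append to the end of the file;
        // there is no API to overwrite data at an arbitrary offset.
        try (FSDataOutputStream out = fs.append(p)) {
            out.writeBytes("second record\n");
        }
        fs.close();
    }
}
```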

1, HDFS blocks

  Every disk has a default block size, which is the smallest unit in which the file system reads and writes data. This touches on disk internals that I will not go into here; I plan to write a separate post later covering the relevant disk knowledge.

  HDFS also has the concept of a data block, with a default block size of 128 MB (HDFS blocks are this large mainly to minimize seek overhead). A file stored in HDFS is divided into multiple blocks, each of which is stored as an independent unit. Unlike on a local disk, however, a file in HDFS that is smaller than one block does not occupy a full block's worth of underlying storage.
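  As a quick illustration, the following sketch (the path is hypothetical) shows that a small file reports the configured block size but only occupies its actual length:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockSizeDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path p = new Path("/tmp/small-file.txt");
        FileStatus st = fs.getFileStatus(p);
        // Block size used for files under this path (128 MB out of the box).
        System.out.println("default block size: " + fs.getDefaultBlockSize(p));
        // Block size recorded for this particular file.
        System.out.println("file block size: " + st.getBlockSize());
        // Actual bytes occupied: a 1 KB file uses about 1 KB, not 128 MB.
        System.out.println("file length: " + st.getLen());
    }
}
```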

  Dividing storage into blocks gives HDFS many advantages:

  • A file can be larger than any single disk in the network, because its blocks can be stored on any of the disks in the cluster.
  • Making the block, rather than the whole file, the unit of storage simplifies storage management and allows file metadata to be managed separately.
  • Redundancy. Blocks fit naturally with replication, which improves data availability and fault tolerance. Each block can have multiple replicas (three by default), placed on machines that are separate from one another, so that a single point of failure does not cause data loss (see the sketch after this list).
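  Here is a minimal sketch, with a hypothetical path, that lists a file's blocks and the datanodes holding each replica through the public FileSystem API:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocationsDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FileStatus st = fs.getFileStatus(new Path("/data/big-file.dat"));
        System.out.println("replication factor: " + st.getReplication());
        // One BlockLocation per block of the file.
        BlockLocation[] blocks = fs.getFileBlockLocations(st, 0, st.getLen());
        for (BlockLocation b : blocks) {
            // Each block reports the hosts storing its replicas.
            System.out.println("offset " + b.getOffset()
                    + ", length " + b.getLength()
                    + ", hosts " + String.join(",", b.getHosts()));
        }
    }
}
```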

2, namenode and datanode

  The nodes of an HDFS cluster fall into two categories, namenodes and datanodes, which operate in a manager-worker pattern: one namenode and multiple datanodes. Understanding these two kinds of nodes is key to understanding how HDFS works.

  The namenode is the management node. It manages the file system namespace and maintains the file system tree together with the metadata for all files and directories in the tree. This information is stored persistently on the namenode's local disk in the form of two files: the namespace image file and the edit log file. The namenode also records, for each file, the datanodes on which each block is located, but it does not store block locations persistently, because they can be reconstructed from datanode reports when the system starts.

  Datanodes are the working nodes of the file system. They store and retrieve data blocks as needed, and they periodically send the list of blocks they store back to the namenode.
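  Listing a directory, for example, is a pure metadata operation answered entirely from the namenode's in-memory namespace; no datanode is contacted until block data is actually read. A small sketch (the path is hypothetical):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListingDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // Served by the namenode alone: names, sizes, replication factors.
        for (FileStatus st : fs.listStatus(new Path("/user"))) {
            System.out.println(st.getPath() + " " + st.getLen()
                    + " bytes, replication=" + st.getReplication());
        }
    }
}
```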

  The namenode, as the management node, is therefore critical: if the namenode goes down, all files are effectively lost, because the namenode is the only place where the metadata, the mapping between files and data blocks, is stored. All file information lives there, and the file system cannot be reconstructed after the namenode is destroyed. Namenode fault tolerance therefore deserves great attention.

  To make the namenode more resilient, Hadoop provides two mechanisms:

  • The first mechanism is to back up the files that make up the persistent state of the file system metadata, for example by having the namenode write its state both to the local disk and to a remotely mounted Network File System (NFS). These writes are synchronous and atomic (a configuration sketch follows this list).
  • The second mechanism is to run a secondary namenode, which keeps a copy of the merged namespace image that can be brought into service when the namenode fails. (A hot-standby namenode can also be used in place of the secondary namenode.)
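  A hedged sketch of the first mechanism: dfs.namenode.name.dir accepts a comma-separated list of directories, and the namenode writes its namespace image and edit log to every directory in the list. The local and NFS paths below are hypothetical, and in practice this is normally set in hdfs-site.xml rather than in code:

```java
import org.apache.hadoop.conf.Configuration;

public class NameDirConfig {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Two storage directories: local disk plus a (hypothetical) NFS mount.
        // The namenode keeps both copies in sync on every metadata update.
        conf.set("dfs.namenode.name.dir",
                 "/data/1/dfs/name,/mnt/nfs/dfs/name");
        System.out.println(conf.get("dfs.namenode.name.dir"));
    }
}
```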

3, Block caching

  Data blocks are normally read from disk, but for frequently accessed files the corresponding blocks can be explicitly cached in a datanode's memory, in an off-heap block cache. Computation frameworks such as MapReduce can then schedule tasks to run on the datanode where a block is cached, using the block cache to improve read performance.
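  HDFS exposes this through cache pools and cache directives (also available from the command line via hdfs cacheadmin). Below is a hedged sketch; the pool name and path are hypothetical, and fs.defaultFS is assumed to point at an HDFS cluster:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.CacheDirectiveInfo;
import org.apache.hadoop.hdfs.protocol.CachePoolInfo;

public class CacheDemo {
    public static void main(String[] args) throws Exception {
        // Cast assumes the default file system is HDFS.
        DistributedFileSystem dfs =
                (DistributedFileSystem) FileSystem.get(new Configuration());
        // A cache pool groups directives for administration.
        dfs.addCachePool(new CachePoolInfo("hot-data"));
        // Ask datanodes to keep this file's blocks in off-heap memory.
        long id = dfs.addCacheDirective(new CacheDirectiveInfo.Builder()
                .setPath(new Path("/data/lookup-table"))
                .setPool("hot-data")
                .setReplication((short) 1)  // cache one replica of each block
                .build());
        System.out.println("cache directive id: " + id);
    }
}
```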

4, HDFS federation

  The namenode keeps in memory the reference relationship between every file and every data block in the file system, which means that on a sufficiently large cluster the namenode's memory becomes the bottleneck limiting horizontal scaling. Hadoop 2.0 introduced HDFS federation, which lets the system scale by adding namenodes, with each namenode managing a portion of the file system namespace: for example, one namenode manages the files under /usr while another manages the files under /share.
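  A hedged configuration sketch of the /usr and /share example, using viewfs mount points to stitch two namenodes into a single client-side view (the host names are hypothetical, and this would normally live in core-site.xml):

```java
import org.apache.hadoop.conf.Configuration;

public class FederationConfig {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Each mount point is served by a different namenode:
        // /usr by namenode1, /share by namenode2.
        conf.set("fs.viewfs.mounttable.default.link./usr",
                 "hdfs://namenode1:8020/usr");
        conf.set("fs.viewfs.mounttable.default.link./share",
                 "hdfs://namenode2:8020/share");
        // Clients see one unified namespace through the viewfs scheme.
        conf.set("fs.defaultFS", "viewfs://default");
        System.out.println(conf.get("fs.defaultFS"));
    }
}
```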

5, HDFS high availability

  Backing up the namenode's files, or running a secondary namenode, guards against data loss, but it still does not make the system highly available. If the namenode suffers a single point of failure, a new namenode holding a copy of the file system metadata must be brought up quickly, and that process requires the following steps: (1) load a copy of the namespace image into memory; (2) replay the edit log; (3) receive enough block reports from the datanodes to reconstruct the mapping from blocks to locations.

  The procedure above is effectively a cold start of the namenode, and with a large enough amount of data a cold start can take 30 minutes or more, which is intolerable.

  Starting with Hadoop 2.0, HDFS adds support for high availability using a hot-standby model: a pair of namenodes in an active-standby configuration. When the active namenode fails, the standby namenode takes over its duties quickly and without any interruption of service, so users notice nothing.

  To support this active-standby hot backup, the HDFS architecture needs the following changes:

  • The namenodes must share the edit log through highly available shared storage.
  • Datanodes must send block reports to both namenodes, because the block mapping is kept in memory rather than on disk.
  • Clients must use a mechanism that handles namenode failover transparently (a client-side configuration sketch appears at the end of this section).
  • The standby namenode takes periodic checkpoints of the active namenode's namespace, subsuming the role of the secondary namenode.

  The transition from the active namenode to the standby is managed by an entity in the system called the failover controller. Each namenode also runs a lightweight failover controller process whose job is to monitor its namenode for failure (using a simple heartbeat mechanism) and to trigger a fast failover when one occurs.
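  Finally, a hedged sketch of the client-side configuration for an HA nameservice: the client addresses the logical name mycluster, and a failover proxy provider transparently picks whichever namenode is currently active. The host names and the nameservice id are hypothetical:

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class HaClientConfig {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("dfs.nameservices", "mycluster");
        // The two namenodes behind the logical nameservice.
        conf.set("dfs.ha.namenodes.mycluster", "nn1,nn2");
        conf.set("dfs.namenode.rpc-address.mycluster.nn1", "namenode1:8020");
        conf.set("dfs.namenode.rpc-address.mycluster.nn2", "namenode2:8020");
        // On failure, the client retries against the other namenode.
        conf.set("dfs.client.failover.proxy.provider.mycluster",
                 "org.apache.hadoop.hdfs.server.namenode.ha."
                 + "ConfiguredFailoverProxyProvider");
        // Clients never name a physical host, only the nameservice.
        FileSystem fs = FileSystem.get(URI.create("hdfs://mycluster"), conf);
        System.out.println(fs.getUri());
    }
}
```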
