"Building and using big data clusters" background knowledge: Introduction to HDFS

Introduction to HDFS and its basic concepts

When the size of a data set exceeds the storage capacity of a single physical machine, the data must be partitioned and stored across several machines. A file system that manages storage across a network of machines is called a distributed file system (distributed filesystem).

Because it spans machines, a distributed file system depends on network transmission and is inevitably more complex than an ordinary local file system. For example, making the file system tolerate node failures without losing any data is a major challenge.

1. Basic concepts of HDFS

  HDFS (Hadoop Distributed File System) is a core part of the Hadoop ecosystem and serves as Hadoop's storage component. It is the most fundamental piece of the whole stack, because computational models such as MapReduce depend on data stored in HDFS. HDFS is a distributed file system that stores very large files with a streaming data-access pattern, splitting data into blocks and storing them on different machines in a cluster of commodity hardware.

  Several concepts deserve emphasis here:

  • Very large files. Today's Hadoop clusters can store hundreds of terabytes or even petabytes of data.
  • Streaming data access. The HDFS access pattern is write once, read many times, and it emphasizes the total time to read an entire data set rather than the latency of any single access.
  • Commodity hardware. HDFS cluster machines need not be expensive or specialized; ordinary, widely available hardware is enough. Precisely because of this, the probability of a node failing is fairly high, so there must be a mechanism to handle node failures and guarantee data reliability.
  • No support for low-latency data access. HDFS aims for high data throughput and is not suitable for applications that require low-latency access to data.
  • Single writer, append-only. HDFS data is mostly read; a file supports only a single writer, writes are always appended at the end of the file, and modification at arbitrary positions is not supported.
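
To make the write-once, append-only model concrete, here is a minimal sketch using Hadoop's Java FileSystem API. The namenode URI and file path are placeholder values for illustration:

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WriteOnceAppend {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumed cluster address; replace with your namenode's URI.
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);
        Path file = new Path("/data/events.log");

        // Write once: create() opens a new file for a single writer.
        try (FSDataOutputStream out = fs.create(file)) {
            out.writeBytes("first record\n");
        }

        // Later additions must go through append(); HDFS provides no
        // API for overwriting bytes at an arbitrary offset.
        try (FSDataOutputStream out = fs.append(file)) {
            out.writeBytes("second record\n");
        }
        fs.close();
    }
}
```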

1. Block (data block)

Every disk has a default data block size, which is the smallest unit a file system can read or write. HDFS has the same concept: the default size of a block is 128 MB (HDFS blocks are this large mainly to minimize addressing overhead). A file stored in HDFS is divided into multiple blocks, and each block is an independent unit of storage. Unlike on a local disk, a file smaller than one block in HDFS does not occupy a full block's worth of underlying storage.
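
As a small illustration, the block size can be overridden per client through the dfs.blocksize property and queried per file through FileStatus; the cluster URI and path below are assumed values:

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockSizeDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Per-client override of the block size for newly written files;
        // the cluster-wide default is set in hdfs-site.xml (dfs.blocksize).
        conf.set("dfs.blocksize", "268435456"); // 256 MB
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);

        // Inspect the block size recorded for an existing file.
        FileStatus status = fs.getFileStatus(new Path("/data/events.log"));
        System.out.println("block size: " + status.getBlockSize() + " bytes");
    }
}
```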

Storing files as blocks in HDFS has several benefits:

  • A file can be larger than the capacity of any single disk in the network, because its blocks can be stored on any of the disks in the cluster.
  • Using abstract blocks rather than whole files as the unit of storage simplifies storage management and allows file metadata to be managed separately.
  • Redundant backup. Blocks fit naturally with replication, which provides fault tolerance and improves availability. Each block can have multiple replicas (three by default) stored on separate machines, so that a single failure does not cause data loss. (A sketch of controlling replication follows this list.)
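
The sketch below, again with assumed cluster and path names, shows the two common ways a client can control replication: the dfs.replication property for files it creates, and FileSystem.setReplication() for existing files.

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Client-side default for files this client creates
        // (the cluster default is 3).
        conf.set("dfs.replication", "3");
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);

        // Raise the replication factor of an existing file to 5;
        // the namenode schedules the extra copies asynchronously.
        fs.setReplication(new Path("/data/events.log"), (short) 5);
    }
}
```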

2. namenode (name node)

An HDFS cluster has two types of nodes, namenode and datanode, operating in a manager-worker pattern: one namenode and multiple datanodes. Understanding these two node types is essential to understanding how HDFS works.

The NameNode is the master server of an HDFS cluster, usually called the name node or master node. If the NameNode goes down, the Hadoop cluster becomes inaccessible. The NameNode manages and stores metadata: it maintains the file system namespace, regulates client access to files, and records any change to the namespace or its properties. It is responsible for managing the data of the entire cluster; for example, the number of replicas can be set in the configuration file, and this information is kept by the NameNode.

As the management node, the namenode owns the namespace of the entire file system and maintains the file system tree and all the files and directories in it. This information is stored durably on the namenode's local disk as two files: the namespace image file and the edit log file. The namenode also records which datanodes hold the blocks of each file, but it does not persist block locations, because that information can be reconstructed from the datanodes when the system starts.

Clearly, the namenode occupies a critical position. If the namenode is destroyed, all files are effectively lost: it is the only node that stores the metadata and the mapping between files and blocks, and without it the files cannot be reconstructed. The namenode's fault tolerance therefore deserves serious attention.

To make the namenode more reliable, Hadoop provides two mechanisms:

  • The first mechanism is to back up the files that make up the persistent state of the file system metadata: for example, while the file system information is written to the local disk, it is also written to a remotely mounted network file system (NFS). These writes are synchronous and atomic. (A configuration sketch follows this list.)
  • The second mechanism is to run a secondary namenode, which keeps a copy of the namespace image and can be brought into service after a namenode failure. (A hot-standby namenode can also be used in place of a secondary namenode.)
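
A hedged sketch of the first mechanism: the dfs.namenode.name.dir property accepts a comma-separated list of directories, each of which receives a full copy of the metadata. In practice this is set in hdfs-site.xml rather than in code, and the NFS mount point below is an assumption:

```java
import org.apache.hadoop.conf.Configuration;

public class NameDirConfig {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Each listed directory receives a full copy of the namespace
        // image and edit log; the second entry is assumed to be an
        // NFS mount, giving an off-machine copy of the metadata.
        conf.set("dfs.namenode.name.dir",
                 "/data/hdfs/name,/mnt/nfs/hdfs/name");
        System.out.println(conf.get("dfs.namenode.name.dir"));
    }
}
```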

3. datanode (data node)

A DataNode is a slave server in the HDFS cluster, usually called a data node. Files are split into blocks, and these blocks are physically stored on the DataNodes, so DataNode machines need to be provisioned with plenty of disk space. A DataNode stays in constant communication with the NameNode; under the direction of clients or the NameNode, it stores and retrieves blocks, creates and deletes blocks, and periodically reports the list of blocks it stores to the NameNode. Whenever a DataNode starts, it sends the NameNode the list of blocks it is responsible for holding.

In short, datanodes are the file system's worker nodes: they store and retrieve blocks on demand and periodically send the namenode a list of the blocks they are storing.
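
Here is a short sketch of how a client can ask the namenode which datanodes hold a file's blocks (information the namenode assembled from datanode block reports); the cluster URI and path are assumptions:

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocationsDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(
                URI.create("hdfs://namenode:8020"), new Configuration());
        FileStatus status = fs.getFileStatus(new Path("/data/events.log"));

        // The namenode answers from the block reports it has received,
        // listing the datanodes that hold each block of the file.
        BlockLocation[] blocks =
                fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation b : blocks) {
            System.out.printf("offset=%d len=%d hosts=%s%n",
                    b.getOffset(), b.getLength(),
                    String.join(",", b.getHosts()));
        }
    }
}
```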

4. Rack (rack)

A rack is the physical enclosure housing the servers of a Hadoop cluster; nodes in different racks communicate through switches. HDFS uses a rack-awareness strategy that lets the NameNode determine the rack ID of each DataNode, and applies a replica-placement policy accordingly to improve data reliability, availability, and utilization of network bandwidth.
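
Rack awareness is typically enabled by pointing the cluster at a topology script that maps datanode addresses to rack paths. A minimal sketch, assuming a script at /etc/hadoop/rack-topology.sh; in practice the property lives in core-site.xml rather than in code:

```java
import org.apache.hadoop.conf.Configuration;

public class RackAwarenessConfig {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // The script receives datanode IPs/hostnames as arguments and
        // prints a rack path such as /rack1 for each; the namenode uses
        // these paths when placing replicas.
        conf.set("net.topology.script.file.name",
                 "/etc/hadoop/rack-topology.sh");
        System.out.println(conf.get("net.topology.script.file.name"));
    }
}
```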

5. Metadata

Metadata falls into three categories. The first maintains information about the files and directories in the HDFS file system, such as file name, directory name, parent directory, file size, creation time, and modification time. The second records how file contents are stored: block status, the number of replicas, and which DataNode holds each replica. The third records information about all the DataNodes in HDFS, for DataNode management.
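
The first category of metadata is exactly what a client sees through the FileStatus API; a small sketch with an assumed directory path:

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class MetadataDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(
                URI.create("hdfs://namenode:8020"), new Configuration());

        // File and directory metadata served by the namenode:
        // names, sizes, replication factors, modification times.
        for (FileStatus s : fs.listStatus(new Path("/data"))) {
            System.out.printf("%s size=%d replicas=%d mtime=%d%n",
                    s.getPath().getName(), s.getLen(),
                    s.getReplication(), s.getModificationTime());
        }
    }
}
```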

6. Block cache

Data normally lives on disk, but for frequently accessed files the relevant blocks can be explicitly cached in datanode memory as an off-heap block cache. Computational tasks (such as MapReduce jobs) can then be scheduled on the datanodes that have cached the data, exploiting the block cache to improve read performance.
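
Block caching is driven by cache pools and cache directives registered with the namenode. A sketch using the DistributedFileSystem API, with the pool name and path invented for illustration (creating pools may require administrator privileges):

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.CacheDirectiveInfo;
import org.apache.hadoop.hdfs.protocol.CachePoolInfo;

public class BlockCacheDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(
                URI.create("hdfs://namenode:8020"), new Configuration());
        DistributedFileSystem dfs = (DistributedFileSystem) fs;

        // Create a cache pool, then ask the namenode to pin the file's
        // blocks into datanode off-heap memory.
        dfs.addCachePool(new CachePoolInfo("hot-data"));
        dfs.addCacheDirective(new CacheDirectiveInfo.Builder()
                .setPath(new Path("/data/lookup-table"))
                .setPool("hot-data")
                .build());
    }
}
```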

7. Federated HDFS

The namenode keeps a reference to every file and every block of the file system in memory, which means that with enough files the namenode's memory becomes a bottleneck limiting horizontal scaling. Hadoop 2.0 introduced federated HDFS, which lets the system scale by adding namenodes. Each namenode manages part of the file system namespace; for example, one namenode manages the files under /usr while another manages the files under /share.
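
With federation, clients typically see a unified namespace through a client-side mount table (ViewFileSystem). A sketch of the /usr and /share split from the example above; the nameservice and host names are assumptions, and these properties normally live in core-site.xml:

```java
import org.apache.hadoop.conf.Configuration;

public class FederationViewFs {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Each mount-table link maps part of the namespace to the
        // namenode responsible for it.
        conf.set("fs.defaultFS", "viewfs://clusterX");
        conf.set("fs.viewfs.mounttable.clusterX.link./usr",
                 "hdfs://namenode1:8020/usr");
        conf.set("fs.viewfs.mounttable.clusterX.link./share",
                 "hdfs://namenode2:8020/share");
    }
}
```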

8. High Availability of HDFS

Backing up the namenode's file system information, or running a secondary namenode, prevents data loss, but it does not by itself make the system highly available. Because the namenode is a single point of failure, recovery requires quickly starting a new namenode from a copy of the file system information, a process with the following steps: (1) load the namespace image into memory; (2) replay the edit log; (3) receive enough block reports from datanodes to rebuild the mapping from blocks to locations.

This is in effect a namenode cold start, and with a large enough data volume such a cold start can take 30 minutes or more, which is unacceptable.

Starting with Hadoop 2.0, high availability is supported through an active-standby hot-backup arrangement: a pair of namenodes runs simultaneously, and when the active namenode fails, the standby namenode takes over its duties quickly enough that users notice no interruption.

To implement this hot backup, the HDFS architecture needs the following changes:

  • The edit log must be shared between the two namenodes through highly available shared storage.
  • Datanodes must send block reports to both namenodes, since the block mapping is held in a namenode's memory rather than on disk.
  • Clients must use a mechanism that handles namenode failover transparently (see the configuration sketch after this list).
  • The standby namenode must take periodic checkpoints of the active namenode's namespace.
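
The client-side mechanism in the third point is usually configured as a logical nameservice backed by a failover proxy provider. A minimal sketch with assumed host names; these settings normally live in hdfs-site.xml:

```java
import org.apache.hadoop.conf.Configuration;

public class HaClientConfig {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // A logical nameservice hides the active/standby pair.
        conf.set("dfs.nameservices", "mycluster");
        conf.set("dfs.ha.namenodes.mycluster", "nn1,nn2");
        conf.set("dfs.namenode.rpc-address.mycluster.nn1", "namenode1:8020");
        conf.set("dfs.namenode.rpc-address.mycluster.nn2", "namenode2:8020");
        // The proxy provider retries against the other namenode on failover.
        conf.set("dfs.client.failover.proxy.provider.mycluster",
                 "org.apache.hadoop.hdfs.server.namenode.ha."
                 + "ConfiguredFailoverProxyProvider");
        // Clients address the cluster by its logical name.
        conf.set("fs.defaultFS", "hdfs://mycluster");
    }
}
```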

A failover controller in the HDFS system manages the transition from the active namenode to the standby namenode. In addition, each namenode runs a lightweight failover controller process whose main job is to monitor whether its namenode has failed and to trigger a fast switchover when it does.

Data source:

https://www.cnblogs.com/gzshan/p/10981007.html

