1. HDFS Architecture
1 HDFS assumptions
Streaming data access
Large data sets
Simple coherency model (write once, read many)
Moving computation is cheaper than moving data
Portability across multiple hardware and software platforms
2 Design goals of HDFS
Very large distributed file system
Runs on commodity hardware
Optimized for batch processing
Can be deployed on heterogeneous operating systems
A single namespace spans the whole cluster
Data consistency
Files are divided into blocks
Smart clients
Jobs are scheduled on the principle of "data locality": computation is moved to the nodes that hold the data
Clients do not cache file data
3 HDFS Architecture
1 HDFS Architecture - File
Files are divided into blocks (default size 64 MB); the block is the unit of storage. Each block has multiple replicas stored on different machines, and the replication factor can be specified when the file is created (default 3).
The NameNode is the master node. It stores file metadata: file names, the directory structure, file attributes (creation time, number of replicas, permissions), as well as each file's block list and the DataNodes on which each block resides.
DataNodes store file block data in their local file systems, along with checksums of the block data.
Files can be created, deleted, moved, or renamed; once a file has been created, written, and closed, its contents cannot be modified.
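The block-splitting rule above can be sketched with a few lines of arithmetic. This is an illustration of the sizing, not the HDFS API; the constants mirror the defaults stated in the text (64 MB blocks, replication factor 3).

```python
# Sketch: how a file maps onto HDFS blocks under the defaults
# described above. Purely illustrative arithmetic, not an API.

BLOCK_SIZE = 64 * 1024 * 1024   # default block size (64 MB)
REPLICATION = 3                  # default replication factor

def split_into_blocks(file_size: int, block_size: int = BLOCK_SIZE):
    """Return the sizes of the blocks a file of `file_size` bytes occupies."""
    full, rest = divmod(file_size, block_size)
    blocks = [block_size] * full
    if rest:
        blocks.append(rest)  # the last block may be smaller than block_size
    return blocks

# A 200 MB file becomes 3 full 64 MB blocks plus one 8 MB block;
# with 3 replicas, the cluster stores 4 * 3 = 12 block copies.
blocks = split_into_blocks(200 * 1024 * 1024)
```

Note that the last block occupies only as much disk as it needs; a 1 MB file does not consume a full 64 MB on disk.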
2 HDFS file permissions
Similar to Linux file permissions.
r: read; w: write; x: execute. The x permission is ignored for files; for directories it indicates whether their contents may be accessed.
If the Linux user zhangsan uses the hadoop command to create a file, the owner of that file in HDFS is zhangsan.
The purpose of HDFS permissions: to stop good people from doing the wrong thing, not to stop bad people from doing bad things. HDFS trusts the identity you present: you tell it who you are, and it believes you (no authentication is performed by default).
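The owner/group/other check described above can be sketched as follows. This is a simplified model with a `(mode, owner, group)` triple per file; HDFS's real permission checker also handles superusers and, in later versions, ACLs. The identity is taken on trust, mirroring the "you tell me who you are" model.

```python
# Sketch of an rwx permission check in the Linux/HDFS style.
# Simplified and illustrative; function and argument names are ours.

def may_access(mode: str, owner: str, group: str,
               user: str, user_groups: set, want: str) -> bool:
    """mode is a 9-char string like 'rwxr-x---'; want is 'r', 'w', or 'x'."""
    if user == owner:
        bits = mode[0:3]          # owner bits
    elif group in user_groups:
        bits = mode[3:6]          # group bits
    else:
        bits = mode[6:9]          # other bits
    return want in bits

# zhangsan owns the file, so he may write; lisi (same group) may only read.
assert may_access("rw-r-----", "zhangsan", "staff", "zhangsan", set(), "w")
```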
3 HDFS Architecture - Component Function
| NameNode | DataNode |
| --- | --- |
| Stores metadata | Stores file contents |
| Metadata is kept in memory | File contents are kept on disk |
| Maintains the mapping between files, blocks, and DataNodes | Maintains the mapping between block IDs and local files |
NameNode :
It is a central server and a single node (which simplifies the design and implementation of the system), responsible for managing the file system namespace and client access to files.
For file operations, the NameNode handles the file metadata, while DataNodes handle read and write requests for file content. Data streams carrying file content never pass through the NameNode; the client only asks it which DataNode to contact. Otherwise the NameNode would become a system bottleneck.
The NameNode controls which DataNodes store each replica, making block placement decisions based on a global view of the cluster. When a file is read, the NameNode tries to direct the client to the nearest replica to reduce bandwidth consumption and read latency.
The NameNode is solely responsible for the replication of data blocks. It periodically receives heartbeats and block reports (BlockReport) from each DataNode in the cluster. A heartbeat means the DataNode is working properly; a block report lists all data blocks on that DataNode.
DataNode :
On a DataNode, a data block is stored as two files on disk: one holds the data itself, and the other holds metadata, including the block length, the checksum of the block data, and a timestamp.
When a DataNode starts, it registers with the NameNode; once accepted, it periodically reports all of its block information to the NameNode (every hour by default).
Heartbeats are sent every 3 seconds, and each heartbeat response can carry commands from the NameNode to the DataNode, such as copying a block to another machine or deleting a block. If no heartbeat is received from a DataNode for more than 10 minutes, the node is considered unavailable.
Machines can safely join and leave the cluster while it is running.
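The heartbeat bookkeeping described above can be sketched as follows: the NameNode records the last heartbeat time per DataNode and treats a node as dead after 10 minutes of silence. The timings follow the text (3-second heartbeats, 10-minute timeout); the class itself is illustrative, not the real NameNode data structure.

```python
# Sketch of NameNode-side heartbeat tracking. Illustrative only.

HEARTBEAT_INTERVAL = 3          # seconds between heartbeats
DEAD_TIMEOUT = 10 * 60          # 10 minutes of silence => node is dead

class HeartbeatMonitor:
    def __init__(self):
        self.last_seen = {}     # DataNode id -> timestamp of last heartbeat

    def heartbeat(self, node: str, now: float):
        self.last_seen[node] = now

    def dead_nodes(self, now: float):
        """Return nodes that have been silent longer than DEAD_TIMEOUT."""
        return [n for n, t in self.last_seen.items()
                if now - t > DEAD_TIMEOUT]

mon = HeartbeatMonitor()
mon.heartbeat("dn1", now=0)
mon.heartbeat("dn2", now=500)
# At t=700, dn1 has been silent for 700 s (> 600) but dn2 only 200 s,
# so only dn1 is reported dead.
```

Once a node is declared dead, the NameNode re-replicates the blocks it held, which is why joining and leaving the cluster is safe at runtime.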
4 HDFS replica placement strategy
| Before Hadoop 0.17 | After Hadoop 0.17 |
| --- | --- |
| Replica 1: a different node on the same rack | Replica 1: the same node as the client |
| Replica 2: another node on the same rack | Replica 2: a node on a different rack |
| Replica 3: a node on a different rack | Replica 3: another node on the same rack as replica 2 |
| Other replicas: randomly chosen | Other replicas: randomly chosen |
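The post-0.17 column of the table can be sketched as a small placement function: replica 1 on the writer's node, replica 2 on a different rack, replica 3 on another node in replica 2's rack, and the rest at random. The `topology` dict and function name are our own; the real NameNode policy additionally weighs node load and free space.

```python
# Sketch of the post-0.17 replica placement policy. Illustrative only:
# topology maps rack name -> list of node names.

import random

def place_replicas(topology: dict, client_node: str, n: int):
    rack_of = {node: rack for rack, nodes in topology.items()
               for node in nodes}
    chosen = [client_node]                              # replica 1: client node
    remote_racks = [r for r in topology if r != rack_of[client_node]]
    second = random.choice(topology[random.choice(remote_racks)])
    chosen.append(second)                               # replica 2: other rack
    third_candidates = [x for x in topology[rack_of[second]]
                        if x not in chosen]
    chosen.append(random.choice(third_candidates))      # replica 3: same rack as 2
    rest = [x for x in rack_of if x not in chosen]
    chosen += random.sample(rest, max(0, n - 3))        # other replicas: random
    return chosen[:n]
```

This layout survives the loss of a whole rack (two replicas are on a different rack from the writer) while keeping two of the three replicas rack-local to each other, which saves inter-rack bandwidth during the write pipeline.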
5 HDFS data corruption handling
When a DataNode reads a block, it computes the block's checksum.
If the computed checksum differs from the value recorded when the block was created, the block is corrupt.
The client then reads the block from another DataNode.
The NameNode marks the block as corrupt and replicates it until the file's expected replication factor is restored.
A DataNode also verifies the checksums of its blocks periodically in the background (every three weeks by default).
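The read-path check above can be sketched with CRC32 as the checksum, which is what HDFS actually uses (stored per 512-byte chunk in the block's `.meta` file); the two helper functions here are our own simplification.

```python
# Sketch of write-time checksum creation and read-time verification.
# zlib.crc32 stands in for HDFS's per-chunk CRC32 checksums.

import zlib

def write_block(data: bytes):
    """At write time, store the block's data together with its checksum."""
    return data, zlib.crc32(data)

def read_block(data: bytes, stored_checksum: int) -> bytes:
    """At read time, recompute the checksum; raise if the block is corrupt."""
    if zlib.crc32(data) != stored_checksum:
        raise IOError("block is corrupt, read from another DataNode")
    return data

block, cksum = write_block(b"hdfs block payload")
assert read_block(block, cksum) == b"hdfs block payload"
```

A single flipped byte changes the CRC, so the corruption is caught on read and the client can fall back to another replica while the NameNode schedules re-replication.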
6 HDFS Architecture - Client & Secondary NameNode
| Client | Secondary NameNode |
| --- | --- |
| Splits files into blocks | Not a hot standby for the NameNode |
| Interacts with the NameNode to get file location information | Assists the NameNode and shares its workload |
| Interacts with DataNodes to read or write data | Periodically merges fsimage and fsedits and pushes the result to the NameNode |
| Manages and accesses HDFS | Can assist in NameNode recovery in an emergency |
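The fsimage/fsedits merge in the table can be sketched as replaying an edit log against the last namespace image to produce a new image, after which the NameNode can truncate its edit log. A dict and a list of tuples stand in for the real on-disk formats; the operation names are illustrative.

```python
# Sketch of a Secondary NameNode checkpoint: replay fsedits over
# fsimage to get a new fsimage. Data structures are illustrative.

def checkpoint(fsimage: dict, fsedits: list) -> dict:
    """Merge an edit log into a namespace image; return the new image."""
    image = dict(fsimage)               # do not mutate the old image
    for op, path, *args in fsedits:
        if op == "create":
            image[path] = args[0]       # e.g. the file's replication factor
        elif op == "delete":
            image.pop(path, None)
        elif op == "rename":
            image[args[0]] = image.pop(path)
    return image

old_image = {"/a": 3}
edits = [("create", "/b", 3), ("rename", "/a", "/a2"), ("delete", "/b")]
new_image = checkpoint(old_image, edits)
```

Doing this merge on the Secondary NameNode keeps the expensive replay off the NameNode's critical path, which is the "sharing its workload" role in the table; it is still not a failover replica, since it lags behind the live edit log.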