Two key points
- The HDFS block size is typically 128 MB (the default since Hadoop 2.x)
- Current deployments run two NameNodes (one active, one standby), which removes the NameNode as a single point of failure
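
To make the block size concrete, here is a minimal sketch of how a file is cut into blocks. The function name `split_into_blocks` is made up for illustration and is not part of any Hadoop API; the only fact taken from the notes is the 128 MB block size. The last block holds only the remaining bytes, so it is usually smaller than a full block.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB block size, as stated in the notes


def split_into_blocks(file_size: int, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return the size in bytes of every block needed for a file of file_size bytes."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    # Full blocks first, then one partial block if any bytes are left over.
    return [block_size] * full_blocks + ([remainder] if remainder else [])


MB = 1024 * 1024
sizes = split_into_blocks(300 * MB)
print(len(sizes), [s // MB for s in sizes])  # 3 [128, 128, 44]
```

A 300 MB file therefore occupies three blocks: two full 128 MB blocks and one 44 MB block.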
HDFS write process
- The client initiates a data write request to the NameNode
- The client writes the data to DataNodes block by block, and the DataNodes automatically replicate each block
- The DataNode reports to the NameNode that the storage is complete, and the NameNode informs the client
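
The three write steps can be sketched as a toy in-memory simulation. This is not the Hadoop client API: the classes `NameNode` and `DataNode`, the function `write_file`, the round-robin placement, and the tiny 4-byte block size are all assumptions made for illustration. A replication factor of 3 is used because it is the usual HDFS default.

```python
class DataNode:
    """Toy DataNode: keeps blocks in a dict (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block_id -> bytes

    def store(self, block_id, data, pipeline):
        # Store the block locally, then forward it to the next node in the
        # pipeline; this models the automatic replication in step 2.
        self.blocks[block_id] = data
        if pipeline:
            pipeline[0].store(block_id, data, pipeline[1:])


class NameNode:
    """Toy NameNode: only tracks which DataNodes hold which block."""

    def __init__(self, datanodes, replication=3):
        self.datanodes = datanodes
        self.replication = replication
        self.files = {}  # path -> list of (block_id, [datanode names])
        self._next = 0   # round-robin cursor (assumed placement policy)

    def allocate_block(self, path):
        # Step 1: the client asks where to write the next block.
        n = len(self.datanodes)
        count = min(self.replication, n)
        targets = [self.datanodes[(self._next + i) % n] for i in range(count)]
        block_id = f"{path}#blk{len(self.files.setdefault(path, []))}"
        self._next += 1
        return block_id, targets

    def report_stored(self, path, block_id, targets):
        # Step 3: storage is reported as complete and recorded in the metadata.
        self.files[path].append((block_id, [d.name for d in targets]))


def write_file(namenode, path, data, block_size):
    """Client side: write data block by block and return the final block layout."""
    namenode.files.setdefault(path, [])
    for offset in range(0, len(data), block_size):
        block_id, targets = namenode.allocate_block(path)
        # Step 2: send the block to the first DataNode; it forwards the copies.
        targets[0].store(block_id, data[offset:offset + block_size], targets[1:])
        # In this toy the client relays the completion report on the DataNodes' behalf.
        namenode.report_stored(path, block_id, targets)
    return namenode.files[path]


nodes = [DataNode(f"dn{i}") for i in range(4)]
nn = NameNode(nodes)
layout = write_file(nn, "/demo.txt", b"hello hdfs", block_size=4)
for block_id, replicas in layout:
    print(block_id, replicas)
```

The 10-byte payload is split into three blocks, and each block ends up on three of the four toy DataNodes.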
HDFS read process
- The client initiates a data read request to the NameNode
- The NameNode looks up which DataNodes hold the file's blocks and returns the ones closest to the client
- The client downloads the file block by block from those DataNodes
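
The read path can be sketched the same way. Again this is a toy model, not the Hadoop API: the dictionaries `FILE_BLOCKS`, `BLOCK_LOCATIONS`, `DATANODE_STORAGE`, and `DISTANCE`, and the functions `locate_blocks` and `read_file`, are hypothetical names, and the distance values are invented stand-ins for network proximity.

```python
# NameNode metadata (assumed): which blocks make up a file, and where replicas live.
FILE_BLOCKS = {"/demo.txt": ["blk0", "blk1", "blk2"]}
BLOCK_LOCATIONS = {
    "blk0": ["dn1", "dn2"],
    "blk1": ["dn1", "dn3"],
    "blk2": ["dn2", "dn3"],
}

# Data actually held by each toy DataNode.
DATANODE_STORAGE = {
    "dn1": {"blk0": b"hell", "blk1": b"o hd"},
    "dn2": {"blk0": b"hell", "blk2": b"fs"},
    "dn3": {"blk1": b"o hd", "blk2": b"fs"},
}

# Assumed distance from the client to each DataNode (smaller means closer).
DISTANCE = {"dn1": 2, "dn2": 0, "dn3": 4}


def locate_blocks(path):
    """NameNode side: for each block, return its replica holders, nearest first."""
    return [
        (block_id, sorted(BLOCK_LOCATIONS[block_id], key=DISTANCE.__getitem__))
        for block_id in FILE_BLOCKS[path]
    ]


def read_file(path):
    """Client side: fetch every block from its nearest DataNode and join them."""
    chunks = []
    for block_id, holders in locate_blocks(path):
        nearest = holders[0]
        chunks.append(DATANODE_STORAGE[nearest][block_id])
    return b"".join(chunks)


print(locate_blocks("/demo.txt"))
print(read_file("/demo.txt"))  # b'hello hdfs'
```

Note that the NameNode only hands out locations; the file contents travel directly between the DataNodes and the client.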