1. Hadoop Past and Present:
1) Search engine: crawler + index server (index generation + retrieval)
2) Doug Cutting
3) Nutch
a. Distributed Storage
b. Distributed Computing
4) GFS paper: Doug Cutting wrote HDFS based on Google's GFS paper
2. Hadoop Overview
Hadoop Common: shared utilities, including network communication between the other modules
Hadoop HDFS
Hadoop MapReduce
Hadoop YARN
Hadoop 0.x and 1.x
After the release of Hadoop 2.x
Hadoop Overview: introduction to HDFS
NameNode: the master node; stores the metadata, including each file's block list and the DataNodes on which each block is located
DataNode: stores the data blocks and their checksums
SecondaryNameNode: runs in the background, monitoring HDFS state and periodically taking a snapshot of the metadata
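The division of labor between NameNode and DataNode can be sketched with a small in-memory model. This is only an illustration under stated assumptions: the class names, the put/get helpers, the round-robin placement, and the tiny block size are all made up for the example and are not Hadoop's API.

```python
import zlib

# Toy in-memory model; names and layout are illustrative, not Hadoop's API.
BLOCK_SIZE = 4  # bytes, tiny on purpose (the HDFS default in Hadoop 2.x is 128 MB)


class DataNode:
    """Stores block data together with a checksum of that data."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block_id -> (data, checksum)

    def write_block(self, block_id, data):
        self.blocks[block_id] = (data, zlib.crc32(data))

    def read_block(self, block_id):
        data, checksum = self.blocks[block_id]
        # Recompute the checksum on read to detect corrupted data.
        if zlib.crc32(data) != checksum:
            raise IOError(f"block {block_id} is corrupt on {self.name}")
        return data


class NameNode:
    """Holds metadata only: file -> block list, block -> DataNode names."""

    def __init__(self):
        self.file_to_blocks = {}
        self.block_to_datanodes = {}

    def add_block(self, path, block_id, datanode_names):
        self.file_to_blocks.setdefault(path, []).append(block_id)
        self.block_to_datanodes[block_id] = list(datanode_names)


def put(namenode, datanodes, path, content, replication=2):
    """Split content into blocks and place each block on `replication` nodes."""
    names = sorted(datanodes)
    for i in range(0, len(content), BLOCK_SIZE):
        index = i // BLOCK_SIZE
        block_id = f"{path}#blk{index}"
        # Round-robin placement, an assumption made for this sketch.
        targets = [names[(index + r) % len(names)] for r in range(replication)]
        for name in targets:
            datanodes[name].write_block(block_id, content[i:i + BLOCK_SIZE])
        namenode.add_block(path, block_id, targets)


def get(namenode, datanodes, path):
    """Ask the NameNode where the blocks are, then read them from DataNodes."""
    parts = []
    for block_id in namenode.file_to_blocks[path]:
        first_replica = namenode.block_to_datanodes[block_id][0]
        parts.append(datanodes[first_replica].read_block(block_id))
    return b"".join(parts)


datanodes = {name: DataNode(name) for name in ("dn1", "dn2", "dn3")}
namenode = NameNode()
put(namenode, datanodes, "/demo.txt", b"hello hdfs")
restored = get(namenode, datanodes, "/demo.txt")
```

Note that the NameNode object never touches file contents: it only answers "which blocks, on which nodes", while the DataNodes hold the bytes and verify them against the stored checksum on every read.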
1) Four Modules
2) HDFS (Hadoop Distributed File System): a distributed file system
a. File system: file management + block management
Stand-alone file systems
Windows: FAT16, FAT32, NTFS
Linux: ext2/3/4, VFS
b. Distributed file system
A file system that spans multiple servers
c. Three components
NameNode
- Metadata: file names, directory names, attributes
- The mapping between each file and its block list
- The mapping between each block and the list of DataNodes that hold it
DataNode
Data blocks and their checksums
SecondaryNameNode
Shares the NameNode's load: merges the edit log (edits) into the image file (fsimage) and returns the merged file to the NameNode
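The merge performed by the SecondaryNameNode can be sketched as replaying a log onto a snapshot. This is a minimal illustration, assuming a dict as the fsimage and a list of tuples as the edit log; the real on-disk formats are different, and the function names here are hypothetical.

```python
import copy

# Toy model: the fsimage is a dict of path -> block list, and the edit log
# is a list of (operation, path, blocks) tuples. Both formats are assumptions.


def apply_edits(fsimage, edits):
    """Return a new image with every logged edit replayed in order."""
    image = copy.deepcopy(fsimage)
    for op, path, blocks in edits:
        if op == "create":
            image[path] = list(blocks)
        elif op == "delete":
            image.pop(path, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return image


def checkpoint(namenode_state):
    """The SecondaryNameNode's job in miniature: merge, then hand back."""
    merged = apply_edits(namenode_state["fsimage"], namenode_state["edits"])
    # The NameNode adopts the merged image and continues with an empty log.
    return {"fsimage": merged, "edits": []}


state = {
    "fsimage": {"/a.txt": ["blk_1"]},
    "edits": [
        ("create", "/b.txt", ["blk_2", "blk_3"]),
        ("delete", "/a.txt", []),
    ],
}
new_state = checkpoint(state)
```

The point of the merge is that the edit log stays short: without it, the NameNode would have to replay an ever-growing log each time it restarts.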
Hadoop Overview: introduction to YARN and a case