hadoop study notes (b): hdfs advantages and disadvantages

 

advantage

 

 

Wherein the 10k +, referring to each of the blocks must be> = 1M

 

Shortcoming

 

Low latency: means hadoop data processing units are in minutes, rather than as a storm is in units of milliseconds.

High Throughput: refers to the size of your file block distributed storage must be minimized is 1M, can not be small.

Small file access problem: If 200 million documents, although very large, but each file is very small, so each one will still consume NameNode memory, so this time is not conducive to NameNode, so when the files are very small when the treatment is not suitable for use hadoop.

The data is written: only supports write-once, late can not be changed, but you can append the last block, the data can be read many times.

 

Guess you like

Origin www.cnblogs.com/isme-zjh/p/11532856.html