《Hadoop权威指南(第四版英文版)》—— HDFS学习笔记

When a dataset outgrows(过大而不适用于) the storage capacity of a single physical machine, it becomes necessary to partition(分割分布) it across a number of separate machines(大量独立的机器).

Filesystems that manage the storage across a network of machines are called distributed filesystems.

Since they are network based(由于基于网路), all the complications of network programming kick in, thus making distributed filesystems more complex than regular disk filesystems.

For example, one of the biggest challenges is making the filesystem tolerate node failure without suffering data loss(能够容忍节点故障而不丢失数据).
Hadoop comes with(自带) a distributed filesystem called HDFS, which stands for Hadoop Distributed Filesystem. 

1.1 The Design of HDFS

HDFS is a filesystem designed for storing(用来存储) very large files with streaming data access
patterns(流式数据访问模式)
, running on clusters of commodity hardware(商用的硬件集群).

猜你喜欢

转载自blog.csdn.net/wydyd110/article/details/81631308
今日推荐