GFS (Google File System) ---- (1) File System Introduction

Distributed File System

  • The system is built from ordinary, inexpensive machines, so failures are the norm rather than the exception

  • The system is expected to store a large number of large files (each individual file is large)

  • The system supports two kinds of reads: large streaming (sequential) reads and small random reads

  • Writes are mostly sequential appends rather than overwrites of existing data

  • The system is heavily optimized for many clients appending concurrently; the efficiency and consistency of these writes rely mainly on the atomic record append operation

  • The system values sustained, stable bandwidth over low latency for an individual read or write

 

GFS architecture:

GFS consists of three parts: the GFS master, the GFS client, and the GFS chunkserver. There is only one GFS master at any time, while there may be many chunkservers and GFS clients.

A file is divided into fixed-size chunks (64 MB by default). Each chunk has a globally unique handle (a 64-bit chunk ID) and is replicated to multiple chunkservers (3 by default) to ensure availability and reliability. A chunkserver stores each chunk as an ordinary Linux file on its local disk.
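The chunking and replication described above can be sketched in a few lines. This is only an illustration: the function names and the round-robin placement are assumptions made for the example, not how GFS actually chooses chunkservers.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the default chunk size
REPLICAS = 3                   # default number of replicas per chunk


def split_into_chunks(file_size, chunk_size=CHUNK_SIZE):
    """Return (chunk_index, start, end) byte ranges that cover the file."""
    return [
        (index, start, min(start + chunk_size, file_size))
        for index, start in enumerate(range(0, file_size, chunk_size))
    ]


def place_replicas(chunk_index, servers, replicas=REPLICAS):
    """Pick `replicas` distinct servers for a chunk.

    Round-robin placement is a hypothetical policy used only for this sketch.
    """
    if replicas > len(servers):
        raise ValueError("not enough chunkservers for the requested replicas")
    return [servers[(chunk_index + k) % len(servers)] for k in range(replicas)]


if __name__ == "__main__":
    servers = ["cs0", "cs1", "cs2", "cs3"]
    for index, start, end in split_into_chunks(150 * 1024 * 1024):
        print(index, start, end, place_replicas(index, servers))
```

A 150 MB file yields three chunks here: two full 64 MB chunks and a shorter final one.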

The GFS master is the system's metadata server. The metadata it maintains includes the namespace (GFS manages files in a hierarchical directory structure), the mapping from files to chunks, and the locations of each chunk's replicas. The first two are persisted; chunk location information is not persisted and instead comes from reports sent by the chunkservers.
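To make the split between persisted and reported metadata concrete, here is a minimal in-memory sketch. The class and method names (`MasterMetadata`, `report_chunks`, `recover`) are hypothetical and chosen only for this example; the real master's data structures are not described in this post.

```python
from dataclasses import dataclass, field


@dataclass
class MasterMetadata:
    namespace: set = field(default_factory=set)          # persisted
    file_to_chunks: dict = field(default_factory=dict)   # persisted
    chunk_locations: dict = field(default_factory=dict)  # rebuilt from reports

    def create_file(self, path):
        self.namespace.add(path)
        self.file_to_chunks[path] = []

    def add_chunk(self, path, handle):
        self.file_to_chunks[path].append(handle)

    def report_chunks(self, server, handles):
        """Record that `server` says it holds the given chunk handles."""
        for handle in handles:
            self.chunk_locations.setdefault(handle, set()).add(server)

    def persistent_state(self):
        """Only the namespace and the file-to-chunk mapping are saved."""
        return {
            "namespace": sorted(self.namespace),
            "file_to_chunks": {p: list(c) for p, c in self.file_to_chunks.items()},
        }

    @classmethod
    def recover(cls, state):
        """Rebuild a master after restart; chunk locations start out empty."""
        return cls(
            namespace=set(state["namespace"]),
            file_to_chunks={p: list(c) for p, c in state["file_to_chunks"].items()},
        )
```

After `recover`, the master knows which chunks each file has but not where they live until the chunkservers report in again.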

The GFS master is also responsible for centralized scheduling of the distributed system: chunk lease management (an important control mechanism in the system), garbage collection, and chunk migration. The master exchanges regular heartbeats with the chunkservers to determine their state.
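One simple way to picture heartbeat-based state tracking is a timeout check: a chunkserver that has not been heard from for too long is treated as down. The sketch below assumes this timeout approach; the class name and the timeout value are made up for illustration and are not taken from GFS.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat time of each chunkserver (illustrative only)."""

    def __init__(self, timeout_s=60.0):
        self.timeout_s = timeout_s  # hypothetical value, not from GFS
        self.last_seen = {}         # server name -> time of last heartbeat

    def heartbeat(self, server, now):
        self.last_seen[server] = now

    def dead(self, now):
        """Servers whose last heartbeat is older than the timeout."""
        return sorted(
            server
            for server, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )
```

Passing `now` explicitly keeps the example deterministic; a real monitor would read a clock such as `time.monotonic()`.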

The GFS client is the API used by applications. It caches the chunk information (i.e., metadata) it reads from the GFS master, to minimize interaction with the master.
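The effect of client-side caching can be shown with a small sketch: the master is asked only on a cache miss. The `ChunkInfoCache` class and the `lookup` callable are assumptions for this example, and the sketch ignores cache expiry and invalidation.

```python
class ChunkInfoCache:
    """Caches (handle, locations) per (filename, chunk_index); illustrative only."""

    def __init__(self, lookup):
        # lookup: callable (filename, chunk_index) -> (handle, locations),
        # standing in for a request to the master.
        self._lookup = lookup
        self._cache = {}
        self.master_calls = 0

    def get(self, filename, chunk_index):
        key = (filename, chunk_index)
        if key not in self._cache:
            self.master_calls += 1
            self._cache[key] = self._lookup(filename, chunk_index)
        return self._cache[key]


if __name__ == "__main__":
    def fake_master(filename, chunk_index):
        return 1000 + chunk_index, ["cs0", "cs1", "cs2"]

    cache = ChunkInfoCache(fake_master)
    cache.get("/logs/a", 0)
    cache.get("/logs/a", 0)
    print(cache.master_calls)
```

Repeated reads of the same chunk then go straight to a chunkserver without contacting the master again.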

 

Data read flow:

  • The application calls the interface provided by the GFS client, specifying the name of the file to read, the offset, and the length.

  • The GFS client translates the offset into a chunk index according to the fixed chunk size and sends it to the master.

  • The master returns the chunk ID and the locations of the chunk's replicas to the GFS client.

  • The GFS client sends a read request to the nearest chunkserver that holds a replica; the request includes the chunk ID and the byte range.

  • The chunkserver reads the corresponding file and sends its contents to the GFS client. (Open question: why does the chunkserver do the read, rather than the client reading directly?)
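The steps above can be walked through with a toy in-memory simulation. Everything here is a sketch under stated assumptions: the class names, the tiny 8-byte chunk size, and treating the first listed replica as the "nearest" one are all made up for the example, and there is no networking, caching, or failure handling.

```python
CHUNK_SIZE = 8  # tiny chunk size for the demo; the GFS default is 64 MB


class ChunkServer:
    def __init__(self):
        self.chunks = {}  # chunk handle -> bytes (stands in for a local file)

    def read(self, handle, start, length):
        return self.chunks[handle][start:start + length]


class Master:
    def __init__(self):
        self.file_to_chunks = {}  # filename -> list of chunk handles
        self.locations = {}       # chunk handle -> list of server names

    def lookup(self, filename, chunk_index):
        handles = self.file_to_chunks[filename]
        if chunk_index >= len(handles):
            return None  # past the end of the file
        handle = handles[chunk_index]
        return handle, self.locations[handle]


class Client:
    def __init__(self, master, servers):
        self.master = master
        self.servers = servers  # server name -> ChunkServer

    def read(self, filename, offset, length):
        out = bytearray()
        while length > 0:
            # Translate the byte offset into a chunk index and offset in chunk.
            index, within = divmod(offset, CHUNK_SIZE)
            info = self.master.lookup(filename, index)  # ask the master
            if info is None:
                break
            handle, locations = info
            # Assumption: the first listed replica counts as the nearest one.
            server = self.servers[locations[0]]
            data = server.read(handle, within, min(length, CHUNK_SIZE - within))
            if not data:
                break  # nothing more stored in this chunk
            out += data
            offset += len(data)
            length -= len(data)
        return bytes(out)


def demo_cluster():
    """Build one 20-byte file '/f' stored as three chunks on two servers."""
    content = b"abcdefghijklmnopqrst"
    master = Master()
    servers = {"cs0": ChunkServer(), "cs1": ChunkServer()}
    master.file_to_chunks["/f"] = []
    for i in range(0, len(content), CHUNK_SIZE):
        handle = 100 + i // CHUNK_SIZE
        master.file_to_chunks["/f"].append(handle)
        master.locations[handle] = ["cs0", "cs1"]
        for server in servers.values():
            server.chunks[handle] = content[i:i + CHUNK_SIZE]
    return Client(master, servers)


if __name__ == "__main__":
    print(demo_cluster().read("/f", 6, 6))
```

In this sketch a read that crosses a chunk boundary simply repeats the lookup and read for each chunk it touches.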

Origin www.cnblogs.com/zhulovegou/p/11448730.html