GFS distributed file system cluster (theory section)

GlusterFS Overview

GlusterFS introduction

  • Open source distributed file system
  • Composed of storage servers, clients, and NFS/Samba storage gateways
  • No metadata server

GlusterFS features

  • Scalability and high performance
  • High Availability
  • Global unified namespace
  • Elastic volume management
  • Standards-based protocols

GlusterFS terminology

  • Brick: the basic storage unit, an exported directory on a storage node
  • Volume: a logical collection of bricks
  • FUSE: a kernel module that lets user-space programs implement file systems; the client-side interaction module
  • VFS: the Linux kernel's virtual file system interface
  • Glusterd: the management daemon that runs on each storage server

The modular stacked architecture

  • A modular, stackable architecture
  • Modules are combined as needed to implement complex functions

How GlusterFS works

GlusterFS workflow


  • Application: a client or application accesses data through the GlusterFS mount point
  • VFS: the Linux kernel receives the request and hands it to the VFS API
  • FUSE: the VFS passes the request to the FUSE kernel module, which delivers the data to the GlusterFS client through the /dev/fuse device file
  • GlusterFS Client: sends the data over the network to the remote GlusterFS Server, which writes it to the storage device

Elastic HASH algorithm

  • The hash algorithm computes a 32-bit integer for each file
  • The 32-bit space is divided into N consecutive sub-ranges, each corresponding to a Brick
  • Advantages of the elastic hash algorithm
    • Distributes data evenly, so that every Brick holds a share
    • Removes the dependence on a metadata server, eliminating that access bottleneck and single point of failure
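The sub-range idea above can be sketched in a few lines of Python. This is an illustration only, not GlusterFS's actual implementation (GlusterFS uses its own hash; `zlib.crc32` stands in here as an arbitrary 32-bit hash):

```python
import zlib

def pick_brick(filename: str, bricks: list) -> str:
    """Map a file name to a brick by dividing the 32-bit hash space
    into len(bricks) consecutive, equal sub-ranges."""
    space = 2 ** 32
    width = space // len(bricks)          # size of each sub-range
    h = zlib.crc32(filename.encode())     # stand-in 32-bit hash
    index = min(h // width, len(bricks) - 1)
    return bricks[index]

bricks = ["server1:/dir1", "server2:/dir2", "server3:/dir3"]
print(pick_brick("report.txt", bricks))
```

Because the brick is computed from the file name alone, any client can locate a file without consulting a metadata server.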

GlusterFS volume type

Distributed volume

  • Files are not split into blocks
  • The file's hash value is saved in an extended attribute of the file
  • Supported underlying file systems: ext3, ext4, ZFS, XFS, etc.

Distributed volume features

  • Files are distributed across different servers; there is no redundancy
  • Expands the volume size more easily and cheaply
  • A single point of failure can cause data loss
  • Relies on the underlying file system for data protection

Creating a distributed volume

  • Create a distributed volume named dis-volume; files will be distributed by hash across server1:/dir1, server2:/dir2, and server3:/dir3
    gluster volume create dis-volume server1:/dir1 server2:/dir2 server3:/dir3

Striped Volume

  • The file is divided by offset into N blocks (N = number of stripe nodes), which are stored round-robin in the Brick of each server node
  • Performs especially well when storing large files
  • No redundancy; similar to RAID 0

Features

  • Data is divided into smaller blocks and distributed across the Bricks in the server group
  • Distributes the load, and the smaller blocks speed up file access
  • No data redundancy

Creating a striped volume

  • Create a striped volume named stripe-volume; file blocks will be stored round-robin across the two Bricks server1:/dir1 and server2:/dir2
    gluster volume create stripe-volume stripe 2 transport tcp server1:/dir1 server2:/dir2
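The round-robin placement described above can be sketched as follows. This is a hypothetical helper for illustration, not GlusterFS code; the 4-byte block size is chosen only to keep the example readable:

```python
def stripe(data: bytes, bricks: list, block_size: int = 4) -> dict:
    """Split data by offset into fixed-size blocks and deal the blocks
    out to the bricks round-robin, RAID 0 style."""
    placement = {b: [] for b in bricks}
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    for n, block in enumerate(blocks):
        placement[bricks[n % len(bricks)]].append(block)
    return placement

layout = stripe(b"ABCDEFGHIJKL", ["server1:/dir1", "server2:/dir2"])
# blocks 0 and 2 land on server1, block 1 on server2
```

Reading a large file then pulls blocks from both servers at once, which is where the performance benefit for large files comes from; losing either server loses part of every striped file.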

Replicated volume

  • Keeps one or more copies of the same file
  • Disk utilization is lower because full copies are stored
  • If the storage nodes have different capacities, the lowest-capacity node caps the total capacity of the volume (barrel effect)

Features

  • Every server in the volume holds a complete copy of the data
  • The number of replicas can be chosen by the client when the volume is created
  • Requires at least two servers (Bricks)
  • Provides redundancy

Creating a replicated volume

  • Create a replicated volume named rep-volume; two copies of each file will be stored at the same time
    gluster volume create rep-volume replica 2 transport tcp server1:/dir1 server2:/dir2
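The two defining trade-offs of replication, every write landing on all replicas and the smallest brick capping the volume, can be sketched like this (hypothetical helpers for illustration, not GlusterFS code):

```python
def replicate(data: bytes, bricks: dict) -> None:
    """Write the same bytes to every brick; bricks maps name -> stored files."""
    for files in bricks.values():
        files.append(data)

def usable_capacity(brick_sizes: list) -> int:
    """Every brick holds a full copy, so the smallest brick sets the limit."""
    return min(brick_sizes)

bricks = {"server1:/dir1": [], "server2:/dir2": []}
replicate(b"hello", bricks)            # both bricks now hold the file
print(usable_capacity([100, 80]))      # prints 80: the smaller brick caps the volume
```

This is the "cask" (barrel) effect from the bullet list: pairing a 100 GB brick with an 80 GB brick yields an 80 GB replica-2 volume.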

Distributed striped volume

  • Combines the features of distributed volumes and striped volumes
  • Mainly used for large-file access
  • Requires at least four servers

Creating a distributed striped volume

  • Create a distributed striped volume named dis-stripe; the number of Bricks in the volume must be a multiple of the stripe count (at least 2x)
    gluster volume create dis-stripe stripe 2 transport tcp server1:/dir1 server2:/dir2 server3:/dir1 server4:/dir2

Distributed Replicated Volume

  • Combines the features of distributed volumes and replicated volumes
  • Used where redundancy is required

Creating a distributed replicated volume

  • Create a distributed replicated volume named dis-rep; the number of Bricks in the volume must be a multiple of the replica count (at least 2x)
    gluster volume create dis-rep replica 2 transport tcp server1:/dir1 server2:/dir2 server3:/dir1 server4:/dir2
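How the two layers combine can be sketched as follows: with replica 2 and four bricks, consecutive pairs of bricks form replica sets; the file's hash picks one set (the distributed layer), and the file is written to every brick in that set (the replicated layer). An illustration only, not GlusterFS code, with `zlib.crc32` again standing in for the hash:

```python
import zlib

def place(filename: str, bricks: list, replica: int = 2) -> list:
    """Return the replica set (list of bricks) that stores this file."""
    assert len(bricks) % replica == 0, "brick count must be a multiple of replica"
    # consecutive bricks form replica sets, as in the create command above
    sets = [bricks[i:i + replica] for i in range(0, len(bricks), replica)]
    h = zlib.crc32(filename.encode())
    return sets[h % len(sets)]           # the hash distributes across sets

bricks = ["server1:/dir1", "server2:/dir2", "server3:/dir1", "server4:/dir2"]
print(place("report.txt", bricks))       # two bricks, each holding a full copy
```

Each file thus survives the loss of one server in its set, while different files still spread across sets for capacity and load.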


Origin blog.51cto.com/14473285/2461885