1. Introduction to Ceph Distributed Storage System

One, Introduction to Ceph

Ceph is a unified distributed storage system designed for high performance, reliability, and scalability.

Two, Comparison of Ceph with other storage systems

[Figure: comparison of Ceph with other storage systems]

Three, Ceph features

High performance :

a. It abandons the traditional centralized metadata lookup scheme in favor of the CRUSH algorithm, so data is distributed evenly and can be accessed with high parallelism.

b. It takes failure-domain isolation into account and supports replica placement rules for a variety of workloads, such as placement across data centers and rack awareness.

c. It supports clusters of thousands of storage nodes and data volumes from terabytes to petabytes.
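
The failure-domain idea in point b can be illustrated with a small sketch. This is not the CRUSH algorithm itself; it is a simplified stand-in that ranks racks and OSDs by a hash score and never puts two replicas in the same rack. The topology, the OSD and rack names, and the function `place_replicas` are all hypothetical.

```python
from __future__ import annotations

import hashlib

# Hypothetical topology: rack name -> OSDs in that rack.
TOPOLOGY = {
    "rack1": ["osd.0", "osd.1"],
    "rack2": ["osd.2", "osd.3"],
    "rack3": ["osd.4", "osd.5"],
}


def _score(key: str, item: str) -> int:
    """Deterministic pseudo-random score for a (key, item) pair."""
    digest = hashlib.sha256(f"{key}/{item}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place_replicas(key: str, replicas: int, topology=TOPOLOGY) -> list[str]:
    """Pick `replicas` OSDs for `key`, never two in the same rack."""
    if replicas > len(topology):
        raise ValueError("not enough racks to isolate every replica")
    # Rank the racks for this key, then take the best-scoring OSD in each.
    racks = sorted(topology, key=lambda rack: _score(key, rack), reverse=True)
    return [
        max(topology[rack], key=lambda osd: _score(key, osd))
        for rack in racks[:replicas]
    ]


if __name__ == "__main__":
    print(place_replicas("my-object", 3))
```

Because every replica comes from a different rack, losing a whole rack removes at most one copy of any object in this sketch.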

High scalability : the cluster scales out by adding nodes, with no fixed upper limit on capacity.

Rich features : supports three storage interfaces: block storage, file storage, and object storage.

  • Object : has a native API and is also compatible with the Swift and S3 APIs;
  • Block : supports thin provisioning, snapshots, and clones;
  • File : POSIX interface, with snapshot support.

Four, Ceph architecture

<1> The bottom layer of Ceph is RADOS (Reliable Autonomic Distributed Object Store), a reliable, self-managing, distributed object store. It provides high reliability, high scalability, high performance, and a high degree of automation, and it is where user data is ultimately stored. RADOS consists mainly of two kinds of daemons: OSDs and Monitors.
<2> On top of RADOS is LIBRADOS, a library through which applications interact with RADOS. It supports multiple programming languages, such as C, C++, and Python.
<3> Three interfaces are built on top of the LIBRADOS layer: RADOSGW, librbd, and MDS.
<4> RADOSGW is a gateway built on the widely used RESTful style of HTTP API. It provides object storage and is compatible with S3 and Swift.
<5> librbd provides a distributed block device interface and supports block storage.
<6> MDS is the metadata server behind CephFS, which provides a POSIX-compatible file system and supports file storage.

Five, Ceph core components

Monitor : A Ceph cluster needs several monitors, which form a small cluster of their own. They synchronize their state through Paxos and store cluster metadata such as the OSD map.
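
Because the monitors agree through Paxos, they can only make progress while a strict majority of them is reachable. The arithmetic can be sketched as follows; the helper names are hypothetical and not part of any Ceph API.

```python
def quorum_size(monitors: int) -> int:
    """Smallest number of monitors that forms a strict majority."""
    if monitors < 1:
        raise ValueError("a cluster needs at least one monitor")
    return monitors // 2 + 1


def tolerated_failures(monitors: int) -> int:
    """How many monitors can fail while a majority still remains."""
    return monitors - quorum_size(monitors)


if __name__ == "__main__":
    for count in (1, 3, 4, 5):
        print(count, quorum_size(count), tolerated_failures(count))
```

Under this arithmetic four monitors tolerate no more failures than three, which is why odd monitor counts are the usual choice.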

OSD (Object Storage Daemon) : The process that stores data and returns it in response to client requests. A Ceph cluster generally runs many OSDs.

The relationship between Pool, PG and OSD:

  • A pool contains many PGs;
  • A PG contains many objects, and each object belongs to exactly one PG;
  • Each PG has a primary copy and replica copies, which are placed on different OSDs (three OSDs in a three-replica pool).
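
The relationship above can be sketched as a two-step lookup: hash the object name to a PG within the pool, then map the PG to an ordered list of OSDs whose first entry acts as the primary. This is a simplified illustration, not Ceph's implementation: Ceph uses its own hash function and CRUSH for the second step, whereas this sketch assumes CRC32 and a plain hash ranking. The PG count, OSD names, and function names are hypothetical.

```python
from __future__ import annotations

import hashlib
import zlib

PG_NUM = 8                               # hypothetical number of PGs in the pool
OSDS = [f"osd.{i}" for i in range(6)]    # hypothetical OSDs in the cluster
REPLICAS = 3                             # three-replica pool


def object_to_pg(name: str, pg_num: int = PG_NUM) -> int:
    """Step 1: an object name always hashes to exactly one PG."""
    return zlib.crc32(name.encode()) % pg_num


def pg_to_osds(pool: str, pg: int, osds=OSDS, replicas: int = REPLICAS) -> list[str]:
    """Step 2: a PG maps to `replicas` distinct OSDs (stand-in for CRUSH)."""
    def score(osd: str) -> int:
        digest = hashlib.sha256(f"{pool}.{pg}/{osd}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(osds, key=score, reverse=True)[:replicas]


def locate(pool: str, name: str) -> tuple[int, str, list[str]]:
    """Return (pg, primary OSD, replica OSDs) for an object."""
    pg = object_to_pg(name)
    acting = pg_to_osds(pool, pg)
    return pg, acting[0], acting[1:]


if __name__ == "__main__":
    print(locate("mypool", "photo-0001.jpg"))
```

Every object in the same PG shares the same set of OSDs, so the cluster tracks placement per PG rather than per object.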

MDS (Metadata Server) : The metadata service that CephFS depends on.

RBD (RADOS Block Device) : The block device service that Ceph exposes to clients. It can also be exported over iSCSI through a gateway.
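
A block device on top of an object store works by cutting the device's address space into fixed-size objects. The sketch below shows that arithmetic under stated assumptions: a 4 MiB object size (commonly the RBD default) and an illustrative object naming pattern. `IMAGE_PREFIX` and both function names are hypothetical.

```python
OBJECT_SIZE = 4 * 1024 * 1024        # assumed object size: 4 MiB
IMAGE_PREFIX = "rbd_data.example"    # hypothetical per-image name prefix


def locate_block(offset: int, object_size: int = OBJECT_SIZE):
    """Map a byte offset on the device to (object name, offset in object)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    index, within = divmod(offset, object_size)
    return f"{IMAGE_PREFIX}.{index:016x}", within


def split_io(offset: int, length: int, object_size: int = OBJECT_SIZE):
    """Split a byte range into per-object (name, offset, length) pieces."""
    pieces = []
    end = offset + length
    while offset < end:
        name, within = locate_block(offset, object_size)
        chunk = min(object_size - within, end - offset)
        pieces.append((name, within, chunk))
        offset += chunk
    return pieces


if __name__ == "__main__":
    # A 300-byte write that straddles the first object boundary.
    for piece in split_io(OBJECT_SIZE - 100, 300):
        print(piece)
```

Each piece can then be placed independently, which is how one large virtual disk ends up spread across many OSDs.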

RGW (RADOS Gateway) : The object storage service that Ceph exposes to clients. Its interface is compatible with S3 and Swift.

CephFS : The file system service that Ceph exposes to clients.

RADOS : The core of a Ceph cluster, used to implement data distribution, failover, and other cluster operations.

Librados : The library provided by RADOS. Because RADOS is a protocol that is difficult to access directly, the upper layers RBD, RGW, and CephFS all access it through librados. Bindings currently exist for PHP, Ruby, Java, Python, C, and C++.

CRUSH : The data distribution algorithm used by Ceph. Like consistent hashing, it computes where data should go, so data lands in the expected location without a central lookup table.

Pool : Pool is a logical partition for storing objects.

PG (Placement Group) : A logical concept; one PG maps to multiple OSDs. The PG layer was introduced to make data easier to distribute and to locate.

Object : The lowest-level storage unit in Ceph is the object. Each object contains metadata and raw data.

Six, Ceph architecture display

[Figure: Ceph functional modules]

[Figure: Ceph resource division]

Seven, Ceph's three storage types

1. Block storage (RBD)

  • Advantages:
    • Data protection is provided by means such as RAID and LVM;
    • Multiple inexpensive hard drives can be combined to increase capacity;
    • A logical disk built from multiple disks improves read and write throughput;
  • Disadvantages:
    • When a SAN architecture is used, Fibre Channel switches are expensive;
    • Data cannot be shared between hosts;
  • Use cases:
    • Disk allocation for Docker containers and virtual machines;
    • Log storage;
    • File storage

2. File storage (CephFS)

  • Advantages:
    • Low cost, since any ordinary machine will do;
    • Convenient file sharing;
  • Disadvantages:
    • Low read and write speed;
    • Slow transfer rate;
  • Use cases:
    • Log storage;
    • FTP and NFS;
    • Other file storage that needs a directory structure

3. Object storage (Object) (suited to data that is rarely modified)

  • Advantages:
    • Offers the high-speed reads and writes of block storage;
    • Offers the sharing capabilities of file storage;
  • Use cases:
    • Image storage;
    • Video storage

Origin blog.csdn.net/weixin_43357497/article/details/111827958