Ceph — Introduction

Ceph

Ceph is a unified distributed storage system that originated in the Ph.D. work of Sage Weil (the earliest results were published in 2004) and was later contributed to the open source community. It is designed to provide better performance, reliability, and scalability. After years of development, it is now supported by many cloud computing vendors and widely used; for example, both Red Hat and OpenStack integrate with Ceph to provide back-end storage for virtual machine images.

Advantages of Ceph

High performance

  • It abandons the traditional centralized metadata addressing scheme in favor of the CRUSH algorithm, so data distribution is balanced and the degree of parallelism is high
  • Replica placement rules can take the isolation of disaster-recovery domains into account for different workloads, e.g. placement across data centers or rack awareness
  • It can scale to thousands of storage nodes and support terabytes to petabytes of data

High availability

  • The number of replicas can be controlled flexibly
  • Supports failure-domain separation and strong data consistency
  • Automatic repair and self-healing in various failure scenarios
  • No single point of failure; automatic management

High scalability

  • Decentralized
  • Flexible expansion
  • Performance increases linearly as nodes are added

Feature-rich

  • Supports three storage interfaces: object storage, block device storage, and file storage (the object interface is illustrated in the sketch after this list)
  • Supports custom interfaces and bindings in multiple languages
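
As a hedged illustration of the object storage interface, the following sketch uses the python-rados bindings that ship with Ceph. The pool name mypool and object name hello_object are placeholders for the example, and a reachable cluster with a valid /etc/ceph/ceph.conf and keyring is assumed.

```python
import rados  # python3-rados, the librados bindings shipped with Ceph

# Connect using the local Ceph configuration (path is an assumption).
cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
try:
    # Open an I/O context on an existing pool ("mypool" is a placeholder name).
    ioctx = cluster.open_ioctx('mypool')
    try:
        # Write an object through the object interface and read it back.
        ioctx.write_full('hello_object', b'Hello, Ceph!')
        print(ioctx.read('hello_object').decode())
    finally:
        ioctx.close()
finally:
    cluster.shutdown()
```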

Ceph components

Monitors : Ceph Monitor (ceph-mon) maintains cluster state maps, including monitor maps, manager maps, OSD maps, MDS maps, and CRUSH maps. These maps are the critical cluster state required for Ceph daemons to coordinate with each other. The monitor is also responsible for managing authentication between daemons and clients. For redundancy and high availability, usually at least three Monitors are required.
Managers : The Ceph Manager (ceph-mgr) is responsible for tracking runtime metrics and the current state of the Ceph cluster, including storage utilization, current performance metrics, and system load. The Ceph Manager also hosts Python-based modules to manage and expose Ceph cluster information, including the web-based Ceph Dashboard and a REST API. At least two Managers are usually required for high availability.
Ceph OSDs : An Object Storage Daemon (Ceph OSD, ceph-osd) stores data, handles data replication, recovery, and rebalancing, and provides some monitoring information to Ceph Monitors and Managers by checking the heartbeats of other Ceph OSD daemons. At least three Ceph OSDs are typically required for redundancy and high availability.
MDSs : Ceph metadata servers (MDS, ceph-mds) store metadata when using the Ceph file system (Ceph block devices and Ceph object storage do not use MDS). The Ceph Metadata Server allows POSIX filesystem users to execute basic commands (like ls, find, etc.) without imposing a huge burden on the Ceph Storage Cluster.
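
Because clients rely on the monitors for cluster maps and authentication, one quick way to see the state they maintain is to send a status command through librados. This is a minimal sketch, assuming the same configuration file as above; the exact JSON fields returned vary by Ceph release.

```python
import json
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
try:
    # Ask the monitors for the overall cluster status (roughly what `ceph status` shows).
    ret, outbuf, errs = cluster.mon_command(
        json.dumps({'prefix': 'status', 'format': 'json'}), b'')
    if ret == 0:
        status = json.loads(outbuf)
        print('health:', status['health']['status'])  # field layout depends on the release
        print('sections:', sorted(status.keys()))     # e.g. monmap, osdmap, pgmap, mgrmap
finally:
    cluster.shutdown()
```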

Ceph stores data as objects in logical storage pools. Using the CRUSH algorithm, Ceph calculates which placement group (PG) should contain the object and which OSD should store the placement group. The CRUSH algorithm enables Ceph storage clusters to dynamically scale, rebalance, and recover.
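
To make the object → PG → OSD chain concrete, here is a deliberately simplified, hypothetical sketch: real Ceph hashes the object name into a placement group and then runs CRUSH against the cluster map, whereas this toy version substitutes a plain hash and round-robin selection purely to show the two-step mapping.

```python
import hashlib

def place_object(object_name, pg_num, osd_ids, replicas=3):
    """Toy object -> PG -> OSD mapping (NOT the real CRUSH algorithm)."""
    # Step 1: hash the object name into one of pg_num placement groups.
    h = int.from_bytes(hashlib.md5(object_name.encode()).digest()[:4], 'little')
    pg = h % pg_num
    # Step 2: pick `replicas` distinct OSDs for that PG (stand-in for CRUSH,
    # which would consult the CRUSH map and failure-domain rules instead).
    start = pg % len(osd_ids)
    return pg, [osd_ids[(start + i) % len(osd_ids)] for i in range(replicas)]

pg, osds = place_object('hello_object', pg_num=128, osd_ids=[0, 1, 2, 3, 4, 5])
print(f'pg={pg}, replicas on OSDs {osds}')
```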
