Ceph's librados client interface and architecture analysis

Ceph clients access the cluster through a library called librados. It offers two kinds of interfaces: cluster-level access and object access. The API has bindings for common languages, including C, C++, and Python, and reaches the Ceph cluster over the network. At the user level, you can call this API from your own program to integrate the storage functions of a Ceph cluster, or from a monitoring program to track the state of the cluster. The relationship between the API and the Ceph cluster is shown in Figure 1.

 

Figure 1 Schematic diagram of the client and Ceph cluster

RADOS client API

The API covers almost every operation on a Ceph cluster and its data. Cluster-level access includes connecting to the cluster, creating and deleting storage pools, and obtaining cluster status. Object access means operating on the objects in a storage pool: creating and deleting objects, writing or appending data to an object, reading object data, and so on. These functions are implemented by two classes, Rados and IoCtx. The main functions of the two classes are shown in Figure 2 (only a few examples are shown; the actual number of interfaces is much larger, so refer to the source code for details).

 

Figure 2 Class diagram of the access interfaces
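The cluster-level calls described above are normally used in a fixed order: initialize, read the configuration, connect, work with pools, then shut down. The following is a minimal sketch of that order, assuming the librados C++ method names init, conf_read_file, connect, pool_create, pool_delete and shutdown. So that it compiles without Ceph headers, it is written as a template, and FakeCluster is a hypothetical stand-in that only records calls. The client id "admin", the configuration path and the helper name with_temp_pool are also assumptions made for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for librados::Rados, used only so this sketch
// compiles and runs without a Ceph cluster. It records the calls made.
struct FakeCluster {
    std::vector<std::string> calls;
    int init(const char *id) { calls.push_back(std::string("init:") + id); return 0; }
    int conf_read_file(const char *path) { calls.push_back(std::string("conf:") + path); return 0; }
    int connect() { calls.push_back("connect"); return 0; }
    int pool_create(const char *name) { calls.push_back(std::string("pool_create:") + name); return 0; }
    int pool_delete(const char *name) { calls.push_back(std::string("pool_delete:") + name); return 0; }
    void shutdown() { calls.push_back("shutdown"); }
};

// Connect, create a pool, delete it again, and disconnect.
// Returns 0 on success or the first negative error code encountered.
template <typename RadosLike>
int with_temp_pool(RadosLike &cluster, const char *pool_name) {
    int ret = cluster.init("admin");                      // assumed client id
    if (ret < 0) return ret;
    ret = cluster.conf_read_file("/etc/ceph/ceph.conf");  // assumed path
    if (ret < 0) return ret;
    ret = cluster.connect();
    if (ret < 0) return ret;

    ret = cluster.pool_create(pool_name);
    if (ret == 0) {
        // Object I/O through an IoCtx would go here.
        ret = cluster.pool_delete(pool_name);
    }
    cluster.shutdown();  // always disconnect once connected
    return ret;
}
```

With the real library, the same function could be called with a librados::Rados object in place of FakeCluster after including rados/librados.hpp.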

 

To show how these APIs are used, here are some code snippets. For complete code, refer to Ceph's official sample code.

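/* "cluster" is assumed to be a librados::Rados object that is already connected. */
librados::IoCtx io_ctx;
const char *pool_name = "test";
/* Create the I/O context, i.e. the object used to access a pool in Ceph */
cluster.ioctx_create(pool_name, io_ctx);

/* Write an object synchronously */
librados::bufferlist bl;
bl.append("Hello World!");  /* object content */
/* Write the object itworld123 */
int ret = io_ctx.write_full("itworld123", bl);

/* Add an attribute to the object. Attributes here are similar to
 * the extended attributes of a file in a file system. */
librados::bufferlist attr_bl;
attr_bl.append("en_US");
io_ctx.setxattr("itworld123", "test_attr", attr_bl);

/* Read the object content asynchronously */
librados::bufferlist read_buf;
int read_len = 1024;
/* Create an asynchronous completion object */
librados::AioCompletion *read_completion = librados::Rados::aio_create_completion();
/* Send the read request */
io_ctx.aio_read("itworld123", read_completion, &read_buf, read_len, 0);
/* Wait for the request to complete */
read_completion->wait_for_complete();
ret = read_completion->get_return_value();
/* Release the completion once it is no longer needed */
read_completion->release();

/* Read the object attribute */
librados::bufferlist attr_res;
io_ctx.getxattr("itworld123", "test_attr", attr_res);

/* Remove the object attribute */
io_ctx.rmxattr("itworld123", "test_attr");

/* Remove the object */
io_ctx.remove("itworld123");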

The purpose of this article is to introduce the client interface, so only basic usage is shown and error handling is omitted. The official Ceph repository has complete code for reference.
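The snippets above ignore return values. librados calls report failure by returning a negative errno value, so one possible way to add the missing checks is a small helper that turns such a value into an exception. This is only a sketch: check_rados is a hypothetical helper name and is not part of librados.

```cpp
#include <string>
#include <system_error>

// Hypothetical helper: librados calls return 0 (or a non-negative count)
// on success and a negative errno value on failure. Throw on failure and
// pass the value through otherwise.
inline int check_rados(int ret, const std::string &what) {
    if (ret < 0) {
        throw std::system_error(-ret, std::generic_category(), what);
    }
    return ret;
}
```

A call from the snippets could then be wrapped as check_rados(io_ctx.write_full("itworld123", bl), "write_full"), so that a failed write surfaces immediately instead of being silently ignored.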

Client software architecture overview

The basic architecture of the librados client is shown in Figure 3. It consists of four layers: the API layer, the IO processing layer, the object processing layer, and the messaging layer. The API layer is an abstraction layer that provides a unified interface to the layers above it. Its native interfaces are available in C and C++, and there is also a Python implementation.

Figure 3 Basic architecture of RADOS client


The IO processing layer provides a simple encapsulation of IO. It is implemented by a class called ObjectOperation, which mainly holds the data for read and write operations. In this layer, the IoCtxImpl::operate function converts the ObjectOperation into an object of class Objecter::Op and submits it to the object processing layer for further processing.
The object processing layer holds the information needed to process Ceph objects, including the communication channels, the OSDMap, and the MonMap. With this information it can calculate where an object is stored from the object's identity, and then find the connection (Session) between the client and the target OSD.
The object processing layer calls the interface of the messaging layer. The message is handed to the messaging layer and sent to the target OSD by that layer's thread pool. Note that the client's messaging layer shares the same Messenger code with the server side.
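The hand-off between the layers described above can be pictured with a toy model. None of the types below are the real Ceph classes; ToyObjectOperation, ToyOp, ToyObjecter, ToyMessenger and toy_operate are hypothetical names that only mirror the roles of ObjectOperation, Objecter::Op, the object processing layer, the messaging layer and IoCtxImpl::operate. Target selection is deliberately left out and fixed to OSD 0.

```cpp
#include <string>
#include <utility>
#include <vector>

// IO processing layer: a batch of sub-operations on one object.
struct ToyObjectOperation {
    std::vector<std::string> steps;
    void write_full(const std::string &data) { steps.push_back("write_full(" + data + ")"); }
    void setxattr(const std::string &name) { steps.push_back("setxattr(" + name + ")"); }
};

// Object processing layer: the operation plus the object it targets.
struct ToyOp {
    std::string pool;
    std::string oid;
    ToyObjectOperation ops;
};

// Messaging layer: here it only records what would go on the wire.
struct ToyMessenger {
    std::vector<std::string> sent;
    void send(int osd, const std::string &payload) {
        sent.push_back("osd." + std::to_string(osd) + " <- " + payload);
    }
};

// Stand-in for the submit path of the object processing layer.
struct ToyObjecter {
    ToyMessenger &messenger;
    void op_submit(const ToyOp &op) {
        int osd = 0;  // placeholder: target selection is not modelled here
        std::string payload = op.pool + "/" + op.oid + ":";
        const char *sep = " ";
        for (const std::string &step : op.ops.steps) {
            payload += sep;
            payload += step;
            sep = "; ";
        }
        messenger.send(osd, payload);  // one message carries the whole Op
    }
};

// Stand-in for IoCtxImpl::operate: wrap the operation into an Op and submit it.
inline void toy_operate(ToyObjecter &objecter, const std::string &pool,
                        const std::string &oid, ToyObjectOperation ops) {
    ToyOp op{pool, oid, std::move(ops)};
    objecter.op_submit(op);
}
```

The point of the sketch is the direction of the calls: the API caller builds an operation, the IO processing layer wraps it into an Op, and only the object processing layer talks to the messaging layer.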
The core process is shown in the flow chart below. This article does not describe it in detail; the specifics can be understood by reading the corresponding source code along this flow.

[Flow chart of the core process: 2.png]


Note that in this process the _op_submit function calls two functions, _calc_target and _get_session. They obtain the target OSD and the corresponding Session (connection), respectively, which is the basis for sending the data afterwards.
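The two steps can be illustrated with a toy model. This is not how Ceph computes placement: the real client hashes object names with its own hash function and maps placement groups to OSDs with CRUSH using the OSDMap. The sketch below substitutes an FNV-1a hash and a simple modulo, and ToySubmitter, ToySession and their members are hypothetical names. It assumes pg_num is greater than zero and the OSD list is not empty.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct ToySession {
    int osd;
    int ops_sent;
};

class ToySubmitter {
public:
    ToySubmitter(std::uint32_t pg_num, std::vector<int> osds)
        : pg_num_(pg_num), osds_(std::move(osds)) {}

    // Step 1 (role of _calc_target): object name -> PG -> OSD id.
    int calc_target(const std::string &oid) const {
        std::uint32_t pg = fnv1a(oid) % pg_num_;
        return osds_[pg % osds_.size()];
    }

    // Step 2 (role of _get_session): reuse the session for that OSD,
    // or open a new one if none exists yet.
    ToySession &get_session(int osd) {
        auto it = sessions_.find(osd);
        if (it == sessions_.end()) {
            it = sessions_.emplace(osd, ToySession{osd, 0}).first;
        }
        return it->second;
    }

    // Role of _op_submit: find the target, find the session, then send.
    int op_submit(const std::string &oid) {
        int osd = calc_target(oid);
        ToySession &session = get_session(osd);
        ++session.ops_sent;  // stands in for sending the message
        return osd;
    }

    std::size_t session_count() const { return sessions_.size(); }

private:
    // FNV-1a, chosen only because it is short and deterministic.
    static std::uint32_t fnv1a(const std::string &s) {
        std::uint32_t h = 2166136261u;
        for (unsigned char c : s) {
            h ^= c;
            h *= 16777619u;
        }
        return h;
    }

    std::uint32_t pg_num_;
    std::vector<int> osds_;
    std::map<int, ToySession> sessions_;
};
```

Submitting the same object name twice yields the same OSD and reuses one session, which is the property that matters for the flow: the target is derived from the object name, and the session is looked up from the target.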

Author: SunnyZhang the IT world
Link: https://www.jianshu.com/p/d7f69900ae8e
Source: Jianshu
The copyright belongs to the author. For commercial reproduction, please contact the author for authorization; for non-commercial reproduction, please indicate the source.


Origin blog.csdn.net/majianting/article/details/102989966