Cloud Native Middleware: MongoDB Operator

In recent years, driven by cloud-native concepts and technologies such as containers, open source, and microservices, deploying applications to the cloud has become the general trend. When enterprises migrate application systems to the cloud without changing their existing business logic, middleware plays a crucial role in enabling and supporting the upper-layer applications. The main benefits of middleware on the cloud are rapid deployment and delivery, lower cost and higher efficiency, elastic scaling, and simple management. As a non-relational database based on distributed file storage, MongoDB offers high performance, high availability, large-scale data storage, easy deployment, and easy operation, and it is used in more and more scenarios. This article uses the MongoDB Operator as an entry point to show how middleware is designed, developed, and used in a cloud-native environment.

01

Introduction to MongoDB

1.1 What is MongoDB?

MongoDB is a database based on distributed file storage. It is written in C++ and is designed to simplify application development and scaling. A record in MongoDB is a document: a data structure made up of field and value pairs. MongoDB documents are similar to JSON objects, and field values may include other documents, arrays, and arrays of documents.
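To make the document model concrete, here is a minimal Go sketch that builds a record containing a nested document, an array, and an array of documents, then prints it as JSON. The `User` and `Address` types and all field values are invented for illustration. The sketch uses only the standard library and does not talk to a MongoDB server.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Address is an embedded document (hypothetical example type).
type Address struct {
	City string `json:"city"`
	Zip  string `json:"zip"`
}

// User mirrors a document whose field values include a nested
// document, an array, and an array of documents.
type User struct {
	Name      string    `json:"name"`
	Age       int       `json:"age"`
	Tags      []string  `json:"tags"`
	Home      Address   `json:"home"`
	Addresses []Address `json:"addresses"`
}

// encodeUser renders the document as JSON text.
func encodeUser(u User) (string, error) {
	b, err := json.Marshal(u)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	u := User{
		Name:      "sue",
		Age:       26,
		Tags:      []string{"news", "sports"},
		Home:      Address{City: "Shanghai", Zip: "200000"},
		Addresses: []Address{{City: "Beijing", Zip: "100000"}},
	}
	s, err := encodeUser(u)
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

A real application would normally use a MongoDB driver and BSON rather than plain JSON; the shape of the data is the point here.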

Image source: https://mongodb.net.cn/Upload/crud-annotated-document.bakedsvg.svg

Its main features are as follows:

  • high performance

MongoDB provides high-performance data persistence. Support for embedded data models reduces I/O activity on the database system, and indexes support faster queries and can include keys from embedded documents and arrays.

  • rich query language

The MongoDB query API provides a rich query language to support read and write operations. MongoDB also retains the ad hoc (real-time) query and indexing capabilities familiar from relational databases (indexes are implemented as B-trees). NoSQL stores of the same kind, such as Redis, offer none of these capabilities.

  • high availability

MongoDB provides replica sets, which distribute data across multiple machines for redundancy. Their purpose is to provide automatic failover and to scale read capacity.

A replica set is a group of MongoDB servers that maintain the same set of data, providing redundancy and increasing data availability.

  • horizontal scalability

MongoDB scales horizontally by using sharding to spread data across servers. It can split data into chunks automatically and migrate chunks between shards so that the amount of data stored on each shard stays roughly balanced.

  • Support multiple storage engines

MongoDB supports multiple storage engines such as WiredTiger and in-memory storage engines. In addition, MongoDB provides a flexible storage engine API that allows third parties to develop storage engines for MongoDB.

1.2 MongoDB deployment mode

  • Standalone mode: also called single-point or single-node mode. A single mongod process runs on the server and handles all reads and writes. Users can quickly deploy a single-node MongoDB server for everyday development, testing, and learning.

  • Replica Set mode: a replica set is a group of mongod instances that maintain the same data set. It has three main node roles, as shown in the following figure:

Image source: https://mongodb.net.cn/Upload/replica-set-primary-with-two-secondaries.bakedsvg.svg

a. Primary (master) node

A replica set can have multiple Secondary members but only one Primary, and by default only the Primary serves both reads and writes. Users can also change the configuration so that the Primary handles only writes and the Secondary nodes handle reads, separating reads from writes on the data set.

The client performs read and write operations on the Primary, and replication then copies the data asynchronously to all Secondary members. After some time, all Secondary members hold the same data set.

When the Primary goes down or becomes unreachable, the replica set detects this and starts the automatic failover process: the remaining Secondary and Arbiter nodes vote to elect one of the Secondary members as the new Primary. The new Primary then receives and processes client requests, so the service remains available to users.

b. Secondary (replica) node

A Secondary node mainly backs up data and takes part in electing a new Primary when the current Primary is unavailable. Members monitor one another through heartbeats, so each can observe the overall state of the cluster. A Secondary can also serve as a data source for user queries.

c. Arbiter (arbitration) node

An Arbiter does not store data and cannot be elected Primary; its only function is to vote in elections. Using an Arbiter avoids another redundant copy of the data and saves resources while still providing high availability.
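The voting rule behind failover can be illustrated with a small Go sketch. It assumes every member holds exactly one vote and reduces health to a boolean; the `Member` type and the node names are hypothetical, and real elections also weigh priority and how up to date each member's data is.

```go
package main

import "fmt"

// Member is a simplified view of one replica set member.
type Member struct {
	Name    string
	Arbiter bool // arbiters vote but hold no data
	Healthy bool
}

// hasMajority reports whether the healthy members form a strict
// majority of all voting members, which an election requires.
func hasMajority(members []Member) bool {
	healthy := 0
	for _, m := range members {
		if m.Healthy {
			healthy++
		}
	}
	return healthy > len(members)/2
}

// electable lists the members that could become Primary: healthy,
// data-bearing nodes. Arbiters vote but are never candidates.
func electable(members []Member) []string {
	var names []string
	if !hasMajority(members) {
		return names
	}
	for _, m := range members {
		if m.Healthy && !m.Arbiter {
			names = append(names, m.Name)
		}
	}
	return names
}

func main() {
	set := []Member{
		{Name: "mongo-0", Healthy: false}, // the failed Primary
		{Name: "mongo-1", Healthy: true},
		{Name: "arbiter-0", Arbiter: true, Healthy: true},
	}
	fmt.Println(hasMajority(set), electable(set))
}
```

With one data node down, the surviving Secondary and the Arbiter still form two of three votes, which is why an Arbiter can keep a two-data-node set available without storing a third copy of the data.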

  • Sharding mode: also called sharded cluster mode. Sharding is a method of distributing data across multiple machines and is MongoDB's overall architecture for horizontal scaling. MongoDB uses sharding to support deployments with very large data sets and high-throughput operations, and to address growth problems such as performance and capacity bottlenecks.

    A MongoDB sharded cluster consists of three components: shard, mongos, and config servers.

Image source: https://mongodb.net.cn/manual/sharding/ 

1. mongos: mongos acts as a query router, providing the interface between client applications and the sharded cluster. It faces the client: when it receives a write request, it sends the request to a particular shard according to a specific algorithm, and the data is written there. When it receives a read request, it locates the shard that holds the requested data, forwards the request to that shard, and reads the data.

2. Shard: a shard is where data is stored. Each shard holds a subset of the sharded cluster's data set. Starting with MongoDB 3.6, each shard must be deployed as a replica set. In theory, the number of shards can keep growing without a fixed limit.

3. config servers: config servers (configuration servers) store the metadata and configuration information of the sharded cluster. Starting with MongoDB 3.4, config servers must also be deployed as a replica set.

From the above, the main feature of sharded cluster mode is that the data set is divided into multiple chunks stored on different shards. When the data volume grows, users can add shards to expand capacity. Sharded cluster mode also supports multiple masters with multiple slaves: the master nodes of different shards process different requests, and data is synchronized and backed up between the master and slaves of each shard. When the master node of one shard becomes unavailable, a new master is elected automatically, and during that election the master nodes of the other shards continue to serve users.
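As a rough illustration of how a router can map a request to a shard, the Go sketch below looks up the chunk whose key range contains a given shard-key value. It is a simplified model, not MongoDB's implementation: the `Chunk` type, the integer keys, and the shard names are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// Chunk owns shard-key values below Max (exclusive). Chunks are kept
// sorted by Max, so each chunk implicitly starts where the previous
// one ends. Simplified, hypothetical model.
type Chunk struct {
	Max   int
	Shard string
}

// route returns the shard owning key, or false if no chunk covers it.
func route(chunks []Chunk, key int) (string, bool) {
	// Binary search for the first chunk whose upper bound exceeds key.
	i := sort.Search(len(chunks), func(i int) bool { return key < chunks[i].Max })
	if i == len(chunks) {
		return "", false
	}
	return chunks[i].Shard, true
}

func main() {
	chunks := []Chunk{
		{Max: 100, Shard: "shard-0"},
		{Max: 200, Shard: "shard-1"},
		{Max: 1 << 30, Shard: "shard-0"},
	}
	for _, k := range []int{42, 150, 5000} {
		s, _ := route(chunks, k)
		fmt.Println(k, "->", s)
	}
}
```

Because the routing table maps ranges rather than individual keys, moving one chunk to another shard only changes a single table entry, which is what makes rebalancing practical as data grows.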

02

Requirements Analysis

2.1 Single Point Mode Requirements

  • Support quick deployment of a single-point MongoDB container through a CR;

  • Automatically maintain the status of MongoDB instance nodes;

  • Support adjusting the MongoDB configuration file;

  • Support setting the MongoDB resource size;

  • Support setting the MongoDB image version;

  • Support setting the MongoDB root password;

  • Support user-defined databases, users, and passwords;

  • Support backups, with backup snapshot data stored in S3;

  • Support backup plans that create backup snapshots on a schedule;

  • Support persisting MongoDB data with a PVC backed by a storageClass;

  • Support updating and deleting MongoDB service instances;

  • Support viewing the slow query log of the MongoDB service;

  • Support querying MongoDB service events;

  • Support access from outside the cluster;

  • Support common operations on resource objects such as StatefulSet, Service, and Pod;

  • Support viewing real-time and offline logs of MongoDB instance Pods;

  • Support Pod exec;

  • Support downloading log files from Pods;

  • Support common MongoDB operations and maintenance tasks, such as viewing mongo status and configuration information and running quick searches;

  • Support configuring monitoring resources and monitoring the running status of the MongoDB instance.

2.2 Replica Set Mode Requirements

  • Support quick deployment of replica set MongoDB containers through a CR;

  • Automatically maintain the status of MongoDB instance nodes;

  • Support adjusting the MongoDB configuration file;

  • Support setting the MongoDB resource size;

  • Support setting the MongoDB image version;

  • Support setting the MongoDB root password;

  • Support user-defined databases, users, and passwords;

  • Support configuring arbiter nodes;

  • Support quick scale-out and scale-in;

  • Support exclusive deployment, so that important services do not all run on a single machine;

  • Support backups, with backup snapshot data stored in S3;

  • Support backup plans that create backup snapshots on a schedule;

  • Support persisting MongoDB data with a PVC backed by a storageClass;

  • Support updating and deleting MongoDB service instances;

  • Support viewing the slow query log of the MongoDB service;

  • Support querying MongoDB service events;

  • Support access from outside the cluster;

  • Support common operations on resource objects such as StatefulSet, Service, and Pod;

  • Support viewing real-time and offline logs of MongoDB instance Pods;

  • Support Pod exec;

  • Support downloading log files from Pods;

  • Support viewing the topology of Mongo nodes;

  • Support manually switching the Mongo master node and viewing the master-slave switch history;

  • Support common MongoDB operations and maintenance tasks, such as viewing mongo status and configuration information and inserting data;

  • Support configuring monitoring resources and monitoring the running status of the MongoDB instance.

2.3 Sharding Mode Requirements

  • Support quick deployment of sharded MongoDB containers through a CR;

  • Automatically maintain the status of MongoDB instance nodes;

  • Support adjusting the MongoDB configuration file;

  • Support setting the MongoDB resource size;

  • Support setting the MongoDB image version;

  • Support setting the MongoDB root password;

  • Support user-defined databases, users, and passwords;

  • Support configuring arbiter nodes;

  • Support setting the number of mongos routers and config servers;

  • Support exclusive deployment, so that important services do not all run on a single machine;

  • Support backups, with backup snapshot data stored in S3;

  • Support backup plans that create backup snapshots on a schedule;

  • Support persisting MongoDB data with a PVC backed by a storageClass;

  • Support updating and deleting MongoDB service instances;

  • Support viewing the slow query log of the MongoDB service;

  • Support querying MongoDB service events;

  • Support common operations on resource objects such as StatefulSet, Service, and Pod;

  • Support viewing real-time and offline logs of MongoDB instance Pods;

  • Support Pod exec;

  • Support downloading log files from Pods;

  • Support viewing the topology of Mongo nodes;

  • Support manually switching the Mongo master node and viewing the master-slave switch history;

  • Support common MongoDB operations and maintenance tasks, such as viewing mongo status and configuration information and inserting data;

  • Support configuring monitoring resources and monitoring the running status of the MongoDB instance.

03

Solution

The following describes the operator's current design for the two modes in which MongoDB forms a high-availability cluster:

  • Replica set mode: a K8s Service of type NodePort serves as the entry point for external connections. Users access the mongo service through the cluster's external address, and the MongoDB client itself discovers the master node.

  • Sharding mode: in a highly available sharded cluster, the client is configured with a set of mongos addresses, through which the mongo client connects to and uses the cluster. The mongod instances store the data and configuration information of the sharded cluster and keep its state consistent. The mongos addresses are externally reachable, so services outside the cluster can also connect.
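From the client's side, both schemes come down to handing the driver a list of addresses. The Go sketch below assembles a standard MongoDB connection string from such a list. The helper name `buildURI`, the addresses, and the replica set name `rs0` are assumptions for illustration, not values produced by the operator.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// buildURI assembles a MongoDB connection string from a seed list of
// host:port addresses. replicaSet may be empty, for example when the
// hosts are mongos routers of a sharded cluster.
func buildURI(hosts []string, user, password, db, replicaSet string) string {
	// Userinfo.String percent-encodes reserved characters such as '@'.
	cred := url.UserPassword(user, password).String()
	uri := "mongodb://" + cred + "@" + strings.Join(hosts, ",") + "/" + url.PathEscape(db)
	if replicaSet != "" {
		uri += "?replicaSet=" + url.QueryEscape(replicaSet)
	}
	return uri
}

func main() {
	// Hypothetical NodePort addresses of three replica set members.
	hosts := []string{"10.0.0.1:30017", "10.0.0.2:30018", "10.0.0.3:30019"}
	fmt.Println(buildURI(hosts, "user001", "123456", "mongo001", "rs0"))
}
```

Given the seed list and the `replicaSet` option, a driver can discover the current master on its own, which is the client-side capability the replica set scheme relies on.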

04

Architecture

The MongoDB Operator is built on the Operator SDK framework. The Operator defines CRDs for Mongo service instances, backups, and master-slave switching, and the Reconcile function in the Operator's controller watches the running state of each Mongo CR instance and keeps adjusting and repairing it until it reaches the desired state. The Operator also includes monitoring for both the Mongo services and the Operator itself.
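The reconcile idea, comparing the declared state with the observed state and acting on the difference, can be sketched in plain Go. This is a generic illustration and not the operator's code: the `State` type, its two fields, and the step strings are hypothetical, and a real controller would read both states from the Kubernetes API instead of receiving them as arguments.

```go
package main

import "fmt"

// State is a deliberately tiny, hypothetical stand-in for what a CR
// declares (desired) or what the cluster reports (observed).
type State struct {
	Members int
	Image   string
}

// reconcile compares desired and observed state and returns the
// corrective steps still needed; an empty result means converged.
func reconcile(desired, observed State) []string {
	var steps []string
	if observed.Image != desired.Image {
		steps = append(steps, "update image to "+desired.Image)
	}
	switch {
	case observed.Members < desired.Members:
		steps = append(steps, fmt.Sprintf("add %d member(s)", desired.Members-observed.Members))
	case observed.Members > desired.Members:
		steps = append(steps, fmt.Sprintf("remove %d member(s)", observed.Members-desired.Members))
	}
	return steps
}

func main() {
	desired := State{Members: 3, Image: "mongo:3.6"}
	observed := State{Members: 2, Image: "mongo:3.6"}
	fmt.Println(reconcile(desired, observed))
	fmt.Println(len(reconcile(desired, desired)) == 0)
}
```

Because the function depends only on the two states and not on which event triggered it, running it repeatedly is safe, and that property is what lets a controller keep repairing drift.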

The following describes the development and design approach of the MongoDB Operator in three stages: Operator deployment, feature development, and continuous optimization:

  • Operator deployment stage: once the mongo-related requirements are settled, the developer first needs to be familiar with mongo usage and fundamentals and to master Kubernetes, Operator CRDs and controllers, Kubebuilder/operator-sdk, and related technologies before creating the mongo operator project;

  • Operator feature development stage: the mongo operator's features mainly comprise the design of CRDs for mongo instances, backups, backup plans, and master-slave switching, together with the design of the corresponding monitoring metrics and alerting rules, which implement monitoring and alerting for the Mongo services and the Operator;

  • Operator continuous optimization stage: to manage the resources behind each mongo CRD, the mongo operator's Reconcile mechanism monitors and automatically maintains CR instances so that each CR reaches the state the user expects.

According to the above ideas, the architecture diagram of MongoDB Operator is as follows:

The core ideas of Reconcile are as follows:

05

Examples

5.1 Model example

  • single point model

// MongoDBSpec defines the desired state of MongoDB
// Type is Standalone
type MongoDBSpec struct {
    Image               string                       `json:"image,omitempty"`
    // +kubebuilder:validation:Enum=Standalone;ReplicaSet;ShardedCluster
    Type                string                       `json:"type,omitempty"`
    Service             string                       `json:"service,omitempty"`
    RootPassword        string                       `json:"rootPassword,omitempty"`
    DBUserSpec          DBUserSpec                   `json:"dbUserSpec,omitempty"`
    NotPersistent       bool                         `json:"notPersistent,omitempty"`
    CustomConfig        string                       `json:"customConfig,omitempty"`
    ExportConnect       bool                         `json:"exportConnect,omitempty"`
    Resources           *corev1.ResourceRequirements `json:"resources,omitempty"`
    Storage             string                       `json:"storage,omitempty"`
    BackUpStorage       string                       `json:"backUpStorage,omitempty"`
    StorageClassName    string                       `json:"storageClassName,omitempty"`
    MetricsExporterSpec MetricsExporterSpec          `json:"metricsExporterSpec,omitempty"`
    PodSpec             PodSpec                      `json:"podSpec,omitempty"`
}

  • replica set model

// MongoDBSpec defines the desired state of MongoDB
// Type is ReplicaSet
type MongoDBSpec struct {
    Image               string                       `json:"image,omitempty"`
    // +kubebuilder:validation:Enum=Standalone;ReplicaSet;ShardedCluster
    Type                string                       `json:"type,omitempty"`
    Members             int                          `json:"members,omitempty"`
    Service             string                       `json:"service,omitempty"`
    RootPassword        string                       `json:"rootPassword,omitempty"`
    DBUserSpec          DBUserSpec                   `json:"dbUserSpec,omitempty"`
    Arbiter             bool                         `json:"arbiter,omitempty"`
    NotPersistent       bool                         `json:"notPersistent,omitempty"`
    CustomConfig        string                       `json:"customConfig,omitempty"`
    ExportConnect       bool                         `json:"exportConnect,omitempty"`
    Resources           *corev1.ResourceRequirements `json:"resources,omitempty"`
    Storage             string                       `json:"storage,omitempty"`
    BackUpStorage       string                       `json:"backUpStorage,omitempty"`
    StorageClassName    string                       `json:"storageClassName,omitempty"`
    MetricsExporterSpec MetricsExporterSpec          `json:"metricsExporterSpec,omitempty"`
    PodSpec             PodSpec                      `json:"podSpec,omitempty"`
}

  • Sharded cluster model

// MongoDBSpec defines the desired state of MongoDB
// Type is ShardedCluster
type MongoDBSpec struct {
    Image                string                       `json:"image,omitempty"`
    // +kubebuilder:validation:Enum=Standalone;ReplicaSet;ShardedCluster
    Type                 string                       `json:"type,omitempty"`
    ShardCount           int                          `json:"shardCount,omitempty"`
    MongodsPerShardCount int                          `json:"mongodsPerShardCount,omitempty"`
    MongosCount          int                          `json:"mongosCount,omitempty"`
    ConfigServerCount    int                          `json:"configServerCount,omitempty"`
    Service              string                       `json:"service,omitempty"`
    RootPassword         string                       `json:"rootPassword,omitempty"`
    DBUserSpec           DBUserSpec                   `json:"dbUserSpec,omitempty"`
    Arbiter              bool                         `json:"arbiter,omitempty"`
    NotPersistent        bool                         `json:"notPersistent,omitempty"`
    CustomConfig         string                       `json:"customConfig,omitempty"`
    Resources            *corev1.ResourceRequirements `json:"resources,omitempty"`
    Storage              string                       `json:"storage,omitempty"`
    BackUpStorage        string                       `json:"backUpStorage,omitempty"`
    StorageClassName     string                       `json:"storageClassName,omitempty"`
    MetricsExporterSpec  MetricsExporterSpec          `json:"metricsExporterSpec,omitempty"`
    PodSpec              PodSpec                      `json:"podSpec,omitempty"`
}

  • backup model

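// MongoBackUpSpec defines the desired state of MongoBackUp
// Note: the MongoBackUp examples in 5.2 set backUpInstance, backUpNode, and
// storages, which this listing does not declare.
type MongoBackUpSpec struct {
    Instance       string `json:"instance"`
    Restart        bool   `json:"restart,omitempty"`
    ChangeMasterTo string `json:"changeMasterTo,omitempty"`
}
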
  • Master-slave node switching model

// MongoChannelSpec defines the desired state of MongoChannel
type MongoChannelSpec struct {
    Instance       string `json:"instance"`
    Restart        bool   `json:"restart,omitempty"`
    ChangeMasterTo string `json:"changeMasterTo,omitempty"`
}
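The `+kubebuilder:validation:Enum` marker on `Type` asks the API server to reject any other value. The Go sketch below shows an equivalent check in code. The trimmed `Spec` type is hypothetical, and the rule that a replica set needs at least one member is an assumption added for the example, not something the listings above state.

```go
package main

import (
	"errors"
	"fmt"
)

// Spec is a trimmed, hypothetical subset of the MongoDBSpec fields.
type Spec struct {
	Type    string
	Members int
}

// validate applies the same Enum constraint as the kubebuilder marker,
// plus an assumed rule that a replica set needs at least one member.
func validate(s Spec) error {
	switch s.Type {
	case "Standalone", "ShardedCluster":
		return nil
	case "ReplicaSet":
		if s.Members < 1 {
			return errors.New("members must be at least 1 for ReplicaSet")
		}
		return nil
	default:
		return fmt.Errorf("unsupported type %q", s.Type)
	}
}

func main() {
	fmt.Println(validate(Spec{Type: "ReplicaSet", Members: 3}))
	fmt.Println(validate(Spec{Type: "Cluster"}))
}
```

Keeping the enum in the CRD schema means invalid objects are rejected at admission time, so a check like this in the controller is only a second line of defense.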

5.2 Example of use

  • Single point deployment/backup

apiVersion: mongo.daocloud.io/v1alpha1
kind: MongoDB
metadata:
  name: mongodb-test
spec:
  type: Standalone
  image: daocloud.io/atsctoo/mongo:3.6
  service: mongodb-001-svc
  rootPassword: "654321"
  dbUserSpec:
    enable: true
    name: mongo001
    user: user001
    password: "123456"
  notPersistent: true
  customConfig: mongo-operator-mongo-default-config
  resources:
    limits:
      cpu: "1"
      memory: 512Mi
    requests:
      cpu: "1"
      memory: 512Mi
  storage: 1Gi
  backUpStorage: 1Gi
  storageClassName: "nfs"
  metricsExporterSpec:
    enable: true
    resources:
      limits:
        cpu: "0.1"
        memory: 128Mi
      requests:
        cpu: "0.1"
        memory: 128Mi
---
apiVersion: mongo.daocloud.io/v1alpha1
kind: MongoBackUp
metadata:
  name: mongobackup-sample
spec:
  backUpInstance: mongodb-test
  backUpNode: mongodb-test-standalone-0-0
  storages: mongo-operator-s3

  • Replica set deployment/backup/master-slave switch

apiVersion: mongo.daocloud.io/v1alpha1
kind: MongoDB
metadata:
  name: mongodb-test
spec:
  image: daocloud.io/atsctoo/mongo:3.6
  type: ReplicaSet
  members: 3
  service: mongodb-test-svc
  rootPassword: "123456"
  dbUserSpec:
    enable: true
    name: mongo001
    user: user001
    password: "123456"
  arbiter: false
  notPersistent: false
  customConfig: mongo-operator-mongo-default-config
  exportConnect: true
  resources:
    limits:
      cpu: "1"
      memory: 512Mi
    requests:
      cpu: "1"
      memory: 512Mi
  storage: 1Gi
  backUpStorage: 1Gi
  storageClassName: ""   # storage class; when empty, the default is used
  metricsExporterSpec:
    enable: true
    resources:
      limits:
        cpu: "0.1"
        memory: 128Mi
      requests:
        cpu: "0.1"
        memory: 128Mi
---
apiVersion: mongo.daocloud.io/v1alpha1
kind: MongoBackUp
metadata:
  name: mongobackup-sample
spec:
  backUpInstance: mongodb-test
  backUpNode: mongodb-test-replset-0-0
  storages: mongo-operator-s3
---
apiVersion: mongo.daocloud.io/v1alpha1
kind: MongoChannel
metadata:
  name: mongochannel-sample
spec:
  changeMasterTo: 'mongdb-test-replset-0-2.dce.dsp.ats.io:35966'
  instance: mongodb-test

  • Shard deployment/backup

apiVersion: mongo.daocloud.io/v1alpha1
kind: MongoDB
metadata:
  name: mongodb-test
spec:
  image: daocloud.io/atsctoo/mongo:3.6
  type: ShardedCluster
  service: mongodb-test-svc
  shardCount: 2
  mongodsPerShardCount: 3
  configServerCount: 3
  mongosCount: 2
  rootPassword: "123456"
  dbUserSpec:
    enable: true
    name: mongo001
    user: user001
    password: "123456"
  arbiter: true
  notPersistent: true
  customConfig: mongo-operator-mongo-default-config
  resources:
    limits:
      cpu: "1"
      memory: 512Mi
    requests:
      cpu: "1"
      memory: 512Mi
  storage: 1Gi
  backUpStorage: 1Gi
  storageClassName: ""   # storage class; when empty, the default is used
  metricsExporterSpec:
    enable: true
    resources:
      limits:
        cpu: "0.1"
        memory: 128Mi
      requests:
        cpu: "0.1"
        memory: 128Mi
---
apiVersion: mongo.daocloud.io/v1alpha1
kind: MongoBackUp
metadata:
  name: mongobackup-sample
spec:
  backUpInstance: mongodb-test
  backUpNode: mongodb-test-replset-0-0
  storages: mongo-operator-s3

5.3 Monitoring Chart

06

Summary

This article introduced the MongoDB Operator, built on the K8s Operator pattern. The MongoDB Operator simplifies the deployment, monitoring, and operation and maintenance of MongoDB. It lets users create, update, and configure MongoDB services and backups on the cloud on demand, and improves their ability to manage MongoDB as middleware.


Author of this article

Song Wenjie

"DaoCloud Daoke" Python R&D Engineer


Origin blog.csdn.net/DaoCloud_daoke/article/details/127675550