[MongoDB Advanced] Several solutions for MongoDB high-availability cluster configuration

1. Solutions for High Availability Clusters

High availability (HA) means improving the availability of systems and applications by minimizing downtime, whether it is caused by routine maintenance (planned) or by sudden system crashes (unplanned).
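As a rough illustration of this definition, availability can be expressed as the fraction of a period during which the system was usable. The sketch below is only an example under stated assumptions: the function name `availability` and the downtime figures are hypothetical and do not come from any particular system.

```python
def availability(period_hours: float, planned_down: float, unplanned_down: float) -> float:
    """Return the fraction of the period during which the system was usable."""
    downtime = planned_down + unplanned_down
    if not 0 <= downtime <= period_hours:
        raise ValueError("downtime must be between 0 and the period length")
    return (period_hours - downtime) / period_hours


# Hypothetical figures: a 30-day month with 2 h of maintenance and 1.6 h of crashes.
month_hours = 30 * 24  # 720 hours
print(f"{availability(month_hours, 2.0, 1.6):.2%}")  # prints 99.50%
```

Both planned and unplanned downtime count against the result, which is why HA designs try to reduce both.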

The high availability of computer systems has different manifestations at different levels:

(1) High availability of the network

With the rapid development of networked storage, network redundancy technology has steadily improved. Network high availability is a key part of improving the high availability of IT systems. Network high availability is not the same as network high reliability: it is achieved through redundancy, that is, by deploying redundant network equipment such as redundant switches and redundant routers.

(2) High availability of servers

Server high availability is mainly implemented using server cluster software or high availability software.

(3) High availability of storage

Storage high availability is achieved with software or hardware technologies. Its main technical capabilities are storage switchover, data replication, and data snapshots. When one storage device fails, a backup storage device can be switched in quickly so that storage service is not interrupted.

 

2. MongoDB's high-availability cluster configuration

A high-availability cluster is called an HA cluster for short. A cluster is a group of computers that, as a whole, provides users with a set of network resources. The individual computer systems are the nodes of the cluster. Building a high-availability cluster requires sensible configuration of the roles of the computers, of data recovery, of consistency, and so on. The main approaches are:

(1) Master-slave mode (asymmetric mode)

The primary machine does the work while the standby machine monitors it and stays ready. When the primary machine goes down, the standby machine takes over all of its work. Data consistency is handled by a shared storage system.

(2) Duplex mode (mutual backup and mutual assistance)

Two hosts each run their own services at the same time and monitor each other. When either host goes down, the other host immediately takes over all of its work so that service continues in real time. The key data of the application service system is stored in a shared storage system.

(3) Cluster working mode (multi-server mutual backup mode)

Multiple hosts work together, each runs one or several services, and each defines one or more backup hosts for the services. When a host fails, the services running on it can be taken over by other hosts.
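The takeover behavior shared by these modes can be sketched in a few lines. This is a minimal in-memory model, not a real cluster manager: the `Host` class, the `fail_over` function, and the host and service names are all hypothetical, and health checking is reduced to a boolean flag.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    alive: bool = True
    services: set[str] = field(default_factory=set)


def fail_over(hosts: dict[str, Host], backups: dict[str, list[str]]) -> None:
    """Move each service on a dead host to its first live backup host.

    `backups` maps a service name to its backup host names, in preference order.
    A service with no live backup stays where it is (and remains unavailable).
    """
    for host in hosts.values():
        if host.alive:
            continue
        # sorted() copies the set, so it is safe to modify host.services below.
        for service in sorted(host.services):
            target = next(
                (hosts[b] for b in backups.get(service, []) if hosts[b].alive),
                None,
            )
            if target is not None:
                target.services.add(service)
                host.services.discard(service)


# Hypothetical three-host cluster: "a" runs web, "b" runs db, "c" is idle.
hosts = {
    "a": Host("a", services={"web"}),
    "b": Host("b", services={"db"}),
    "c": Host("c"),
}
backups = {"web": ["b", "c"], "db": ["c", "a"]}

hosts["a"].alive = False  # host "a" goes down
fail_over(hosts, backups)
print({h.name: sorted(h.services) for h in hosts.values()})
# prints {'a': [], 'b': ['db', 'web'], 'c': []}
```

With a single service and a single backup host this reduces to the master-slave mode; with two hosts that back each other up it reduces to the duplex mode.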

 

MongoDB cluster configuration follows these same schemes in practice. The main options are the master-slave structure, the replica set method, and the sharding method.

 

3. Master-Slave structure

The master-slave architecture is generally used for backup or for read-write separation. The usual designs are one master with one slave, and one master with multiple slaves.

It consists of two roles:

(1) Master

Readable and writable. When data is modified, the oplog is synchronized to all connected slaves.

(2) Slave

Read-only, not writable. It automatically synchronizes data from the Master.
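The division of labor between the two roles can be illustrated with a small in-memory model. This is a sketch under simplifying assumptions, not MongoDB's implementation: the `Master` and `Slave` classes are hypothetical, the oplog is a plain list of `(key, value)` writes, and synchronization is triggered by hand rather than running continuously.

```python
class Master:
    """Accepts reads and writes and records every write in an oplog."""

    def __init__(self):
        self.data = {}
        self.oplog = []  # ordered list of (key, value) write operations

    def write(self, key, value):
        self.data[key] = value
        self.oplog.append((key, value))


class Slave:
    """Read-only copy that replays the master's oplog."""

    def __init__(self, master):
        self.master = master
        self.data = {}
        self.applied = 0  # number of oplog entries already applied

    def sync(self):
        """Apply every oplog entry this slave has not seen yet."""
        for key, value in self.master.oplog[self.applied:]:
            self.data[key] = value
        self.applied = len(self.master.oplog)

    def write(self, key, value):
        raise PermissionError("a slave is read-only")


master = Master()
slave = Slave(master)
master.write("x", 1)
master.write("x", 2)
slave.sync()
print(slave.data)  # prints {'x': 2}
```

Because a slave remembers how far it has read, a slave that was offline for a while only needs to replay the entries it missed.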

For MongoDB in particular, the Master-Slave architecture is not recommended, because it cannot recover automatically after the Master goes down. The Replica Set, introduced later, is recommended instead. The one exception is a deployment that needs more than 50 replica nodes, which requires the Master-Slave architecture because a Replica Set supports at most 50 members; under normal circumstances that many nodes are never needed. Note that MongoDB 4.0 and later no longer support master-slave replication at all.

Another point is that MongoDB's Master-Slave does not support a chained structure: a Slave can connect only to the Master directly. Redis's Master-Slave, by contrast, does support chaining, so a Slave can connect to another Slave and become a Slave of that Slave.

 

4. Replica Set method

MongoDB's Replica Set has two main purposes. One is to use data redundancy for failure recovery: when a hardware failure or some other outage occurs, a replica can be used for recovery.

The other is read-write separation: read requests are distributed to the replicas to reduce the read pressure on the Primary.
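On the client side, read-write separation is usually requested through the connection string. The sketch below builds such a string with the standard `replicaSet` and `readPreference` URI options; the helper name `replica_set_uri`, the host names, and the set name `rs0` are assumptions made for illustration.

```python
from urllib.parse import urlencode


def replica_set_uri(hosts, replica_set, read_preference="primary"):
    """Build a mongodb:// connection string that lists the replica set members."""
    query = urlencode({"replicaSet": replica_set, "readPreference": read_preference})
    return f"mongodb://{','.join(hosts)}/?{query}"


# Hypothetical hosts and set name.
uri = replica_set_uri(
    ["db1.example.net:27017", "db2.example.net:27017", "db3.example.net:27017"],
    replica_set="rs0",
    read_preference="secondaryPreferred",
)
print(uri)
```

With `secondaryPreferred`, a driver sends reads to a Secondary when one is available and falls back to the Primary otherwise; writes always go to the Primary. Reads from a Secondary may return slightly stale data, so this setting suits workloads that can tolerate that.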

1. Replica Set built by Primary and Secondary

 

A Replica Set is a group of mongod instances that hold the same data. It contains three types of roles:

(1) Primary node (Primary)

Receives all write requests and synchronizes the changes to all Secondaries. A Replica Set can have only one Primary. When the Primary fails, the remaining Secondary and Arbiter nodes elect a new Primary. By default, read requests are also sent to the Primary; to have reads served by a Secondary, the client must change its connection configuration.

(2) Replica node (Secondary)

Keeps the same data set as the Primary. When the Primary goes down, it takes part in electing a new Primary.

(3) Arbiter

It stores no data and cannot itself be elected Primary; it only votes in elections. Using an Arbiter reduces the hardware needed for data storage. An Arbiter needs almost no hardware resources to run, but in a production environment it should not be deployed on the same machine as a data node.

Note that a Replica Set that supports automatic failover should have an odd number of nodes, so that a majority can be formed when voting for a new Primary.

(4) Primary election process

If a Secondary goes down, the Replica Set is not affected. If the Primary goes down, a new Primary is elected.
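The majority rule and the election outcome can be worked through with a short sketch. It is deliberately simplified and is not MongoDB's actual election protocol: it assumes every member has one vote, ignores member priorities and heartbeats, and picks the reachable data node with the most recent data. The function names, the member names, and the `optime` numbers are hypothetical.

```python
def majority(voting_members: int) -> int:
    """Votes needed to elect a primary."""
    return voting_members // 2 + 1


def fault_tolerance(voting_members: int) -> int:
    """Members that can fail while a primary can still be elected."""
    return voting_members - majority(voting_members)


def elect_primary(members):
    """Return the name of the new primary, or None if no election can succeed.

    Each member is a dict with the keys 'name', 'up', 'arbiter' and 'optime'.
    """
    reachable = [m for m in members if m["up"]]
    if len(reachable) < majority(len(members)):
        return None  # no majority reachable, so no primary can be elected
    candidates = [m for m in reachable if not m["arbiter"]]
    if not candidates:
        return None  # arbiters vote but can never become primary
    return max(candidates, key=lambda m: m["optime"])["name"]


# Going from 3 to 4 members raises the majority but not the fault tolerance.
for n in (3, 4, 5):
    print(n, majority(n), fault_tolerance(n))
# prints:
# 3 2 1
# 4 3 1
# 5 3 2

members = [
    {"name": "n1", "up": False, "arbiter": False, "optime": 10},  # failed primary
    {"name": "n2", "up": True, "arbiter": False, "optime": 9},
    {"name": "n3", "up": True, "arbiter": True, "optime": 0},
]
print(elect_primary(members))  # prints n2
```

The table shows why an odd member count is preferred: a fourth member adds cost without allowing the set to survive any more failures than three members do.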

2. Use Arbiter to build a Replica Set

An even number of data nodes plus one Arbiter form a Replica Set.
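A configuration for such a set can be sketched as a plain document. The shape used here (`_id`, `members`, `host`, `arbiterOnly`) follows the replica set configuration document that is passed to `rs.initiate()` in the MongoDB shell; the helper name `replica_set_config`, the set name, and the host names are assumptions made for illustration.

```python
import json


def replica_set_config(name, data_hosts, arbiter_host=None):
    """Build a replica set configuration document with an optional arbiter."""
    members = [{"_id": i, "host": host} for i, host in enumerate(data_hosts)]
    if arbiter_host is not None:
        # An arbiter votes in elections but stores no data.
        members.append({"_id": len(members), "host": arbiter_host, "arbiterOnly": True})
    return {"_id": name, "members": members}


# Hypothetical hosts: two data nodes plus one arbiter gives three voters.
config = replica_set_config(
    "rs0",
    ["db1.example.net:27017", "db2.example.net:27017"],
    arbiter_host="arb.example.net:27017",
)
print(json.dumps(config, indent=2))
```

The arbiter brings the number of voting members to three, so one data node can fail and the remaining two voters still form a majority.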

 

5. Sharding technology

When the amount of data is relatively large, the data needs to be sharded so that it runs on different machines, which reduces the pressure on CPU, memory, and I/O. Sharding is the database technique for partitioning data in this way.

MongoDB sharding is similar to horizontal and vertical partitioning in MySQL. A database can be scaled in two ways: vertical scaling and horizontal sharding.

Vertical scaling means expanding the capacity of the machines themselves by adding more CPU, memory, disk space, and so on.

Horizontal sharding splits the data into shards on different machines, and the cluster as a whole provides a unified service.
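The idea of horizontal sharding can be shown with a generic hash-based placement. This is an illustration of the concept rather than MongoDB's own placement logic, which assigns ranges of shard-key values (chunks) to shards; the function name `shard_for`, the `user<N>` keys, and the choice of three shards are hypothetical.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a shard-key value to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Distribute 1000 hypothetical user ids over 3 shards.
shards = [[] for _ in range(3)]
for user_id in (f"user{i}" for i in range(1000)):
    shards[shard_for(user_id, 3)].append(user_id)

print([len(s) for s in shards])  # roughly equal sizes
```

A stable hash is used instead of Python's built-in `hash()` because the built-in is randomized per process for strings, which would send the same key to different shards after a restart.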

(1) Sharding architecture of MongoDB

(2) Roles in MongoDB sharding architecture

A. Shards

Shards store the data and ensure its high availability and consistency. A shard can be a single mongod instance or a replica set.

In a production environment, a shard is generally a Replica Set, to prevent the shard from becoming a single point of failure. Among the shards there is a primary shard, which holds the data sets that have not been sharded.


B. Query Routers

A query router is a mongos instance. Clients connect directly to mongos, and mongos routes read and write requests to the appropriate shard.

A sharded cluster can have one mongos or several, to spread the load of client requests.


C. Config servers

Config servers store the metadata of the cluster, including the routing rules for each shard.
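How the three roles cooperate can be sketched with a small in-memory model. It is a simplification under stated assumptions, not the real mongos or config server: the `ConfigServer` and `Router` classes are hypothetical, each shard is a plain dictionary, and the routing rule is a sorted list of chunk lower bounds with one owning shard per chunk.

```python
import bisect


class ConfigServer:
    """Holds the routing metadata: chunk lower bounds and the shard owning each chunk."""

    def __init__(self, lower_bounds, owners):
        if len(lower_bounds) != len(owners) or lower_bounds != sorted(lower_bounds):
            raise ValueError("need one owner per chunk and sorted lower bounds")
        self.lower_bounds = lower_bounds
        self.owners = owners


class Router:
    """Plays the part of mongos: finds the owning shard, then forwards the request."""

    def __init__(self, config, shards):
        self.config = config
        self.shards = shards  # shard name -> dict acting as that shard's storage

    def _shard_for(self, key):
        # The chunk for `key` is the last one whose lower bound is <= key.
        index = bisect.bisect_right(self.config.lower_bounds, key) - 1
        return self.shards[self.config.owners[index]]

    def insert(self, key, doc):
        self._shard_for(key)[key] = doc

    def find(self, key):
        return self._shard_for(key).get(key)


# Hypothetical layout: keys below 100 on shardA, 100 to 199 on shardB, 200 and up on shardC.
shards = {"shardA": {}, "shardB": {}, "shardC": {}}
config = ConfigServer([float("-inf"), 100, 200], ["shardA", "shardB", "shardC"])
router = Router(config, shards)

for key in (42, 150, 200):
    router.insert(key, {"value": key})

print({name: sorted(store) for name, store in shards.items()})
# prints {'shardA': [42], 'shardB': [150], 'shardC': [200]}
print(router.find(150))  # prints {'value': 150}
```

The client only ever talks to the router, so chunks can be reassigned to other shards by updating the metadata without changing client code.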

