MySQL cluster solution and implementation design

I. Overview

There are many MySQL cluster solutions. By implementer, they fall into two groups: official MySQL solutions and third-party solutions.
1) Official solutions include master-slave mode, one-master multi-slave mode, MySQL Cluster mode, etc.
2) Third-party solutions include MMM (dual-master, multi-slave) mode, MHA (multi-master, multi-slave) mode, Galera Cluster (multi-master) mode, etc.


II. MySQL official implementation

1. Master-slave mode-MySQL Replication


Principle
Writes go to the master and reads go to the slave.
The master's data is replicated asynchronously by replaying the binlog: every SQL statement executed on the master is executed again on the slave. The master records its write operations in the binary log (binlog) and starts a log dump thread that sends the binlog to the slave's I/O thread. The slave's I/O thread requests the master's binlog and writes what it receives to the relay log. The slave's SQL thread then reads the relay log and replays each event as a concrete operation. Because both sides perform the same operations, their data eventually becomes consistent.
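
The binlog, relay log, and replay steps can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the `Master` and `Slave` classes, the key-value "statements", and the method names are hypothetical and do not reflect MySQL internals.

```python
from dataclasses import dataclass, field


@dataclass
class Master:
    """Holds the data and the binary log of executed writes."""
    data: dict = field(default_factory=dict)
    binlog: list = field(default_factory=list)

    def write(self, key, value):
        self.data[key] = value
        # Every write is also recorded as an event in the binlog.
        self.binlog.append((key, value))

    def dump(self, position):
        """Log dump thread: return binlog events from `position` onward."""
        return self.binlog[position:]


@dataclass
class Slave:
    data: dict = field(default_factory=dict)
    relay_log: list = field(default_factory=list)
    applied: int = 0  # number of relay log events already replayed

    def io_thread(self, master):
        """Fetch new binlog events and append them to the relay log."""
        self.relay_log.extend(master.dump(len(self.relay_log)))

    def sql_thread(self):
        """Replay relay log events that have not been applied yet."""
        for key, value in self.relay_log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.relay_log)


master, slave = Master(), Slave()
master.write("user:1", "alice")
master.write("user:2", "bob")
lagging = dict(slave.data)   # snapshot taken before replication ran
slave.io_thread(master)
slave.sql_thread()
```

The `lagging` snapshot is empty because replication is asynchronous: the slave only catches up once its two threads have run.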


Purpose
Achieve multi-point backup of data (without failover or load balancing).


Advantages
Compared to a single MySQL, the advantages are:

  • Writing to the master and reading from the slaves separates reads from writes, and multiple slaves can share the read load.
  • You can temporarily pause replication on one slave to take a backup without affecting the master's external service (backing up on the master itself requires putting it into a read-only state, which means write requests have to be blocked).
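
As a rough illustration of read/write separation, the sketch below routes statements by their leading verb. The `ReadWriteRouter` class, the host names, and the list of write verbs are assumptions made for this example; a real deployment would more likely rely on a proxy or on driver support.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the master and spread reads over the slaves."""

    WRITE_VERBS = ("insert", "update", "delete", "replace")

    def __init__(self, master, slaves):
        self.master = master
        # Round-robin over the slaves for simple load balancing.
        self._slaves = itertools.cycle(slaves)

    def route(self, sql):
        verb = sql.lstrip().split(None, 1)[0].lower()
        if verb in self.WRITE_VERBS:
            return self.master
        return next(self._slaves)


router = ReadWriteRouter("master:3306", ["slave1:3306", "slave2:3306"])
targets = [router.route(q) for q in (
    "INSERT INTO t VALUES (1)",
    "SELECT * FROM t",
    "SELECT * FROM t",
    "UPDATE t SET a = 2",
)]
```

Here `targets` alternates between the two slaves for the reads and always names the master for the writes.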

Compared with each cluster solution, the advantages are:

  • The master-slave replication comes with MySQL, without the need for a third party.
  • Data that has been deleted can be recovered from the binlog.
  • The configuration is relatively simple and convenient.

Disadvantages

  • The slave has to fetch data from the binlog and replay it, so there is a delay after data is written to the master, and the slave's data always lags behind the master's.
  • The network latency between master and slave must be low. High latency increases the lag described in the first point and leaves the slave's data inconsistent with the master's.
  • If the single master node goes down, the system can no longer serve writes.


Application scenarios
Everyday workloads in which read operations outnumber write operations.


Building and implementation
Install MySQL. Once the installation is complete, edit the configuration file as follows.

1) Configure the master

a) Enable log-bin and set server-id=101 in /etc/mysql/mysql.conf.d/mysqld.cnf.

$ sudo vim /etc/mysql/my.cnf   # the configuration file location may differ between versions

Add server-id, enable log-bin, and set bind-address=0.0.0.0.

b) Then restart MySQL: sudo /etc/init.d/mysql restart
c) Log in to MySQL and create a replication user that the slave will use to synchronize data.

mysql> create user 'replication'@'%' identified by '123456';
mysql> grant replication slave on *.* to 'replication'@'%';
mysql> show master status;

In the output, File (here mysql-bin.000001) is the binlog file the slave synchronizes from, and Position is the offset within that file.

2) Configure the slave

The slave's settings must correspond to the master's configuration.
a) On the slave, log-bin does not need to be enabled, but server-id must be set (here server-id=100) and must differ from the master's.
b) Then restart MySQL: sudo /etc/init.d/mysql restart
c) Set master_host, master_port, master_user, master_password, master_log_file, and master_log_pos.

mysql > change master to master_host='192.168.189.133', master_port=3306, master_user='replication', master_password='123456', master_log_file='mysql-bin.000001', master_log_pos=380;

d) Start the slave: mysql> start slave;
e) View the status of the slave: mysql> show slave status\G

If both Slave_IO_Running and Slave_SQL_Running show Yes in the output, the master and slave have started synchronizing.
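
This check can be scripted. The sketch below parses the text printed by `show slave status\G` and tests the two thread fields; the `replication_running` helper is hypothetical and the `sample` string is a hand-written, heavily abbreviated excerpt, not captured output.

```python
def replication_running(status_text):
    """Return True if both replication threads report Yes."""
    fields = {}
    for line in status_text.splitlines():
        if ":" not in line:
            continue  # skip the "*** 1. row ***" separator and blank lines
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return (fields.get("Slave_IO_Running") == "Yes"
            and fields.get("Slave_SQL_Running") == "Yes")


sample = """
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.189.133
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
"""
healthy = replication_running(sample)
```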

2. One-master, multiple-slave-MySQL Fabric
Motivating question: in one-master, multiple-slave mode, what happens if the master fails?

Principle
The structure is still one master with multiple slaves. On top of MySQL Replication, Fabric adds failure detection and failover as well as automatic data sharding.
MySQL Fabric has only one master node at a time. The difference is that when the master goes down, one of the slaves is selected as the new master.
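
The selection step can be pictured with the sketch below. The text does not say how the new master is chosen, so the rule used here (promote the healthy slave that has replicated the furthest) is an assumption, and `promote_new_master` and the node table are hypothetical names.

```python
def promote_new_master(nodes, failed_master):
    """Pick a replacement for `failed_master` among the healthy slaves.

    `nodes` maps node name -> (is_alive, replicated_position).
    Assumption: the slave with the highest replicated position wins.
    """
    candidates = {
        name: position
        for name, (alive, position) in nodes.items()
        if alive and name != failed_master
    }
    if not candidates:
        raise RuntimeError("no healthy slave available for promotion")
    return max(candidates, key=candidates.get)


nodes = {
    "master": (False, 1200),   # the failed master
    "slave1": (True, 1150),
    "slave2": (True, 1190),    # most up to date among healthy slaves
    "slave3": (False, 1200),   # also down, so not eligible
}
new_master = promote_new_master(nodes, "master")
```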


Advantages
Compared with each cluster solution, the advantages are:

  • The master-slave replication comes with MySQL, without the need for a third party.
  • Data that has been deleted can be recovered from the binlog.
  • If the master node fails, one of the slave nodes is automatically promoted to master, so the write service continues without interruption.

Disadvantages

  • The slave has to fetch data from the binlog and replay it, so there is a delay after data is written to the master, and the slave's data always lags behind the master's.
  • The network latency between master and slave must be low. High latency increases the lag described in the first point and leaves the slave's data inconsistent with the master's.
  • MySQL Fabric was released in May 2014. It is a relatively young product with few production case studies and little material available online.
  • Transactions and queries are supported only within a single shard: a transaction cannot update data across shards, and a query cannot return data from more than one shard.
  • Recovery from a node failure takes 30 seconds or more (this is the case for the InnoDB storage engine).
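
The single-shard restriction can be expressed as a simple check. In this sketch the modulo mapping in `shard_of` is only an assumption chosen for brevity, and both function names are hypothetical.

```python
def shard_of(key, shard_count):
    """Map an integer sharding key to a shard number (assumed: modulo)."""
    return key % shard_count


def single_shard(keys, shard_count):
    """Return True if every key touched by a transaction is on one shard."""
    return len({shard_of(k, shard_count) for k in keys}) <= 1


allowed = single_shard([4, 8, 12], shard_count=4)   # all keys map to shard 0
rejected = single_shard([4, 5], shard_count=4)      # keys map to shards 0 and 1
```

A transaction for which `single_shard` returns False would have to be split by the application.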

3. Cluster mode-MySQL Cluster
Motivating question: if the master goes down before some data has been transmitted to the slave, what happens to that data?


Principle
This mode is a multi-master, multi-slave structure. MySQL is divided into two layers. The upper layer is the MySQL server shell (each instance with its own IP address), which handles networking, SQL parsing, optimization, and so on. The lower layer is the storage engine.
MySQL Cluster adds the NDB storage engine as an extra layer.
A cluster may span multiple machines, each running an NDB node. Each node manages one segment of the data and the nodes synchronize among themselves, so there is no master-slave relationship in the usual sense.
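
One way to picture "each node manages a segment of the data" is hash partitioning over groups of nodes that hold identical copies. The sketch below is an assumption-laden illustration: the CRC32 hash, the `place_row` function, and the node names are hypothetical and are not NDB's actual algorithm.

```python
import zlib


def place_row(primary_key, node_groups):
    """Return the data nodes that store the row with `primary_key`.

    `node_groups` is a list of node groups; every node in a group holds
    a copy of the same partition. The hash function is an assumption.
    """
    partition = zlib.crc32(str(primary_key).encode()) % len(node_groups)
    return node_groups[partition]


node_groups = [["ndb1", "ndb2"], ["ndb3", "ndb4"]]
placement = {pk: place_row(pk, node_groups) for pk in range(1, 7)}
```

Every row lands on exactly one group, and each group keeps two copies, so losing a single node does not lose data.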


Advantages
Compared with each cluster solution, the advantages are:

  • MySQL Cluster is an official MySQL product, so no third-party software is needed.
  • Excellent high availability, up to 99.999%. Data is split automatically and stored redundantly across nodes: the data set is not kept on one specific MySQL instance but is distributed over multiple Data Nodes, so the data of a single table may be scattered across several physical nodes, and every piece of data has redundant copies on multiple Data Nodes. Any data change is synchronized within a group of Data Nodes to keep the data consistent.
  • Excellent scalability: data is segmented automatically, which makes horizontal scaling of the database easy.
  • Excellent load balancing: because data is segmented automatically, load is spread across the nodes as the database scales out.
  • With multiple master nodes there is no single point of failure, and recovery from a node failure usually takes less than 1 second.
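
To put the 99.999% figure in perspective, the short calculation below converts an availability percentage into the maximum downtime per year. The function name is hypothetical and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(availability_percent):
    """Maximum yearly downtime for a given availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


five_nines = allowed_downtime_minutes(99.999)   # about 5.26 minutes
three_nines = allowed_downtime_minutes(99.9)    # about 525.6 minutes
```

In other words, "five nines" leaves room for only a little over five minutes of downtime in a whole year.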

Disadvantages

  • The architecture and its underlying principles are complex.
  • Only the NDB storage engine can be used, and it differs noticeably from the familiar InnoDB. For transactions, the only supported isolation level is Read Committed, meaning changes made inside a transaction cannot be seen by queries until the transaction commits. For foreign keys, recent versions of NDB support them but with performance problems, because the records a foreign key refers to may live on other shard nodes. There are also differences in table restrictions. All of these can cause surprises in day-to-day development.
  • As a distributed database system, it generates a large amount of communication between nodes. For example, every access has to pass through more than one node (at least one SQL Node and one NDB Node), so the internal network between nodes needs high bandwidth.
  • Data Nodes keep as much data in memory as possible, which requires a lot of memory, and after a restart a Data Node takes a long time to load its data back into memory.

III. Third-party implementation

1. MMM-Dual-Master-Multi-Slave Mode
MMM (Multi-Master Replication Manager for MySQL) is a dual-master, multi-slave structure built on top of MySQL Replication. It is an open source project that was hosted on Google Code. It extends MySQL Replication with Perl scripts that support dual-master failover and day-to-day dual-master management, and it is mainly used to monitor MySQL master-master replication and perform failover.
MMM is a flexible set of scripts for monitoring, failover, and managing the configuration of MySQL master-master replication (only one node is writable at any time).
The two masters share a virtual IP. External clients only talk to the virtual IP and do not need to know which master is actually serving them.

The main functions of MMM are realized through three scripts:
1) mmm_mond
The monitoring process. It is responsible for all monitoring work and for determining and handling the role changes of all nodes.

2) mmm_agentd
An agent process that runs on each MySQL server, carries out the monitoring probes, and performs simple remote service settings.

3) mmm_control
A simple script that provides commands for managing the mmm_mond process.

Note: Although this setup is called dual-master replication, the application may write to only one master at a time. The standby master serves some of the read traffic, which keeps it warmed up and speeds up a master-master switch.
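
The interplay between the monitor and the shared virtual IP can be modeled as below. This is a toy sketch, not MMM's implementation: the `MmmMonitor` class, its methods, and the address 192.168.0.100 are all made up for illustration.

```python
class MmmMonitor:
    """Toy model of a monitor that owns the writer virtual IP."""

    def __init__(self, masters, writer_vip):
        self.masters = list(masters)       # e.g. ["master1", "master2"]
        self.writer_vip = writer_vip
        self.active = self.masters[0]      # only one writable master

    def check(self, health):
        """Move the writer VIP if the active master is reported down.

        `health` maps master name -> True (up) or False (down).
        """
        if health.get(self.active, False):
            return self.active
        for candidate in self.masters:
            if candidate != self.active and health.get(candidate, False):
                self.active = candidate
                break
        return self.active

    def resolve(self, ip):
        """Clients connect to the VIP; return the master behind it."""
        return self.active if ip == self.writer_vip else None


monitor = MmmMonitor(["master1", "master2"], writer_vip="192.168.0.100")
before = monitor.resolve("192.168.0.100")
monitor.check({"master1": False, "master2": True})
after = monitor.resolve("192.168.0.100")
```

Clients keep using the same address throughout; only the master behind it changes.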


Advantages
Compared with each cluster solution, the advantages are:

  • Automatic master-to-master failover, generally switching to the standby within 3 seconds.
  • Load balancing of multiple slave nodes.

Disadvantages

  • Data consistency cannot be fully guaranteed. Suppose master 1 fails and the MMM monitor switches to master 2 while master 2 is still behind master 1 in the dual-master replication (that is, it has not finished copying). Master 2 then starts serving writes as the new master with data missing, which results in inconsistency.
  • Because it relies on a floating virtual IP, as Keepalived does, the RIP (real IP) and the VIP (virtual IP) must be in the same network segment. Different segments are possible but require virtual routing. Either way, all nodes must be in the same data center (IDC); a cluster cannot span data centers.
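
Whether a real IP and a virtual IP share a network segment can be checked with Python's standard `ipaddress` module. The `same_segment` helper and the example addresses other than 192.168.189.133 are hypothetical.

```python
import ipaddress


def same_segment(real_ip, virtual_ip, prefix_length):
    """Return True if both addresses fall in the same subnet."""
    # strict=False lets us pass a host address instead of a network address.
    network = ipaddress.ip_network(f"{real_ip}/{prefix_length}", strict=False)
    return ipaddress.ip_address(virtual_ip) in network


ok = same_segment("192.168.189.133", "192.168.189.200", 24)
not_ok = same_segment("192.168.189.133", "192.168.190.200", 24)
```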

2. MHA

MHA (Master High Availability) is a multi-master and multi-slave structure built on top of MySQL Replication. It was developed by Yoshinori Matsunobu at DeNA in Japan. It mainly provides additional master nodes, but it has no built-in VIP (virtual IP) handling and needs to be combined with a tool such as keepalived.

To build MHA, a replication cluster needs at least 3 database servers in a one-master, two-slave layout: one serves as the master, one as the standby master, and the third as a slave.


Advantages
Compared with each cluster solution, the advantages are:

  • Detects failures and fails over automatically.
  • Compensates for missing data automatically, so data consistency is preserved as far as possible when the master crashes unexpectedly.
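
Data compensation can be pictured as copying the events a lagging slave is missing from the slave that received the most. The sketch below is a simplified model under stated assumptions: event lists stand in for relay logs, every list is assumed to be a prefix of the longest one, and `compensate` is a hypothetical name.

```python
def compensate(slaves):
    """Bring every slave up to the most advanced slave's event list.

    `slaves` maps slave name -> list of events it has received.
    Assumption: logs are prefixes of one another, as with a single master.
    """
    latest = max(slaves.values(), key=len)
    for events in slaves.values():
        missing = latest[len(events):]   # events this slave never received
        events.extend(missing)
    return latest


slaves = {
    "slave1": ["e1", "e2", "e3", "e4"],   # most up to date
    "slave2": ["e1", "e2"],
    "slave3": ["e1", "e2", "e3"],
}
compensate(slaves)
```

After the call all three slaves hold the same four events, so any of them can be promoted without losing what the others had already received.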

Disadvantages

  • The MHA architecture relies on read/write separation. The best practice is to plan for it during application design and to use two connection pools, one for reads and one for writes. A compromise is to introduce a SQL proxy. Either way, the application code has to change.
  • Read load balancing requires an additional tool such as F5, LVS, HAProxy, or a SQL proxy. Any tool will do as long as it provides load balancing, health checking, and separation of reads from writes after a standby is promoted; LVS is recommended.

3. Galera Cluster
Galera Cluster is a multi-master MySQL cluster developed by Codership. Every master node also acts as a slave of the other nodes. Unlike MySQL's native asynchronous master-slave replication, Galera uses multi-master synchronous replication and is optimized for the transaction conflicts and deadlocks that synchronous replication can produce. Replication is based not on the official binlog mechanism but on the Galera replication plugin, which implements the wsrep API.

In asynchronous replication, the master commits a transaction as soon as it has sent the data changes to the slaves, regardless of whether the slaves have successfully read or replayed them. For a short period after the master commits, the master and the slaves therefore hold different data. In synchronous replication, a single update transaction on the master has to be applied on all slaves as well. In other words, by the time the master commits the transaction, the data on all nodes in the cluster is consistent.

For reads, every node returns the same data. For writes, data written to one node is synchronized by the cluster to all other nodes.
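
The difference from asynchronous replication can be sketched with a toy cluster in which a commit either reaches every node or is rejected. The class, the version counter, and the conflict rule below are simplifying assumptions for illustration and do not reproduce Galera's actual protocol.

```python
class GaleraLikeCluster:
    """Toy multi-master cluster: a commit reaches every node or none."""

    def __init__(self, node_names):
        self.nodes = {name: {} for name in node_names}
        self.versions = {}   # key -> version of the last committed write

    def read_version(self, key):
        return self.versions.get(key, 0)

    def commit(self, node, key, value, seen_version):
        """Commit a write received on `node`.

        The write is rejected if another transaction changed `key`
        after this one read it (a simplified conflict check).
        """
        if node not in self.nodes:
            raise KeyError(node)
        if self.versions.get(key, 0) != seen_version:
            return False                      # conflict: roll back
        for data in self.nodes.values():      # apply on all nodes
            data[key] = value
        self.versions[key] = seen_version + 1
        return True


cluster = GaleraLikeCluster(["node1", "node2", "node3"])
v = cluster.read_version("stock")
first = cluster.commit("node1", "stock", 10, seen_version=v)
second = cluster.commit("node2", "stock", 7, seen_version=v)  # stale read
```

The first write succeeds and is visible on all three nodes; the second, which was based on a stale read, is rejected rather than applied on only some nodes.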


Advantages
Compared with each cluster solution, the advantages are:

  • With multiple active masters, reads and writes can be performed on any node. If a node goes down, reads and writes on the other nodes are unaffected, no failover is needed, and the cluster's external service is not interrupted.
  • Excellent scalability: a new node automatically pulls data from the online nodes (when a node joins, the cluster selects a Donor Node to supply its data), so all nodes end up with the same data without manual backup and restore.

Disadvantages

  • Strong data consistency comes at the expense of performance.

IV. Synchronizing MySQL to Redis

  1. Synchronization method 1
    Implemented in the business code: first write the data to MySQL, then write it to Redis.
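	void write_to_mysql(void);	/* provided by the application */
	void write_to_redis(void);	/* provided by the application */

	void func(void)
	{
		write_to_mysql();	/* 1. write the data to MySQL first */
		write_to_redis();	/* 2. then write the same data to Redis */
	}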
  2. Synchronization method 2
    The application writes only to MySQL, and MySQL synchronizes the data to Redis.
    Use a MySQL UDF: set triggers on the table so that when rows are added, deleted, or changed, a callback fires and invokes the user-defined function (UDF). This approach has few restrictions, but its performance is poor.

  3. Synchronization method 3
    Synchronize through the MySQL binlog (the same mechanism as master-slave replication).
    Build a middleware that behaves like a slave, receives the binlog, and writes the data to Redis.
    Components: mysqlslave, redisclient
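
The binlog-based method can be sketched as a function that applies row-change events to a key-value store. Everything here is an assumption for illustration: the event dictionaries are a made-up shape rather than a real binlog format, and a plain dict stands in for the Redis client.

```python
def apply_row_event(cache, event):
    """Apply one row-change event to a key-value cache.

    `event` is a dict with "type", "table", "pk", and "row" keys; this
    event shape is hypothetical, not a real binlog format.
    """
    key = f'{event["table"]}:{event["pk"]}'
    if event["type"] in ("insert", "update"):
        cache[key] = event["row"]
    elif event["type"] == "delete":
        cache.pop(key, None)
    else:
        raise ValueError(f'unknown event type: {event["type"]}')


redis_like = {}   # a plain dict standing in for a Redis client
events = [
    {"type": "insert", "table": "user", "pk": 1, "row": {"name": "alice"}},
    {"type": "insert", "table": "user", "pk": 2, "row": {"name": "bob"}},
    {"type": "update", "table": "user", "pk": 1, "row": {"name": "alicia"}},
    {"type": "delete", "table": "user", "pk": 2, "row": None},
]
for event in events:
    apply_row_event(redis_like, event)
```

Because the events are applied in binlog order, the cache ends up reflecting the final state of each row, and the application itself never writes to Redis.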


Origin blog.csdn.net/locahuang/article/details/110395476