Redis distributed and MySQL distributed / cluster

 

Distributed and cluster

Few projects today use a stand-alone architecture, for two reasons: the performance of a single machine is limited, and once that single service fails, the entire system can no longer provide service. Cluster and distributed architectures are widely used because they address these two problems: a distributed architecture addresses the performance (high concurrency) problem, and a cluster architecture addresses the failure (high availability) problem.

Distributed architecture

Distributed: a service is split into multiple sub-services that are deployed on different servers

Many articles on the Internet make distributed architecture sound very complicated without getting to the point. It is actually simple to understand. Every computer system exists to serve a business. When one business is split into multiple sub-businesses and each sub-business is deployed on a different server, that is a distributed architecture. Splitting the service lets each sub-business run on a server sized for its needs, which improves the performance of the whole system. I personally think that the currently popular concept of microservices is distributed in nature.

By type, it can be roughly divided into two kinds: distributed computing and distributed storage.

Distributed computing is easy to understand: a large number of computing tasks are distributed to multiple computing units to improve overall computing performance. For example, suppose brute-forcing a password requires trying 100,000 character combinations, and one computer needs 10 minutes to do so. If 10 computers work at the same time, each trying 10,000 combinations, and the results are collected at the end, the job takes only about 1 minute. Those 10 computers together form a distributed computing system, and the business here is computing.
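The password example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "password" is assumed to be a 5-digit string (which gives exactly 100,000 combinations), the function names are made up for this sketch, and threads stand in for the 10 computers. In CPython, threads will not actually speed up CPU-bound work; on real hardware each chunk would go to a separate machine or process.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def split_range(total, workers):
    """Split range(total) into `workers` contiguous, near-equal chunks."""
    base, extra = divmod(total, workers)
    chunks, start = [], 0
    for i in range(workers):
        size = base + (1 if i < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


def search_chunk(chunk, target_hash):
    """One worker: try every candidate in its chunk, return the match or None."""
    for n in chunk:
        candidate = f"{n:05d}"  # assumption: passwords are 5-digit strings
        if hashlib.sha256(candidate.encode()).hexdigest() == target_hash:
            return candidate
    return None


def distributed_search(target_hash, total=100_000, workers=10):
    """Hand one chunk to each worker, then combine the partial results."""
    chunks = split_range(total, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda c: search_chunk(c, target_hash), chunks))
    # At most one worker finds the password; the rest return None.
    return next((r for r in results if r is not None), None)
```

With the defaults, each of the 10 workers receives 10,000 candidates, which matches the numbers in the example above.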

Similarly, distributed storage is easy to understand: a large amount of data is distributed to multiple storage units to increase total storage capacity. For example, if 100 ZB of data cannot fit into one storage unit, it is split into 100 parts and each storage unit stores 1 part. Those 100 storage units together form a distributed storage system, and the business here is storage. Mainstream relational databases have relatively mature distributed storage solutions, such as MySQL Fabric and MyCat for MySQL, and Oracle Sharding for Oracle Database. Redis is a popular non-relational database. Because it is an in-memory database, it generally does not hold very large amounts of data, but in some special cases storage space does run out, or running out must be prevented in advance. That is when a distributed Redis architecture is needed.

For example, a group has many subsidiaries, each with multiple IT systems, many of which require Redis. For unified management, the group builds a central Redis system and requires the IT systems of all subsidiaries to use it. In this case, even if the current storage capacity is sufficient, distributed storage should be used to cope with later growth, because a distributed architecture theoretically supports unlimited horizontal expansion.
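The idea of spreading keys over several storage units can be sketched as follows. This is a toy model and not how any particular product works: the class name `ShardedStore` is hypothetical, each "storage unit" is just an in-process dictionary, and keys are placed by CRC32 modulo the number of units. Redis Cluster, for comparison, maps keys to 16384 hash slots using CRC16.

```python
import zlib


class ShardedStore:
    """Toy distributed store: each key lives on exactly one of N storage units."""

    def __init__(self, num_shards):
        # Each dict stands in for one storage unit (one server).
        self.shards = [dict() for _ in range(num_shards)]

    def _shard_for(self, key):
        # CRC32 is stable across runs, unlike Python's built-in hash() for str.
        return zlib.crc32(key.encode("utf-8")) % len(self.shards)

    def set(self, key, value):
        self.shards[self._shard_for(key)][key] = value

    def get(self, key, default=None):
        return self.shards[self._shard_for(key)].get(key, default)


store = ShardedStore(num_shards=4)
store.set("user:42", "alice")
print(store.get("user:42"))
```

One design caveat: with plain modulo placement, changing the number of units moves most keys to a different unit. Slot-based or consistent-hashing schemes exist largely to make that kind of expansion cheaper.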


Cluster architecture

Cluster: the same service, deployed on multiple servers

A cluster is also easy to understand: the same service is deployed on multiple servers, which serves two purposes:

  1. Spread the load across the servers
  2. Keep the whole system running when one or several servers go down
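Both purposes can be illustrated with a small round-robin dispatcher. This is a minimal sketch under assumptions: the `Cluster` class and its method names are invented for illustration, and health is tracked by explicit `mark_down` / `mark_up` calls rather than by real health checks.

```python
import itertools


class Cluster:
    """Round-robin over identical servers, skipping the ones marked down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._cycle = itertools.cycle(self.servers)

    def mark_down(self, server):
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        """Return the next healthy server, spreading requests evenly."""
        for _ in range(len(self.servers)):
            server = next(self._cycle)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy server available")


cluster = Cluster(["app-1", "app-2", "app-3"])
cluster.mark_down("app-2")
print([cluster.pick() for _ in range(4)])
```

Requests keep flowing to the remaining servers after `app-2` is marked down; only when every server is down does `pick` fail.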

Redis single node deployment

The steps are as follows:

# Download the Redis source package
wget http://download.redis.io/releases/redis-5.0.4.tar.gz
# Unpack the archive
tar -zxvf redis-5.0.4.tar.gz
# Enter the unpacked directory and compile
cd redis-5.0.4
make
# Run the test suite, then install
cd src
make test
make install

Start and check

# Go to the Redis installation directory and edit redis.conf
vim /usr/redis/redis-5.0.4/redis.conf
# Change "daemonize no" to "daemonize yes" so that the Redis service runs in the background

# Start the Redis service with the specified configuration file
/usr/local/bin/redis-server /usr/redis/redis-5.0.4/redis.conf
# Check that Redis is running
ps -ef | grep redis

# Output
app 21794 1 0 Jan28 ? 03:31:25 ./redis-server *:6379
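Seeing the process in `ps` shows that Redis started, not that it is answering requests. As a hedged sketch, the check below opens a TCP connection and sends a PING. The function name `redis_ping` is made up for this example, and it assumes the default port 6379 on the local machine with no password configured (a server that requires authentication will reply with an error instead of PONG).

```python
import socket


def redis_ping(host="127.0.0.1", port=6379, timeout=1.0):
    """Return True if a server at host:port answers PING with +PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")  # Redis accepts this inline command form
            reply = conn.recv(64)
    except OSError:
        # Connection refused, timeout, unreachable host, and so on.
        return False
    return reply.startswith(b"+PONG")


if __name__ == "__main__":
    print("Redis is up" if redis_ping() else "Redis is not responding")
```

The same check can be done from the shell with `redis-cli ping`, which prints PONG when the server is healthy.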

 

Redis high availability

Redis Sentinel is the high availability (HA) solution officially recommended by Redis, and it is the most practical and commonly used solution in production environments.

This brings in another concept: master-slave mode. It is the familiar active / standby pattern, similar to running nginx in active / standby mode. A master Redis node can be configured with multiple slave nodes. When the master node goes down, a slave node automatically takes over as master. This avoids the situation where the failure of a single node brings down the entire system, and so achieves high availability for the Redis service.

But this solution needs to solve two basic problems:

  1. How to judge the running health of each node (especially the master node) in advance?
  2. When the master node is down, how to select one from multiple slave nodes as the new master node and realize automatic switching?

Redis Sentinel was created to solve these problems. It has three main features:

  • Monitoring: Sentinel constantly checks whether the master and slave servers are working properly.
  • Notification: When a monitored Redis server has a problem, Sentinel can notify administrators or other applications through its API.
  • Automatic failover: When a master server stops working properly, Sentinel starts an automatic failover. It promotes one of the failed master's slaves to be the new master and reconfigures the failed master's other slaves to replicate from the new master. When a client tries to connect to the failed master, it is given the address of the new master, so that the new master takes the place of the failed one.
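The two basic problems (deciding that the master is down, and choosing its replacement) can be sketched as a small simulation. This is a simplified model and not Sentinel's actual code: all names are hypothetical, and it leaves out parts of the real process such as leader election among the Sentinels and filtering out slaves that have been disconnected for too long. The ordering used to choose the new master (priority first, then replication offset, then run ID) follows the rules described in the Sentinel documentation.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    priority: int      # lower is preferred; 0 means "never promote"
    repl_offset: int   # how much of the master's data stream was received
    run_id: str


def master_is_down(votes, quorum):
    """votes: one bool per Sentinel, True if it sees the master as down."""
    return sum(votes) >= quorum


def choose_new_master(replicas):
    """Pick the best candidate: lowest priority, then most data, then run ID."""
    candidates = [r for r in replicas if r.priority > 0]
    if not candidates:
        return None
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


def failover(replicas, votes, quorum):
    """Return (new_master, names of slaves that must now follow it)."""
    if not master_is_down(votes, quorum):
        return None, []
    new_master = choose_new_master(replicas)
    if new_master is None:
        return None, []
    followers = [r.name for r in replicas if r is not new_master]
    return new_master, followers
```

For example, with three Sentinels and a quorum of 2, two "down" votes are enough to trigger the failover, and the slave with the most replicated data among equal-priority candidates is promoted.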

MySQL distributed cluster

1. Why use a distributed database

    With the rapid development and spread of computer and information technology, industry application systems have grown rapidly in scale, and the data they generate has exploded to hundreds of terabytes or even hundreds of petabytes, far beyond the processing power of traditional computing technology and existing information systems. Centralized databases increasingly show their limitations when faced with data processing at this scale. People therefore need a way to process data quickly and respond to user access in a timely manner, while still analyzing, managing, and maintaining the data centrally. This has become an urgent real-world need.

    The distributed database developed out of the centralized database and is the product of combining computer technology with network technology. It is a database system in which data is physically distributed and logically centralized. Physically distributed means that the data is spread across nodes or sites that are in different physical locations and connected by a network. Logically centralized means that the databases form one logical whole managed by a unified database management system. The nodes can span different computer rooms, cities, and even countries.

    The main features of the distributed database are as follows:

  1. Transparency: Users do not need to care about how the data is logically partitioned, where it is physically located, or how duplicate copies (redundant data) are kept consistent, nor about which data model the database at a local site supports. For developers, moving data from one site to another does not require rewriting the application; the database is used just like a centralized one.
  2. Data redundancy: A distributed database uses redundancy to achieve reliability and availability and to improve performance. Multiple nodes store copies of the same data. When the data on one node is damaged, a redundant copy preserves data integrity; when a working node fails, mechanisms such as heartbeat detection switch over to another node so that the system as a whole keeps running. Serving hotspot data from a nearby copy also reduces network traffic, speeds up access, and improves performance.
  3. Easy to expand: A distributed database can improve overall system performance through horizontal expansion, and can also improve performance through vertical expansion. Expansion does not require modifying the application.
  4. Autonomy: The data on each node is managed by a local DBMS that can process requests on its own to serve that site's local applications.

2. Principle of MySQL Cluster

    MySQL Cluster is a highly available, highly redundant version of MySQL suited to distributed computing environments. It uses the NDB Cluster storage engine, which allows multiple MySQL servers to run in one cluster. The storage engine is available in binary releases of MySQL 5.0 and above, and in RPM packages compatible with recent Linux distributions.

    MySQL Cluster is a technology for deploying "in-memory" and "on-disk" database clusters on shared-nothing systems. With a shared-nothing architecture, the system can run on inexpensive hardware and has no special hardware or software requirements. In addition, because each component has its own memory and disk, there is no single point of failure. A MySQL Cluster consists of a group of computers, each running one or more processes, including MySQL servers, NDB Cluster data nodes, management servers, and possibly specialized data access programs.

The management server (MGM node) manages the Cluster configuration files and the Cluster log. Each node in the cluster retrieves its configuration data from the management server.

When a new event occurs on a data node, the node sends information about the event to the management server, which writes it to the Cluster log.

MySQL Cluster runs on Linux, Mac OS X, and Solaris, and the latest versions also support Windows.

Communication between the data nodes of a MySQL Cluster is not encrypted and requires high bandwidth, so the cluster should be built on a high-speed local area network. Deploying it across network segments or across public networks is not recommended.

 

MySQL Cluster has three types of nodes: management nodes, data nodes, and SQL nodes.

Management node: Manages the other nodes. It can restart, shut down, and start a given node through commands, and it can monitor the working status of all nodes.

Data node: Stores the data and provides no other service.

SQL node: Provides SQL access to the outside, much like an ordinary MySQL Server.

An SQL node and a data node can run on the same machine, which is then both an SQL node and a data node; the node types are only a logical division. In an actual deployment, all nodes can even be placed on the same physical machine, although the configuration is then more complicated.
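To make the three node types concrete, the sketch below generates the skeleton of a management-server configuration file (conventionally named config.ini). It is an illustration only: the function name and the host addresses are hypothetical, and a real configuration needs more settings (data directories, memory sizes, node IDs) than are shown here. The section names `[ndbd default]`, `[ndb_mgmd]`, `[ndbd]`, and `[mysqld]` and the `NoOfReplicas` and `HostName` parameters are the ones NDB Cluster uses.

```python
def build_cluster_config(mgm_host, data_hosts, sql_hosts, replicas=2):
    """Return the text of a minimal config.ini skeleton for the given hosts."""
    if len(data_hosts) % replicas != 0:
        # Data nodes are organized into groups of `replicas` nodes each.
        raise ValueError("number of data nodes must be a multiple of replicas")

    lines = ["[ndbd default]", f"NoOfReplicas={replicas}", ""]
    lines += ["[ndb_mgmd]", f"HostName={mgm_host}", ""]   # management node
    for host in data_hosts:
        lines += ["[ndbd]", f"HostName={host}", ""]       # data node
    for host in sql_hosts:
        lines += ["[mysqld]", f"HostName={host}", ""]     # SQL node
    return "\n".join(lines)


# Hypothetical addresses; the second machine is both a data node and an SQL node.
print(build_cluster_config(
    mgm_host="192.168.0.10",
    data_hosts=["192.168.0.20", "192.168.0.21"],
    sql_hosts=["192.168.0.21"],
))
```

Listing the same host under both `[ndbd]` and `[mysqld]` corresponds to the case described above, where one machine acts as both a data node and an SQL node.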

 

3. The advantages and disadvantages of MySQL Cluster

Advantages:

  1. 99.999% high availability
  2. Fast automatic failover
  3. Flexible distributed architecture, no single point of failure
  4. High throughput and low latency
  5. Strong scalability, support online expansion

Disadvantages:

  1. There are many restrictions. For example, older releases do not support foreign keys (support was added in MySQL Cluster 7.3), and a data row cannot exceed 8K (excluding data in BLOB and TEXT columns)
  2. Complex deployment, management, and configuration
  3. Takes up a lot of disk space and memory
  4. Backup and restore are inconvenient
  5. After a restart, it takes a long time for a data node to load its data into memory

 

4. Domestic application of MySQL Cluster

The China Academy of Electronic Science used MySQL Cluster to build a highly available test system.

 


 

 
 


Origin www.cnblogs.com/double-W/p/12723951.html