Redis cluster (Cluster and Codis)

1. Why use a cluster?

        When the volume of data to store grows large, there are two solutions:

        Vertical expansion: upgrade the configuration of a single Redis instance, including more memory, more disk capacity, and a more powerful CPU.

        Vertical expansion is simple and straightforward, but it has problems. First, when RDB is used to persist data, a larger data set means more memory in use, and the main thread may block while it forks the child process. Second, it is limited by hardware and cost.

        Horizontal expansion: increase the number of Redis instances and spread the data across them, so that each instance holds less.

        In practice, when faced with tens or even hundreds of millions of requests, many large companies perform traffic scheduling and service management on clusters composed of thousands of instance nodes, so cluster mode is widely used in the industry.

Let's briefly talk about horizontal expansion, that is, Redis Cluster and Codis.

When we use cluster mode, three questions come to mind first:

        1. How is data distributed among multiple instances?

        2. How does the client determine which instance holds the data it wants to access?

        3. When hash slots are reallocated, how are requests handled?

2. Redis Cluster

        2.1 Data distribution among multiple instances

         Simply put, Redis Cluster uses hash slots (slots for short) to map data to instances. A cluster has a total of 16384 slots (why this number is discussed at the end of this section), and each key-value pair is mapped to a slot according to its key: a 16-bit value is computed from the key with the CRC16 algorithm and then taken modulo 16384 to determine the slot. Each node is responsible for a subset of the slots and for the key-value data mapped to them.

        Formula: slot = CRC16(key) & 16383
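
        As a minimal sketch of the formula above, the slot can be computed in Python. It assumes the CRC16 variant is CRC-16/XMODEM (polynomial 0x1021, initial value 0), which binascii.crc_hqx computes when started from 0. The function name key_slot is chosen here for illustration, and Redis's hash-tag handling is deliberately left out.

```python
import binascii

NUM_SLOTS = 16384  # total hash slots in a Redis Cluster


def key_slot(key: bytes) -> int:
    """Map a key to a slot: CRC16(key) & 16383."""
    # crc_hqx with initial value 0 is CRC-16/XMODEM (polynomial 0x1021).
    crc = binascii.crc_hqx(key, 0)
    return crc & (NUM_SLOTS - 1)


if __name__ == "__main__":
    for k in (b"123456789", b"user:1000", b"order:42"):
        print(k.decode(), "->", key_slot(k))
```

        Because 16384 is a power of two, CRC16(key) & 16383 and CRC16(key) % 16384 give the same result.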

        Why use hash slots rather than consistent hashing? Reference: "Analysis of consistent hash and hash slot" (CSDN blog).

        It is easy to add or remove nodes:

        For example, to add a new node D, some slots are transferred from nodes A, B, and C to D. To remove node A, the slots on A are moved to nodes B and C, and then node A, which no longer holds any slots, is removed from the cluster.

        Since moving a hash slot from one node to another does not require stopping the service, adding or removing nodes, or changing the number of hash slots a node holds, does not make the cluster unavailable.
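
        The example of adding node D can be modelled with a small sketch. This is only an illustration of the bookkeeping, not how Redis performs migration: the functions initial_assignment and add_node are hypothetical, the node names come from the example above, and no key data is moved.

```python
NUM_SLOTS = 16384


def initial_assignment(nodes):
    """Split the slot range into contiguous, nearly equal chunks."""
    assignment = {}
    base, extra = divmod(NUM_SLOTS, len(nodes))
    start = 0
    for i, node in enumerate(nodes):
        count = base + (1 if i < extra else 0)
        assignment[node] = list(range(start, start + count))
        start += count
    return assignment


def add_node(assignment, new_node):
    """Take slots from existing nodes, round-robin, until new_node has its share."""
    target = NUM_SLOTS // (len(assignment) + 1)
    donors = list(assignment)
    moved = []
    i = 0
    while len(moved) < target:
        donor = donors[i % len(donors)]
        if len(assignment[donor]) > target:
            moved.append(assignment[donor].pop())
        i += 1
    assignment[new_node] = sorted(moved)
    return assignment


if __name__ == "__main__":
    cluster = initial_assignment(["A", "B", "C"])
    add_node(cluster, "D")
    for node, slots in cluster.items():
        print(node, len(slots))
```

        Every slot still belongs to exactly one node after the move; only ownership changes, which is why the cluster can keep serving requests while slots are transferred.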

        2.2 Client locates data nodes

        Now that we know how data is mapped to hash slots, how does the client know which instance holds the data it wants to access?

        To locate a key-value pair, the slot it belongs to can be obtained by calculation, and the client can perform this calculation itself when it sends a request. However, the client also needs to know which instance the slot is on.

        Each Redis instance sends its own slot information to the other instances it is connected to, which spreads the slot allocation information through the cluster, so every instance knows the complete slot allocation. When a client establishes a connection with the cluster, an instance sends this slot allocation information to the client.

        After the client receives the information, it caches the hash slot information locally. When the client requests a key-value pair, it first calculates the hash slot corresponding to the key and then sends the request to the instance that owns that slot.
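
        A client-side slot cache along these lines might look like the sketch below. The class name SlotCache, the (start, end, address) range format, and the addresses are assumptions made for illustration; real client libraries keep richer state.

```python
import binascii

NUM_SLOTS = 16384


class SlotCache:
    """Local copy of the slot allocation, as cached by a cluster-aware client."""

    def __init__(self, slot_ranges):
        # slot_ranges: iterable of (first_slot, last_slot, "host:port"), inclusive.
        self.owner = [None] * NUM_SLOTS
        for first, last, addr in slot_ranges:
            for slot in range(first, last + 1):
                self.owner[slot] = addr

    def node_for(self, key: bytes) -> str:
        """Compute the key's slot, then look up the instance that owns it."""
        slot = binascii.crc_hqx(key, 0) & (NUM_SLOTS - 1)
        return self.owner[slot]


if __name__ == "__main__":
    cache = SlotCache([
        (0, 5460, "127.0.0.1:7000"),      # hypothetical addresses
        (5461, 10922, "127.0.0.1:7001"),
        (10923, 16383, "127.0.0.1:7002"),
    ])
    print(cache.node_for(b"123456789"))
```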

        2.3 Data Migration

        The correspondence between instances and hash slots in a cluster is not static. There are two common kinds of change:

        1. When instances are added or removed, Redis needs to reassign the hash slots.

        2. For load balancing, Redis needs to redistribute the hash slots.

        Instances can exchange messages with each other to obtain the latest hash slot allocation, but the client cannot obtain this information in real time, so its cached allocation may become inconsistent with the latest one. What happens then?

        Redis Cluster provides a redirection mechanism: when the client sends a read or write command to an instance that does not hold the corresponding data, the instance replies with the address of the instance that does, and the client has to send the command again to that instance.

        There are two redirection mechanisms: MOVED redirection and ASK redirection.

        MOVED redirection:

  • 1. The nodes communicate with each other to share which node in the cluster each slot belongs to.
  • 2. The client sends a command to any node of the Redis Cluster. The node receiving the command computes CRC16 of the key modulo 16384 to find the slot, and looks up the node that owns it.
  • 3. If the slot is assigned to the current node, the node executes the command and returns the result to the client.
  • 4. If the slot is not managed by the current node, the node returns a MOVED redirection error to the client.
  • 5. The client receives the reply and, if it is a MOVED error, reads the target node's information from it.
  • 6. The client sends the command to the target node and obtains the result.
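
        The steps above can be sketched as a small in-memory simulation. FakeNode and send are hypothetical names, the addresses are made up, and the slot is passed in directly to keep the example short. The error text follows the form MOVED <slot> <host:port>.

```python
class FakeNode:
    """Stand-in for a Redis instance; slot_map is the cluster-wide slot -> address map."""

    def __init__(self, addr, slot_map):
        self.addr = addr
        self.slot_map = slot_map
        self.data = {}

    def execute(self, slot, op, key, value=None):
        owner = self.slot_map[slot]
        if owner != self.addr:
            # The slot is managed by another node: reply with a MOVED error.
            return ("error", f"MOVED {slot} {owner}")
        if op == "SET":
            self.data[key] = value
            return ("ok", "OK")
        return ("ok", self.data.get(key))


def send(nodes, cache, default_addr, slot, op, key, value=None):
    """Send a command, following at most one MOVED redirect."""
    addr = cache.get(slot, default_addr)
    status, payload = nodes[addr].execute(slot, op, key, value)
    if status == "error" and payload.startswith("MOVED "):
        _, moved_slot, new_addr = payload.split()
        cache[int(moved_slot)] = new_addr  # refresh the local slot cache
        status, payload = nodes[new_addr].execute(slot, op, key, value)
    return payload


if __name__ == "__main__":
    slot_map = {3999: "127.0.0.1:6381"}
    nodes = {a: FakeNode(a, slot_map) for a in ("127.0.0.1:6380", "127.0.0.1:6381")}
    cache = {}
    print(send(nodes, cache, "127.0.0.1:6380", 3999, "SET", "k", "v"))
    print(cache)
    print(send(nodes, cache, "127.0.0.1:6380", 3999, "GET", "k"))
```

        After the first redirect the cache points at the new owner, so the second command goes to the right node directly.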

        ASK redirection:

        When the cluster is expanded or shrunk, slots and the data in them need to be migrated. While a slot is being migrated, a client request may reach the node that the client targets after the requested key has already been moved to another node; in that case the targeted node returns an ASK redirection to the client.

  • 1. The client sends a command to a node in the cluster, and the node returns a MOVED error that tells the client which node owns the target slot.
  • 2. The client sends the command to that target node. The slot on the target node is being migrated to another node, so the target node returns an ASK redirection to the client.
  • 3. The client sends an ASKING command to the new target node and then sends the original command to it again.
  • 4. The new target node executes the command and returns the result to the client.

The difference between the two: MOVED means the slot has already been migrated, so the client updates its cached slot mapping; ASK means the slot is still being migrated, so the redirect applies only to the current request and the cache is left unchanged.
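
        The ASK flow can be sketched with a toy model of a slot in the middle of migration. The class Node and its method names are hypothetical. The model assumes the source node still serves keys it has not yet moved, and that the importing node only serves a request for that slot when it is preceded by ASKING.

```python
class Node:
    """Toy node that models only the migration states relevant to ASK."""

    def __init__(self, addr):
        self.addr = addr
        self.data = {}
        self.migrating = {}   # slot -> address the slot is being moved to
        self.importing = {}   # slot -> address the slot is coming from
        self.asking = False   # one-shot flag set by ASKING

    def cmd_asking(self):
        self.asking = True
        return ("ok", "OK")

    def cmd_get(self, slot, key):
        asked, self.asking = self.asking, False
        if slot in self.importing and not asked:
            # Not the owner yet: point the client back at the current owner.
            return ("error", f"MOVED {slot} {self.importing[slot]}")
        if slot in self.migrating and key not in self.data:
            # The key may already live on the migration target.
            return ("error", f"ASK {slot} {self.migrating[slot]}")
        return ("ok", self.data.get(key))


def get(nodes, cache, slot, key):
    """GET that follows one ASK redirect without touching the slot cache."""
    status, payload = nodes[cache[slot]].cmd_get(slot, key)
    if status == "error" and payload.startswith("ASK "):
        _, _, target = payload.split()
        nodes[target].cmd_asking()
        status, payload = nodes[target].cmd_get(slot, key)
    return payload


if __name__ == "__main__":
    src, dst = Node("127.0.0.1:6380"), Node("127.0.0.1:6381")
    src.migrating[3999] = dst.addr
    dst.importing[3999] = src.addr
    src.data["a"] = "1"   # not migrated yet
    dst.data["b"] = "2"   # already migrated
    nodes = {src.addr: src, dst.addr: dst}
    cache = {3999: src.addr}
    print(get(nodes, cache, 3999, "a"), get(nodes, cache, 3999, "b"), cache)
```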

Think:

Why 16384 slots? The author's answer: https://github.com/antirez/redis/issues/2576

1. With 65536 slots, the header of each heartbeat message would be 8 KB, which makes the heartbeat packet too large.

        In the message header, the most space-consuming field is myslots[CLUSTER_SLOTS/8]. With 65536 slots, this field takes 65536 ÷ 8 = 8192 bytes = 8 KB. Every second, a Redis node sends a certain number of ping messages as heartbeat packets, so with 65536 slots the ping header would be too large and waste bandwidth.

2. The number of master nodes in a Redis cluster is very unlikely to exceed 1000.

        The more nodes the cluster has, the more data is carried in the body of the heartbeat packet, and with more than 1000 nodes this would also cause network congestion. The author of Redis therefore does not recommend more than 1000 nodes in a Redis cluster. For clusters with fewer than 1000 nodes, 16384 slots are enough, and there is no need to expand to 65536.

3. With few nodes, a smaller number of slots keeps the bitmap small even when it compresses poorly.

        In the configuration information of a Redis master node, the hash slots it is responsible for are stored as a bitmap, and the bitmap is compressed during transmission. However, if the fill rate of the bitmap, slots / N (where N is the number of nodes), is high, that is, if the number of nodes is small relative to the number of hash slots, the bitmap compresses poorly. With 16384 slots the uncompressed bitmap is only 16384 ÷ 8 = 2048 bytes = 2 KB.
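
        The sizes quoted above follow from one bit per slot, which a short sketch can check. The helper names are hypothetical, and the bit order inside each byte is an assumption made for illustration.

```python
def bitmap_bytes(num_slots: int) -> int:
    """Size in bytes of a one-bit-per-slot bitmap, as in myslots[CLUSTER_SLOTS/8]."""
    return num_slots // 8


def make_bitmap(num_slots, owned_slots):
    """Set one bit for every slot a node owns."""
    bitmap = bytearray(bitmap_bytes(num_slots))
    for slot in owned_slots:
        bitmap[slot // 8] |= 1 << (slot % 8)
    return bitmap


def fill_rate(bitmap) -> float:
    """Fraction of bits set in the bitmap."""
    set_bits = sum(bin(byte).count("1") for byte in bitmap)
    return set_bits / (len(bitmap) * 8)


if __name__ == "__main__":
    print(bitmap_bytes(65536), "bytes for 65536 slots")
    print(bitmap_bytes(16384), "bytes for 16384 slots")
    # One of three masters owning roughly a third of the slots.
    print(fill_rate(make_bitmap(16384, range(0, 5461))))
```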

3. Codis

        3.1 Overall Architecture and Basic Process

  • Codis Proxy (codis-proxy): accepts client requests and forwards them to codis-server.

        codis-proxy is the Redis proxy service that clients connect to. It implements the Redis protocol itself and behaves no differently from a native Redis (just like Twemproxy). Multiple codis-proxy instances can be deployed for one business, and codis-proxy itself is stateless.

  • Codis Manager (codis-config): the Codis management tool. It supports adding/deleting Redis nodes, adding/deleting Proxy nodes, initiating data migration, and other operations. codis-config also comes with an HTTP server that starts a dashboard, so users can observe the running status of the Codis cluster directly in a browser.

  • Codis Redis (codis-server): a Redis instance that has undergone secondary development. It adds extra data structures to support data migration and is mainly responsible for processing the actual data read and write requests.

  • ZooKeeper: stores the data routing table and the meta information of codis-proxy nodes. Commands initiated by codis-config are synchronized to each surviving codis-proxy through ZooKeeper.

processing flow

        3.2 Data distribution among multiple instances

        In a Codis cluster, the instance on which a piece of data is stored is also determined through logical slots.

        1. A Codis cluster has a total of 1024 slots, which can be allocated manually or automatically by the Codis dashboard.

        2. How is the slot for a key determined? When a client reads or writes data, the hash value of the key is calculated with the CRC32 algorithm and taken modulo 1024; the result is the slot number, and the slot's allocation determines which server holds the key.

        The mapping between slots and servers forms a data routing table. After the slots are allocated, the routing table is sent to codis-proxy, which caches it locally. When codis-proxy receives a client request, it queries the local routing table directly to forward the request to the correct server. The routing table is also saved in ZooKeeper.

        In Redis Cluster, by contrast, the data routing table is passed between instances through their communication with each other, and a copy is saved on every instance. When the routing changes, the change has to be propagated among all instances through network messages, which consumes more cluster network resources when there are many instances.
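
        The slot calculation and the proxy's table lookup can be sketched as follows. This assumes the CRC32 in question is the standard IEEE CRC32 that zlib.crc32 computes. The ProxyRouter class, the server names, and the even split of slots are hypothetical, and any special handling of key formats is left out.

```python
import zlib

CODIS_SLOTS = 1024  # total slots in a Codis cluster


def codis_slot(key: bytes) -> int:
    """Slot number = CRC32(key) modulo 1024."""
    return zlib.crc32(key) % CODIS_SLOTS


class ProxyRouter:
    """Local copy of the routing table (slot -> server), as cached by a proxy."""

    def __init__(self, routing_table):
        self.routing_table = dict(routing_table)

    def route(self, key: bytes) -> str:
        return self.routing_table[codis_slot(key)]


if __name__ == "__main__":
    servers = ["server-1", "server-2"]  # hypothetical server names
    # Hypothetical allocation: first 512 slots to server-1, the rest to server-2.
    table = {slot: servers[slot * len(servers) // CODIS_SLOTS]
             for slot in range(CODIS_SLOTS)}
    router = ProxyRouter(table)
    for k in (b"123456789", b"user:1000"):
        print(k.decode(), "-> slot", codis_slot(k), "->", router.route(k))
```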

        3.3 Capacity expansion and data migration (to be supplemented)

Origin blog.csdn.net/zz18532164242/article/details/131747259