One article to understand the evolution of Redis architecture

Nowadays, Redis is more and more popular, used in almost every project. When you use Redis, have you ever thought about how it manages to provide service in a stable, high-performance way?

  • The scenario where I use Redis is very simple. Is there any problem if I only use the stand-alone version of Redis?

  • What should I do if my Redis crashes and my data is lost? How can I ensure that my business applications will not be affected?

  • Why do I need a master-slave cluster? What are its advantages?

  • What is a sharded cluster? Do I really need a sharded cluster?

  • ...

If you already know something about Redis, you must have heard of data persistence, master-slave replication, sentinels, and sharded clusters. What are the differences and connections between these concepts?

If you have such doubts, this article will take you from 0 to 1, and then from 1 to N, building a stable, high-performance Redis cluster step by step.

Along the way, you will learn which optimizations Redis has adopted to achieve stability and high performance, and why.

Once you have mastered these principles, you will be able to use Redis with ease.

Start with the simplest: stand-alone Redis

First, we start with the simplest scenario.

Suppose you have a business application and need to introduce Redis to improve the performance of the application. At this time, you can choose to deploy a stand-alone version of Redis, like this:

This architecture is very simple: your business application uses Redis as a cache. It queries data from MySQL, writes the result into Redis, and then serves subsequent reads from Redis. Because Redis stores data in memory, these reads are fast.

If your business volume is not large, such an architecture can basically meet your needs. Simple, isn't it?
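For illustration, here is a minimal cache-aside sketch using the redis-py client; `query_mysql`, the key naming, and the 5-minute TTL are hypothetical placeholders for your own code:

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def query_mysql(user_id):
    # Placeholder for your real MySQL query.
    return f"user-row-{user_id}"

def get_user(user_id):
    cache_key = f"user:{user_id}"
    cached = r.get(cache_key)          # 1. try the cache first
    if cached is not None:
        return cached
    row = query_mysql(user_id)         # 2. cache miss: fall back to MySQL
    r.set(cache_key, row, ex=300)      # 3. backfill the cache, expire in 5 min
    return row

print(get_user(1001))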

As time goes by, your business volume grows, more and more data is stored in Redis, and your business applications depend on Redis more and more.

Then one day, your Redis goes down for some reason. All your business traffic now hits the back-end MySQL, whose load spikes sharply; in severe cases, MySQL may even be overwhelmed.

What should you do at this time?

I guess your solution must be to restart Redis quickly so that it can continue to provide services.

However, because Redis kept all its data in memory, that data is lost after a restart (assuming persistence is not enabled). Redis itself works normally again, but since it holds no data, business traffic still hits the back-end MySQL, which remains under heavy pressure.

Is there any good way to solve this problem?

Since Redis only stores data in memory, can it also write a copy of this data to disk?

If this method is used, when Redis restarts we can quickly restore the data from disk back into memory, so that it can continue to provide service normally.

Yes, this is a good solution. The process of writing in-memory data to disk is called "data persistence".

Data Persistence: Be Prepared

Now, the Redis data persistence you envision looks like this:

However, what should be done specifically for data persistence?

I guess the simplest solution you can think of is this: every time Redis performs a write, in addition to writing to memory, it also writes a copy to disk, like this:

Yes, this is the simplest and most direct solution.

But think about it carefully: now every client write has to touch both memory and disk, and writing to disk is far slower than writing to memory. This is bound to drag down Redis performance.

How can we avoid this problem?

At this time we need to analyze the details of writing to disk.

We all know that writing in-memory data to disk actually involves two steps:

  1. The program writes the data into the kernel page cache (write)

  2. The kernel flushes the page cache to disk (fsync)

Specifically, it looks like this:

The crudest persistence approach is the one described above: after writing to Redis memory, synchronously write to the page cache and fsync to disk. Unsurprisingly, the disk drags down the overall write speed.
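A small Python sketch of these two steps (the file name is arbitrary; absolute timings depend on your hardware, but fsync is typically orders of magnitude slower than write):

```python
import os
import time

# Step 1: write() only copies the bytes into the kernel page cache (fast).
# Step 2: fsync() forces the page cache down to the physical disk (slow).
fd = os.open("appendonly.demo", os.O_WRONLY | os.O_CREAT | os.O_APPEND)

t0 = time.perf_counter()
os.write(fd, b"set k1 v1\n")   # lands in the page cache
t1 = time.perf_counter()
os.fsync(fd)                   # actually reaches the disk
t2 = time.perf_counter()
os.close(fd)

print(f"write: {(t1 - t0) * 1e6:.0f} us, fsync: {(t2 - t1) * 1e6:.0f} us")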

How to optimize? It is simple: the main thread writes to memory and returns the result to the client, and Redis then uses "another thread" to write to disk, shielding the main thread from the performance impact of disk writes.

This persistence scheme is actually the Redis AOF (Append Only File) we often hear.

Redis AOF persistence offers three flushing (fsync) policies (a configuration sketch follows the list):

  1. appendfsync always: the main thread calls fsync after every write

  2. appendfsync no: leave fsync timing to the OS

  3. appendfsync everysec: a background thread calls fsync once per second
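A configuration sketch, assuming a local instance and the redis-py client; the same directives can live in redis.conf instead:

```python
import redis

r = redis.Redis(host="localhost", port=6379)

# Turn on AOF and choose a flushing policy at runtime;
# these are standard redis.conf directives set via CONFIG SET.
r.config_set("appendonly", "yes")
r.config_set("appendfsync", "everysec")   # always | everysec | no

print(r.config_get("appendfsync"))        # {'appendfsync': 'everysec'}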

Real-time persistence is solved, but now we face another problem: since every write is appended to the AOF, the file grows ever larger over time, and restoring from a huge AOF becomes very slow. What should we do?

Redis thoughtfully provides the AOF rewrite mechanism, commonly known as AOF "slimming"; as the name implies, it shrinks the AOF file.

AOF records every write command. If you execute set k1 v1 and then set k1 v2, only the final value v2 matters. AOF rewrite exploits exactly this: when the AOF grows beyond a configured threshold, Redis rewrites a new AOF that records only the final version of each key.

This compresses the AOF volume.
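For illustration, a sketch of how the rewrite threshold can be configured and how a rewrite can be triggered by hand (the values shown are the common defaults, via redis-py):

```python
import redis

r = redis.Redis(host="localhost", port=6379)

# Rewrite automatically once the AOF has doubled since the last rewrite,
# but never for files smaller than 64 MB (the common defaults).
r.config_set("auto-aof-rewrite-percentage", "100")
r.config_set("auto-aof-rewrite-min-size", "64mb")

# Or kick off a rewrite by hand:
r.bgrewriteaof()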

In addition, let's change perspective: are there other ways to persist data?

At this time, you have to consider the usage scenarios of Redis.

Recall, when we use Redis, what scenario do we usually use it for?

Yes, cache.

Using Redis as a cache means that Redis does not have to hold the full data set: for data missing from the cache, the business application can still get results by querying the back-end database. Those queries are slower, but they do not affect business correctness.

Based on this characteristic, Redis can also persist data in the form of a "data snapshot".

So what is a data snapshot?

Simply put, you can understand it like this:

  1. Imagine Redis as a glass of water; writing data into Redis is like pouring water into the glass

  2. Now you photograph the glass. The photo records exactly how much water was in the glass at that instant; that photo is the glass's data snapshot

In other words, a Redis data snapshot records all the data in Redis at one moment in time; persisting it just means writing that snapshot to disk.

Its advantage is that the disk is written only "once", when persistence is triggered, and is left alone the rest of the time.

Based on this approach, we can take data snapshots of Redis at "regular intervals" and persist them to disk.

This solution is the Redis RDB we often hear about. RDB persists data via "scheduled snapshots". Its advantages are:

  1. Persistence files are small (binary + compressed)

  2. Low disk writing frequency (scheduled writing)

The disadvantage is just as obvious: because persistence is periodic, the data is never as complete as with AOF's real-time persistence. If your Redis is used purely as a cache and you are not sensitive to data loss (it can be reloaded from the back-end database), this persistence method is a very good fit.
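A sketch of RDB scheduling, assuming redis-py; each `save <seconds> <changes>` rule means "snapshot if at least that many writes occurred within that many seconds", and the pairs shown are the classic defaults:

```python
import redis

r = redis.Redis(host="localhost", port=6379)

# Snapshot after 1 change in 900s, 10 changes in 300s, or 10000 in 60s.
r.config_set("save", "900 1 300 10 60 10000")

# Or take a snapshot on demand in a forked background process:
r.bgsave()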

If you are asked to choose a persistence solution, you can choose like this:

  1. If the business is not sensitive to data loss, choose RDB

  2. If the business has relatively high requirements for data integrity, choose AOF

After understanding RDB and AOF, let's go one step further: is there a way to keep data complete while making the persistence file smaller and recovery faster?

Let’s review the characteristics of RDB and AOF we mentioned earlier:

  1. RDB is stored in binary + data compression mode, and the file size is small

  2. AOF records every write command, the most complete data

Can we take advantage of their respective strengths?

Of course we can, and this is Redis's "hybrid persistence".

If you want higher data integrity, you cannot rely on RDB alone; the optimization has to center on AOF.

Specifically, during an AOF rewrite, Redis first writes a data snapshot in RDB format at the head of the new AOF file, then appends the write commands generated during the rewrite to the same file.

Because the RDB part is binary and compressed, the AOF file becomes smaller.

Because the AOF volume is further compressed, when you use AOF to restore data, the recovery time will be shorter!

Hybrid persistence is only supported in Redis 4.0 and above.

Note: hybrid persistence is an optimization of AOF rewrite, which means it is built on top of AOF + AOF rewrite.
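As a minimal sketch, assuming redis-py and a local instance, hybrid persistence can be enabled like this (`aof-use-rdb-preamble` is the directive that puts an RDB-format snapshot at the head of each rewritten AOF):

```python
import redis

r = redis.Redis(host="localhost", port=6379)

# Hybrid persistence rides on AOF rewrite, so AOF must be on.
r.config_set("appendonly", "yes")
r.config_set("aof-use-rdb-preamble", "yes")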

With this optimization, your Redis no longer needs to fear instance downtime: when a crash happens, you can quickly restore the data in Redis from the persistence files.

But is that enough?

Think about it carefully: although we have shrunk the persistence file as much as possible, restoring data still takes time, and during that window your business application cannot be served. What then?

If a single instance goes down, restoring data is the only remedy. So what if we deploy multiple Redis instances and keep their data synchronized in real time? Then when one instance fails, we simply pick one of the remaining instances to continue providing service.

That's right, this solution is the "master-slave replication: multiple copies" to be discussed next.

Master-slave replication: multiple copies

You can deploy multiple Redis instances, and the architectural model becomes this:

Here we call the node that handles reads and writes the master, and the node that synchronizes the master's data in real time the slave.
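A minimal setup sketch using redis-py (the addresses are hypothetical): pointing an instance at a master with REPLICAOF makes it perform a full sync and then keep replicating in real time.

```python
import redis

# Suppose the master runs at 192.168.0.1:6379 (hypothetical addresses).
replica = redis.Redis(host="192.168.0.2", port=6379)

# Point this instance at the master; it syncs and stays in sync.
replica.replicaof("192.168.0.1", 6379)

# To promote it back to a standalone master (e.g. during a failover):
# replica.replicaof("NO", "ONE")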

The advantages of adopting the multi-copy scheme are:

  1. Shorter unavailability: if the master goes down, we can manually promote a slave to master and keep serving

  2. Better read performance: slaves can take a share of the read requests, improving the application's overall performance

This solution is good: it not only saves data-recovery time but also improves performance.

But it has a problem: when the master goes down, a human has to "manually" promote a slave to master, and that process also takes time.

Although this is much faster than restoring data from a file, it still requires human intervention, and humans need time to notice, react, and operate. During that window, your business applications are still affected.

Can we automate this switching process?

Sentinel: Failover

If you want the switchover to be automatic, it cannot depend on humans.

Now, we can introduce an "observer" to monitor the health status of the master in real time. This observer is the "sentinel".

How to do it?

  1. At regular intervals, the sentinel asks the master whether it is healthy

  2. If the master replies normally, its status is normal; if the reply times out, it is abnormal

  3. When the sentinel detects an abnormality, it initiates a master-slave switchover

With this solution, no human needs to intervene; everything becomes automatic. Isn't it great?

But there is still a problem: if the master is actually healthy but the network between the sentinel and the master is faulty, the sentinel may "misjudge" the master as down.

How to solve this problem?

Since one sentinel can misjudge, we can deploy multiple sentinels on different machines and let them monitor the master's status together. The process becomes this:

  1. At regular intervals, multiple sentinels each ask the master whether it is healthy

  2. If the master replies normally, its status is normal; if the reply times out, it is abnormal

  3. Once one sentinel judges the master abnormal (whether due to a real failure or a network problem), it asks the other sentinels; if enough sentinels (above a configured threshold) also consider the master abnormal, the master is deemed to have truly failed

  4. Having reached agreement that the master is faulty, the sentinels initiate the master-slave switchover

By letting multiple sentinels negotiate over the master's status, we greatly reduce the probability of misjudgment.
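On the server side, each sentinel is told which master to watch and what the agreement threshold (quorum) is via the `sentinel monitor <name> <ip> <port> <quorum>` directive. On the client side, here is a sketch using redis-py's Sentinel support; the addresses and the master name `mymaster` are hypothetical:

```python
from redis.sentinel import Sentinel

# Three sentinels on different machines (hypothetical addresses).
sentinel = Sentinel(
    [("192.168.0.11", 26379), ("192.168.0.12", 26379), ("192.168.0.13", 26379)],
    socket_timeout=0.5,
)

# The client asks the sentinels who the current master is, so a
# failover is transparent: the next call just lands on the new master.
master = sentinel.master_for("mymaster", socket_timeout=0.5)
master.set("k1", "v1")

replica = sentinel.slave_for("mymaster", socket_timeout=0.5)
print(replica.get("k1"))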

After the sentinels agree that the master is abnormal, another question arises: which sentinel should initiate the master-slave switchover?

The answer is to elect a sentinel "leader", who then performs the master-slave switchover.

Here comes the question again, how to choose this leader?

Imagine how elections are done in real life?

Yes, vote.

When electing the sentinel leader, we can set rules like these (a toy simulation follows the list):

  1. Each sentinel asks the others to vote for it

  2. Each sentinel votes only for the first sentinel that requests its vote, and votes only once

  3. The first sentinel to win more than half of the votes becomes the leader and initiates the master-slave switchover
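Below is a toy Python simulation of just these voting rules; it is deliberately not real Raft (no terms, timeouts, retries, or networking), only the first-come, majority-wins idea:

```python
import random
from collections import Counter

def elect_leader(sentinels):
    """Toy simulation of the rules above; not a real consensus protocol."""
    votes = Counter()
    for voter in sentinels:
        # Vote requests reach each voter in an arbitrary order; each
        # voter votes exactly once, for the first candidate that asks.
        first_to_ask = random.choice(sentinels)
        votes[first_to_ask] += 1
    leader, count = votes.most_common(1)[0]
    # A candidate wins only with more than half of all votes.
    if count > len(sentinels) // 2:
        return leader
    return None  # split vote: in practice, wait and hold a new round

print(elect_leader(["s1", "s2", "s3"]))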

This election process is an instance of what we often hear called a "consensus algorithm" in the field of distributed systems.

What is a consensus algorithm?

We deploy sentinels on multiple machines, and they must cooperate to complete one task; together they form a "distributed system".

In the field of distributed systems, the algorithm of how multiple nodes reach a consensus on a problem is called a consensus algorithm.

In this scenario, multiple sentinels negotiate together to elect a leader that they all recognize, which is done using a consensus algorithm.

The algorithm also stipulates that the number of nodes should be odd. This ensures that even if one node fails, more than "half" of the remaining nodes are still healthy and can produce a correct result; in other words, the algorithm tolerates failed nodes.

There are many consensus algorithms in the field of distributed systems, such as Paxos and Raft. Sentinel leader election uses the Raft consensus algorithm because it is simple enough and easy to implement.

Ok, let's make a summary here.

Starting from the simplest stand-alone version, your Redis has been optimized with data persistence, master-slave replication, and a sentinel cluster; its performance and stability keep improving.

Deployed in such an architectural mode, Redis can basically run stably for a long time.

...

As time goes on, your business begins to see explosive growth. Can this architecture still bear such large traffic?

Let's analyze it together:

  1. Afraid of data loss: persistence (RDB/AOF)

  2. Long recovery time: master-slave replication (a replica can take over at any time)

  3. Long manual switchover time: sentinel cluster (automatic switchover)

  4. Read pressure: add replicas (read-write separation)

  5. Write pressure: what do we do when a single master can't keep up?

As you can see, the remaining problem is this: when write volume grows, a single master instance may no longer be able to bear the write traffic.

To solve this problem properly, it is time to consider a "sharded cluster".

Sharded Clusters: Scale Out

What is a "sharded cluster"?

To put it simply: if one instance cannot bear the write pressure, can we deploy multiple instances, organize them according to certain rules into a single logical whole, and have them serve requests together? That would remove the bottleneck of a single instance handling all writes.

So, the current architecture model becomes like this:

Now the question comes again, how to organize so many instances?

We formulate the rules as follows:

  1. Each node stores a portion of the data, and the union of all nodes' data is the full data set

  2. Define a routing rule that maps each key to a fixed instance for both reads and writes

With data spread over multiple instances, the key-routing rule has to live on the client side, like this:

This approach is called "client-side sharding". Its drawback is that the client must maintain the routing rules; that is, the routing logic is written into your business code, for example:
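A minimal client-side sharding sketch in Python; the node addresses are hypothetical, and CRC32-modulo is just one possible routing rule:

```python
import zlib
import redis

# Three Redis nodes (hypothetical addresses); together they hold all data.
nodes = [
    redis.Redis(host="192.168.0.21", port=6379),
    redis.Redis(host="192.168.0.22", port=6379),
    redis.Redis(host="192.168.0.23", port=6379),
]

def node_for(key):
    # Routing rule: hash the key and take it modulo the node count,
    # so the same key always lands on the same instance.
    return nodes[zlib.crc32(key.encode()) % len(nodes)]

node_for("user:1001").set("user:1001", "alice")
print(node_for("user:1001").get("user:1001"))
```

Note that naive modulo routing forces large-scale data migration whenever a node is added or removed; mature solutions prefer consistent hashing or fixed hash slots instead.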

How to avoid coupling routing rules to client business code?

To optimize further, we can add an "intermediate proxy layer" between the client and the server: this is the Proxy we often hear about, and the routing and forwarding rules are maintained in this proxy layer.

In this way, the client does not need to care about the number of Redis nodes on the server, but only needs to interact with the Proxy.

The Proxy forwards each request to the right Redis node according to the routing rules. And when the cluster can no longer support the traffic, it can scale horizontally: new Redis instances are added to raise performance, and all of this is transparent and imperceptible to the client.

The industry's open source Redis sharding cluster solutions, such as Twemproxy and Codis, adopt this solution.

The advantage of this solution is that the client does not need to care about the data forwarding rules, but only needs to deal with the Proxy. The client operates the subsequent cluster like a stand-alone Redis, which is easy to use.

So far in this architectural evolution, whether routing is done by the client or by a Proxy, these are sharding solutions that grew out of the "community". Their common trait is that the Redis nodes in the cluster are unaware of each other; only the client or the Proxy coordinates where data is written to and read from, and both rely on a sentinel cluster for automatic failover.

In other words, we are actually combining multiple isolated Redis nodes for use.

Redis actually launched its "official" sharding solution, Redis Cluster, in version 3.0, but it was unstable in its early days and few people used it. It was against this background that the community open-source solutions mentioned above, Twemproxy and Codis, were born.

However, as Redis Cluster has gradually matured, more and more companies have adopted the official solution (after all, it is continuously maintained by the Redis project, while Twemproxy and Codis have gradually been abandoned). Deploying it is even simpler; the architecture looks like this:

Redis Cluster does not need a sentinel cluster: the Redis nodes probe each other's health via the Gossip protocol and initiate automatic switchover on failure.

As for routing, the client no longer needs to write the rules itself. Redis Cluster comes with "supporting" SDKs: once the client upgrades to such an SDK, it integrates with Redis Cluster directly. The SDK finds the Redis node responsible for each key, reads and writes against it, and adapts automatically when nodes are added or removed, all without the business noticing. For example:
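A sketch using the cluster-aware client that ships with redis-py 4.x; the seed node address is hypothetical:

```python
from redis.cluster import RedisCluster

# Connect through any one node; the client discovers the other nodes
# and the slot-to-node mapping by itself.
rc = RedisCluster(host="192.168.0.31", port=6379)

# Each command is routed to the node that owns the key's hash slot,
# and the client re-routes automatically when nodes are added/removed.
rc.set("k1", "v1")
print(rc.get("k1"))
print(rc.keyslot("k1"))   # which of the 16384 slots "k1" maps to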

Omitting the sentinel cluster cuts maintenance cost considerably, but the SDK upgrade is cheap only for new applications; for legacy applications, the "upgrade cost" is high, which creates a lot of resistance to switching to the official Redis Cluster solution.

As a result, companies began to develop their own Proxy for Redis Cluster to spare clients the upgrade, and the architecture became this:

This way, the client needs no changes at all: it just points its connection at the Proxy, which handles request forwarding and any routing changes caused by adding or removing cluster nodes.

So far, the industry's mainstream Redis sharding architecture has been formed. When you use sharding clusters, you can calmly face greater traffic pressure in the future!

Summary

To sum up, this is how we built a stable, high-performance Redis cluster from 0 to 1 and then from 1 to N; along the way, you have seen the whole evolution of the Redis architecture:

  1. Afraid of data loss -> persistence (RDB/AOF)

  2. Long recovery time -> master-slave replication (a replica can take over at any time)

  3. Slow manual failover -> sentinel cluster (automatic switchover)

  4. Read pressure -> add replicas (read-write separation)

  5. Write pressure / capacity bottleneck -> sharded cluster

  6. Community sharding solutions -> Twemproxy, Codis (Redis nodes don't communicate with each other; sentinels required; horizontally scalable)

  7. Official sharding solution -> Redis Cluster (Gossip protocol between Redis nodes; no sentinels needed; horizontally scalable)

  8. Hard for clients to upgrade -> Proxy + Redis Cluster (no intrusion into the business side)

At this point, our Redis cluster can serve the business with long-term stability and high performance.

I hope this article can help you better understand the evolution of the Redis architecture.


Origin blog.csdn.net/m0_72650596/article/details/126182244