An Illustrated Guide to Redis

1. What is Redis?

Redis (REmote DIctionary Server) is an open-source key-value database server.

Redis is more accurately described as a data structure server, and this special property is what makes it popular among developers.

Rather than processing data by iterating and sorting over opaque values, Redis organizes data into native data structures from the start. In its early days it was used much like Memcached, but as Redis matured it became viable for many other use cases, including publish-subscribe mechanisms, streaming, and queues.

At its core, Redis is an in-memory database used as a cache in front of another "real" database (such as MySQL or PostgreSQL) to improve application performance. It offloads the core application database by taking advantage of the high access speed of memory, and it is a good fit for data such as:

  • Data that changes infrequently and is frequently requested

  • Data that is less mission-critical and changes frequently

Examples of such data include session or page caches, leaderboards, and aggregated analytics for dashboards.
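
To make the cache-in-front-of-a-database pattern concrete, here is a minimal cache-aside sketch using the redis-py client. The load_user_from_db helper, the key naming, and the 5-minute TTL are illustrative assumptions, not prescriptions:

    import json

    import redis  # the redis-py client

    r = redis.Redis(host="localhost", port=6379, db=0)

    def load_user_from_db(user_id):
        # Hypothetical stand-in for a slow SQL query against the "real" database.
        return {"id": user_id, "name": "example"}

    def get_user(user_id):
        key = f"user:{user_id}"
        cached = r.get(key)  # 1. try the in-memory cache first
        if cached is not None:
            return json.loads(cached)
        user = load_user_from_db(user_id)  # 2. cache miss: hit the primary database
        r.set(key, json.dumps(user), ex=300)  # 3. cache the result for 5 minutes
        return user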

However, for many use cases, Redis provides enough guarantees to serve as a full-fledged primary database. Combined with Redis modules and its various high-availability (HA) configurations, Redis as a database becomes very useful for certain scenarios and workloads.

Another important aspect is that Redis blurs the line between a cache and a data store. The key thing to understand is that in-memory data can be read and manipulated much faster than data a traditional database keeps on an SSD or HDD.

Originally, Redis was most often compared to Memcached, which at the time lacked any non-volatile persistence.

[Figure: current feature comparison between Redis and Memcached]

Although there are now multiple ways to configure persistence to disk, when persistence was first introduced Redis achieved it with snapshots, asynchronously copying the in-memory data to disk. Unfortunately, the downside of this mechanism is that data written between snapshots can be lost.

Redis has matured a great deal since its initial release in 2009. We'll cover most of its architectures and topologies so you can add Redis to your arsenal of data storage systems.

2. Redis architecture

Before we start discussing Redis internals, let's discuss the various Redis deployments and their trade-offs.

We will mainly focus on these setups:

  • A single Redis instance

  • Redis High Availability

  • Redis Sentinel

  • Redis Cluster

Which setup to use depends on your use case and scale.

A single Redis instance

A single Redis instance is the most straightforward way to deploy Redis. It lets users spin up small instances and helps them grow and accelerate their services quickly. However, this deployment is not without drawbacks: if the instance fails or becomes unavailable, all client calls to Redis fail, degrading the overall performance and speed of the system.

Given enough memory and server resources, a single instance can be very powerful. Workloads used primarily for caching can see significant performance gains with minimal setup, and given sufficient system resources, you can deploy the Redis service on the same machine the application runs on.

Understanding a few Redis concepts is essential for managing data within your system. Commands sent to Redis are first processed in memory. Then, if persistence is configured on the instance, a forked process periodically generates either an RDB snapshot (a very compact point-in-time representation of the Redis data) or appends to the AOF (append-only file).

These two mechanisms allow Redis to provide long-term storage, support various replication strategies, and enable more complex topologies. If Redis is not set up to persist data, all data is lost on a restart or failover. If persistence is enabled, then on restart Redis loads all data from the RDB snapshot or the AOF back into memory, after which the instance can serve new client requests.
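
You can check at runtime which persistence mode an instance is using. CONFIG GET is a real Redis command; the values shown below are the Redis 7 defaults and may differ on your install:

    $ redis-cli CONFIG GET save
    1) "save"
    2) "3600 1 300 100 60 10000"
    $ redis-cli CONFIG GET appendonly
    1) "appendonly"
    2) "no"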

With that said, let's look at some more distributed Redis setups you might use.

Redis High Availability

Another popular setup for Redis is the master-replica deployment mode (historically called master-slave), where replica instances keep their data synchronized with the master instance. When data is written to the master, copies of those commands are sent to the replicas' client output buffers to keep the data in sync. A deployment can have one or more replicas. These replicas can help scale Redis reads or provide failover in case the master is lost.
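
As a minimal sketch, pointing a replica at a master takes a single directive in the replica's redis.conf; the address is a placeholder, and the same thing can be done at runtime with the REPLICAOF command:

    # In the replica's redis.conf (replicaof superseded the older slaveof directive):
    replicaof 10.0.0.1 6379

    # Or equivalently at runtime:
    # redis-cli REPLICAOF 10.0.0.1 6379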

We have now entered the realm of distributed systems, so there are many new things to consider in this topology. Things that were simple before are now more complicated.

Redis replication

Every Redis master instance has a replication ID and an offset. These two pieces of data are critical for determining the point from which a replica can continue replicating, or whether it needs a full sync. The offset is incremented for every operation that happens on the master Redis deployment.

More specifically, when a Redis replica is only a few offsets behind the master, it receives the remaining commands from the master and replays them on its own data set until it is in sync. If the two instances cannot agree on a replication ID, or if the master does not recognize the offset, the replica requests a full synchronization. In that case, the master creates a new RDB snapshot and sends it to the replica.

During this transfer, the master buffers all intermediate write commands issued between the moment the snapshot was cut and the current offset, so that they can be sent to the replica once the snapshot has been loaded. After that completes, replication continues as normal.

If two instances have the same replication ID and offset, they hold exactly the same data. Now you may be wondering why the replication ID is needed at all. When a Redis instance is promoted to master, or restarts from scratch as a master, it is given a new replication ID.

The old replication ID is used to infer which previous master this newly promoted instance replicated from. That allows it to perform partial syncs with other replica nodes, because the new master remembers its old replication ID.

For example, if two instances (a master and a replica) share a replication ID but their offsets differ by a few hundred commands, then replaying the missing commands on the lagging instance would give it the same data set. If, however, the replication IDs are completely different and we don't know the previous replication ID of a newly demoted (or rejoining) replica, there is no common ancestor, and we must perform an expensive full sync.

In contrast, if we know the previous replication ID, we can reason about how to bring the data into sync: we can infer the common ancestor the instances share, the offsets become meaningful again, and a partial sync is possible.
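
You can inspect the replication ID and offset discussed above on a live instance. INFO replication is a real command and these field names are real; the values here are made up for illustration:

    $ redis-cli INFO replication
    # Replication
    role:master
    connected_slaves:1
    master_replid:b54c181d9166e552f2a4a88c1f7e4b9d9f0c21aa
    master_repl_offset:3141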

Redis Sentinel

Sentinel is a distributed system. Like all distributed systems, Sentinel has several advantages and disadvantages. Sentinel is designed in such a way that a set of sentinel processes work together to coordinate state to provide high availability for Redis. After all, you don’t want the system that protects you from failure to have its own single point of failure.

Sentinel takes care of a few things. First, it ensures that the current master and replica instances are functioning and responding. This is necessary so that Sentinel (together with the other sentinel processes) can alert and act when master and/or replica nodes are lost. Second, it plays a role in service discovery, much like Zookeeper and Consul do in other systems: when a new client attempts to write to Redis, Sentinel tells it which instance is the current master.

In short, Sentinel constantly monitors availability and shares that information with clients, so that they can react when a failover actually happens.
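
From the client's point of view, that discovery looks roughly like the sketch below, using redis-py's Sentinel support. The sentinel addresses and the service name "mymaster" are placeholders:

    from redis.sentinel import Sentinel

    # Connect to the sentinel processes, not to Redis directly.
    sentinel = Sentinel([("10.0.0.10", 26379), ("10.0.0.11", 26379)],
                        socket_timeout=0.5)

    master = sentinel.master_for("mymaster", socket_timeout=0.5)  # current master
    replica = sentinel.slave_for("mymaster", socket_timeout=0.5)  # a read replica

    master.set("foo", "bar")  # writes go to whichever node is currently master
    print(replica.get("foo"))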

Here are its responsibilities:

  • Monitoring - Ensures that master and replica instances are working as expected.

  • Notification - Notifies the system administrator of events in the Redis instance.

  • Failover management - Sentinel nodes can initiate a failover if the master instance becomes unavailable and enough nodes (a quorum) agree that this is true.

  • Configuration management - Sentinel nodes also act as a discovery service for the current Redis master instance.

Using Redis Sentinel this way enables failure detection. Detection requires multiple sentinel processes to agree that the current master instance is no longer available; this agreement process is called reaching quorum. It improves robustness and prevents a single misbehaving machine from declaring the master Redis node unreachable.
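
A minimal sentinel.conf sketch for the quorum just described (the directives are real; the addresses, service name, and timeouts are illustrative):

    # Monitor the master named "mymaster"; the trailing 2 is the quorum.
    sentinel monitor mymaster 10.0.0.1 6379 2
    # Consider the master down after 5 seconds without a valid reply.
    sentinel down-after-milliseconds mymaster 5000
    # Give up on a failover attempt after 60 seconds.
    sentinel failover-timeout mymaster 60000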

This setup is not without its drawbacks, so we'll cover some recommendations and best practices when using Redis Sentinel.

You can deploy Redis Sentinel in a variety of ways. Honestly, to make any sensible suggestions, I'd need more background information on your system. As a general guide, I recommend running a sentinel node next to each application server (if possible) so that you also don't need to account for network reachability differences between the sentinel node and the clients actually using Redis.

You can run Sentinel alongside a Redis instance, or even on standalone nodes, but that changes how it should be operated and makes things more complicated. I recommend running at least three sentinel nodes with a quorum of at least two. Here's a simple breakdown of cluster sizes with the associated quorum and the number of failures that can be tolerated:

  • 3 servers - quorum of 2 - tolerates 1 failure

  • 5 servers - quorum of 3 - tolerates 2 failures

  • 7 servers - quorum of 4 - tolerates 3 failures

This will vary from system to system, but the general idea remains the same.

Let's take a moment to think about what could go wrong with such a setup. If you run this system long enough, you'll encounter all of these.

  1. What if the sentinel nodes lose quorum?

  2. What if a network split puts the old master instance in the minority partition? What happens to those writes? (Spoiler: they are lost when the system fully recovers.)

  3. What happens if the network topology of sentinel nodes and client nodes (application nodes) is misaligned?

There are no durability guarantees, especially since persistence to disk (see below) is asynchronous. There is also the nagging problem that, by the time clients discover the new master, how many writes were lost to an unwitting old master? Redis recommends querying for the new master when establishing a new connection. Depending on the system configuration, this could mean significant data loss.

There are ways to mitigate the extent of the damage, such as forcing the master to replicate writes to at least one replica instance. Remember that all Redis replication is asynchronous, which has its trade-offs: the master tracks acknowledgments independently, and if at least one replica has not acknowledged recent writes, the master stops accepting writes.
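
That mitigation maps to two real redis.conf directives on the master; the values below are illustrative:

    # Stop accepting writes unless at least 1 replica is connected...
    min-replicas-to-write 1
    # ...and its replication lag is no more than 10 seconds.
    min-replicas-max-lag 10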

Redis Cluster

I'm sure many of you have wondered what happens when you can't fit all your data in memory on one machine. The maximum RAM currently available in a single server is 24 TiB, listed online by AWS. Sure, that's a lot, but for some systems it isn't enough, even for the caching layer.

Redis Cluster allows Redis to scale horizontally.

First, let's get some terminology out of the way. Once we decide to use Redis Cluster, we have decided to spread the data we store across multiple machines, which is called sharding. Each Redis instance in the cluster is therefore considered a shard of the overall data.

This brings up a new problem. If we push a key to the cluster, how do we know which Redis instance (shard) holds the data? There are several ways to do this, but Redis Cluster uses algorithmic sharding.

To find the shard for a given key, we hash the key and take the result modulo the total number of shards. Because the hash function is deterministic, a given key always maps to the same shard, so we can infer where future reads of that key will go.
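
Here is a toy sketch of that naive algorithmic sharding scheme; the CRC32 hash and the shard count are arbitrary choices for illustration, not how Redis Cluster actually routes keys (more on that below):

    import zlib

    NUM_SHARDS = 3

    def shard_for(key: str) -> int:
        # Deterministic hash of the key, then modulo the shard count.
        return zlib.crc32(key.encode()) % NUM_SHARDS

    print(shard_for("foo"))  # always the same shard, until NUM_SHARDS changes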

What happens when we later want to add a new shard to the system? This process is called resharding.

Suppose the key 'foo' was previously mapped to shard 0; after a new shard is introduced, it may map to shard 5. But moving data around to satisfy the new shard mapping would be slow and impractical if we need to scale the system quickly, and it would adversely affect the availability of the Redis cluster.

Redis Cluster solves this problem with hash slots, to which all data is mapped. There are 16K (16,384) hash slots. This gives us a reasonable way to spread data across the cluster, and when we add new shards, we simply move hash slots between nodes. By doing so, we only need to move hash slots from one shard to another, which simplifies adding new master instances to the cluster.

This can be achieved without any downtime and minimal performance impact. Let's talk through an example.

  • M1 contains hash slots from 0 to 8191.

  • M2 contains hash slots from 8192 to 16383.

So, to map "foo", we take a deterministic hash of the key (foo) and modify it by the number of hash slots (16K), resulting in a map of M2. Now suppose we add a new instance M3. The new mapping will be:

  • M1 contains hash slots from 0 to 5460.

  • M2 contains hash slots from 5461 to 10922.

  • M3 contains hash slots from 10923 to 16383.

All keys in the hash slots that moved from M1 to M2 (and likewise from M2 to M3) need to be relocated. But the individual keys within a hash slot do not need to be rehashed, because they are already partitioned by hash slot. This level of indirection solves the resharding problem of naive algorithmic sharding.
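
For reference, Redis Cluster computes the slot as CRC16(key) mod 16384, using the XModem CRC-16 variant. A hedged Python sketch; note that real clients also honor {hash tags}, which are omitted here:

    import binascii

    def hash_slot(key: str) -> int:
        # binascii.crc_hqx implements the XModem CRC-16 that Redis Cluster uses.
        return binascii.crc_hqx(key.encode(), 0) % 16384

    print(hash_slot("foo"))  # this slot number decides which shard serves "foo"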

Gossip protocol

Redis Cluster uses gossiping to determine the health of the entire cluster. Picture a cluster of three master (M) nodes, each paired with a replica (S) node: all six nodes constantly communicate to know which shards are available and ready to serve requests.

If enough shards agree that M1 is unresponsive, they can decide to promote M1's replica S1 to master to keep the cluster healthy. The number of nodes required to trigger this is configurable, and getting it right is essential: done incorrectly, the cluster can be split when both sides of a partition are equal and unable to break the tie. This phenomenon is called split-brain. As a general rule, run an odd number of masters, each with two replicas, for the most robust setup.
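
For completeness, these are the real redis.conf directives that put a node into cluster mode; the timeout value is illustrative and feeds the failure detection just described:

    cluster-enabled yes
    cluster-config-file nodes.conf
    # How long a node may be unreachable before it is considered failing.
    cluster-node-timeout 15000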

3. Redis persistence model

If we are going to use Redis to store any kind of data that must be kept safe, it is important to understand how Redis does so. In many use cases, losing the data stored in Redis is not the end of the world: caches and systems backing real-time analytics can generally survive data loss.

In other scenarios, we want to have some guarantees around data durability and recovery.

No persistence

No persistence: persistence can be disabled entirely if you wish. This is the fastest way to run Redis, but it offers no durability guarantees.

RDB file

RDB (Redis Database): RDB persists point-in-time snapshots of the dataset at specified intervals.

The main disadvantage of this mechanism is that data written between snapshots is lost. In addition, this storage mechanism relies on forking the main process, which can cause momentary delays in serving requests on larger data sets. That said, RDB files load into memory much faster than an AOF.
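
The snapshot intervals map to save rules in redis.conf. These are the classic example thresholds shipped in the default config, shown here for illustration:

    # Snapshot if at least 1 key changed within 900 s,
    # 10 keys within 300 s, or 10000 keys within 60 s.
    save 900 1
    save 300 10
    save 60 10000
    dbfilename dump.rdb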

AOF

AOF (Append Only File): the AOF persistently logs every write operation the server receives. These operations are replayed when the server starts, reconstructing the original dataset.

This persistence method is more durable than RDB snapshots because the file is append-only. As operations occur, we buffer them to the log, but they are not persisted yet. The log mirrors the actual commands that were run, so it can be replayed whenever needed.

We then flush the log to disk with fsync (exactly when this runs is configurable), at which point the data is durable. The disadvantage is that the format is not compact and uses more disk space than RDB files.
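
The fsync timing mentioned above is controlled by the real appendfsync directive; a sketch of the three options:

    appendonly yes
    # appendfsync always    - fsync on every write: safest, slowest
    appendfsync everysec    # fsync once per second: the usual compromise
    # appendfsync no        - let the OS decide: fastest, least safe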

Why not have both?

RDB + AOF: AOF and RDB can be combined in the same Redis instance, trading some speed for durability, if you will. I think this is an acceptable way to set up Redis. On reboot, remember that if both are enabled, Redis uses the AOF to reconstruct the data, since it is the most complete.

Forking

Now that we understand the persistence options, let's discuss how a single-threaded application like Redis actually implements them.

In my opinion, the coolest part of Redis is how it leverages forking and copy-on-write to efficiently facilitate data persistence.

Forking is a way for the operating system to create a new process by making a copy of an existing one. The new process gets its own process ID plus some other information and handles, so the newly forked process (the child) can communicate with the original process (the parent).

Now things get interesting. Redis is a process that may have allocated a lot of memory, so how can it be copied without running out of memory?

When you fork a process, the parent and child share memory, and Redis starts the snapshot (RDB) process inside the child. This is made possible by a memory-sharing technique called copy-on-write, which passes references to the existing memory pages at fork time. If nothing changes while the child is persisting to disk, no new allocations are made.

When changes do occur, the kernel tracks references to each page, and if a modified page has multiple references, the change is written to a new page. The child process is completely unaware of the change and keeps a consistent memory snapshot. As a result, Redis can take a point-in-time snapshot of potentially gigabytes of memory very quickly and efficiently, while using only a small fraction of additional memory!
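
Here is a toy, Unix-only Python sketch of those fork-snapshot semantics. It is an analogy for the mechanism, not Redis's actual C implementation; the dict stands in for the in-memory dataset:

    import os
    import time

    data = {"counter": 0}  # stands in for Redis's in-memory dataset

    pid = os.fork()
    if pid == 0:
        # Child: sees a point-in-time snapshot of the parent's memory.
        time.sleep(1)
        print("child sees:", data["counter"])  # prints 0, not 42
        os._exit(0)
    else:
        data["counter"] = 42  # parent keeps mutating; the kernel copies pages on write
        os.waitpid(pid, 0)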
