Redis Chat (1): Building a Knowledge Graph

Scenario: Redis Interview

Interviewer : I saw on your resume that you are proficient in using Redis, so what is Redis used for?

Xiao Ming : (I am delighted, isn't Redis a cache?) Redis is mainly used as a cache to efficiently store non-persistent data through memory.

Interviewer : Can Redis be used as persistent storage?

Xiao Ming : Hmm... it should be possible...

Interviewer : How does Redis perform persistence operations?

Xiao Ming : Well...not too clear.

Interviewer : What are the memory elimination mechanisms of Redis?

Xiao Ming : Hmm...I don't know

Interviewer : What else can we do with Redis? Which Redis commands are used for that?

Xiao Ming : I only know that Redis can also do distributed locks, message queues...

Interviewer : Alright, let's move on to the next topic...

Thinking : Obviously, Xiao Ming's answers about Redis did not go well. Redis is something we use every day at work, so why does it cost us points in an interview?

As developers, we are used to working with tools that experts have already packaged for us, which lets us focus on business development. But if we do not know how these everyday tools are implemented underneath, we still cannot impress the interviewer.

This article summarizes some key knowledge points of Redis, covering both principles and applications. I hope it helps.

1. What is Redis

REmote DIctionary Server (Redis) is a key-value storage system written by Salvatore Sanfilippo.

Redis is an open source, log-type Key-Value database written in ANSI C and released under the BSD license. It supports network access, can run purely in memory or with persistence, and provides APIs for many languages.

This is the description of Redis from the Redis tutorial, and it is official and fairly standard: a log-type Key-Value database that can be memory-based or persistent. I think this description is apt and comprehensive.

1.1 Industry Status of Redis

Redis is among the most widely used storage middleware in Internet technology. It is widely praised for its very high performance, thorough documentation, versatile use cases, and rich client support, and above all for its read and write speed, which has made it the most popular middleware in its field. Almost every software company uses Redis, including many large Internet companies such as JD.com, Alibaba, Tencent, and GitHub. Redis has therefore become an essential skill for backend developers.

1.2 Knowledge Graph

In my opinion, learning any technology requires a clear outline and structure; otherwise you will not know what you have learned and how much is still missing. A book without a table of contents loses its soul.

So I tried to summarize the knowledge graph of Redis as a mind map. As shown in the figure below, the knowledge points may not be complete yet and will be updated and supplemented over time.

Figure: Redis knowledge graph

The topics of this series will largely follow this mind map. This article introduces the basics of Redis, and later articles will cover Redis data structures, applications, persistence, and other aspects in detail.

2. Advantages of Redis

2.1 Fast

As a caching tool, Redis is best known for being fast. How fast? On a single machine, Redis can reach roughly 110,000 reads per second and 81,000 writes per second. So why is Redis so fast?

  • The vast majority of requests are pure memory operations, which are very fast;
  • Data is stored in purpose-built data structures with very fast lookups. A hash table, for example, searches and inserts in O(1) time on average;
  • Commands run on a single thread, which avoids unnecessary context switches and race conditions. No CPU time is lost switching between processes or threads, and there are no locks to manage and no deadlocks to pay for;
  • A non-blocking I/O multiplexing mechanism is used.
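
To make the single-thread and I/O multiplexing points concrete, here is a minimal sketch in Python. It is not how Redis itself is implemented (Redis is written in C with its own event loop over mechanisms such as epoll or kqueue). The toy text protocol with only SET and GET, and the names handle_ready and store, are assumptions made for illustration. One thread waits on several connections with the standard selectors module and serves whichever one is ready:

```python
import selectors
import socket


def handle_ready(sel, store):
    """Serve every connection that has data waiting; return how many were served."""
    served = 0
    for key, _events in sel.select(timeout=1):
        conn = key.fileobj
        parts = conn.recv(1024).decode().split()
        if len(parts) == 3 and parts[0] == "SET":
            store[parts[1]] = parts[2]
            reply = "OK"
        elif len(parts) == 2 and parts[0] == "GET":
            reply = store.get(parts[1], "(nil)")
        else:
            reply = "ERR"
        conn.sendall(reply.encode())
        served += 1
    return served


sel = selectors.DefaultSelector()
store = {}  # the in-memory "database", touched by one thread only

# Two in-process connections stand in for two network clients.
client_a, server_a = socket.socketpair()
client_b, server_b = socket.socketpair()
for server_side in (server_a, server_b):
    server_side.setblocking(False)
    sel.register(server_side, selectors.EVENT_READ)

client_a.sendall(b"SET name redis")
handle_ready(sel, store)
reply_a = client_a.recv(1024).decode()

client_b.sendall(b"GET name")
handle_ready(sel, store)
reply_b = client_b.recv(1024).decode()

print(reply_a, reply_b)

sel.close()
for sock in (client_a, server_a, client_b, server_b):
    sock.close()
```

Because only one thread ever touches store, no lock is needed around it, which is the point the list above makes about avoiding lock overhead.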

2.2 Rich data types

Redis has 5 commonly used data types: String, List, Hash, Set, and Zset (sorted set), each with its own uses.

2.3 Atomicity, supporting transactions

Every single Redis command is atomic. Redis also supports transactions, which execute several commands together as one atomic unit (MULTI/EXEC).

2.4 Rich Features

Redis has rich features. For example, it can serve as a distributed lock, persist data, and act as a message queue, leaderboard, or counter; it also supports publish/subscribe, notifications, key expiration, and more. Whenever we reach for middleware to solve a practical problem, Redis usually has a role to play.

3. Comparison of Redis and Memcache

Memcache and Redis are both excellent, high-performance in-memory data stores, so whenever Redis comes up it is usually compared with Memcache. (Why compare? To show how good Redis is, of course. Without comparison, there is no harm~) The comparison covers the following aspects:

  1. Storage method
  • Memcache stores all data in memory, so everything is lost after a power failure or restart; data cannot be persisted, and the data set cannot exceed the memory size.
  • Redis can save its data to the hard disk, which provides data persistence.

  2. Supported data types
  • Memcache's data type support is simple: values are plain strings only.
  • Redis has rich data types, including String, List, Hash, Set, and Zset.

  3. Underlying model
  • The two differ in their underlying implementation and in the application protocol used to communicate with clients.
  • Early versions of Redis built their own VM (virtual memory) mechanism, because going through general-purpose system calls costs extra time for moving data and issuing requests.

  4. Size of stored values
  • A single Redis string value can be up to 512MB, while a Memcache value is limited to 1MB by default.

Seeing this, do you think Redis is all advantages and no flaws? In fact, Redis still has many shortcomings. How do we usually overcome them?

4. Problems and solutions of Redis

4.1 The problem of double-write consistency of cache database

Problem: Consistency is a very common problem in distributed systems. It is generally divided into two types: strong consistency and eventual consistency. If we need strong consistency, Redis cannot deliver it: once data is written to both the database and the cache, inconsistencies are bound to occur at some point. Redis can only guarantee eventual consistency.

Solution: How do we guarantee eventual consistency?

  • The first way is to set an expiration time on the cache. After a cache entry expires, the next read falls through to the database and repopulates the cache, which brings the two back into agreement.

  • If you do not set an expiration time, you must choose the right update strategy: update the database first, then delete the cache. The cache deletion itself may fail, so put the key to be deleted into a message queue and keep retrying until the deletion succeeds.
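
The "update the database, then delete the cache, and retry failed deletions" strategy can be sketched as follows. This is a simplified, single-process illustration in Python, not a production design: the database is a plain dict, the message queue is a deque, and FlakyCache is a hypothetical cache client whose first delete calls fail on purpose so that the retry path runs.

```python
from collections import deque


class FlakyCache:
    """Hypothetical cache client whose first delete calls fail."""

    def __init__(self, data, failures=1):
        self.data = dict(data)
        self.failures = failures

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value

    def delete(self, key):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("cache temporarily unreachable")
        self.data.pop(key, None)


def update(db, cache, retry_queue, key, value):
    db[key] = value                  # 1. update the database first
    try:
        cache.delete(key)            # 2. then delete the cache entry
    except ConnectionError:
        retry_queue.append(key)      # 3. on failure, queue the key for retry


def drain_retries(cache, retry_queue, max_attempts=10):
    """Retry queued deletions; return True once the queue is empty."""
    for _ in range(max_attempts):
        if not retry_queue:
            return True
        try:
            cache.delete(retry_queue[0])
            retry_queue.popleft()
        except ConnectionError:
            continue                 # leave the key queued and try again
    return not retry_queue


def read(db, cache, key):
    """Read through the cache, repopulating it from the database on a miss."""
    value = cache.get(key)
    if value is None:
        value = db.get(key)
        if value is not None:
            cache.set(key, value)
    return value


db = {"price": 100}
cache = FlakyCache({"price": 100}, failures=2)
retry_queue = deque()

update(db, cache, retry_queue, "price", 120)
stale = read(db, cache, "price")             # deletion failed, cache is stale
drained = drain_retries(cache, retry_queue)  # retry until the deletion succeeds
fresh = read(db, cache, "price")             # cache miss, reloaded from the database
print(stale, drained, fresh)
```

Between the failed deletion and the successful retry, readers still see the old value. That window is exactly why this approach gives eventual rather than strong consistency.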

4.2 Cache Avalanche Problem

Problem: We have all seen an avalanche in the movies: everything is calm at first, then in an instant the slope collapses with devastating force. The same happens here. If our code gives many cache entries the same expiration time, they all expire at the same moment, and every request then goes back to the database to reload the data. The database ends up with too many connections and too much pressure, and it crashes.

Solution:

  • Add a random value when setting the cache expiration time.
  • Set up a double cache: cache 1 has an expiration time and cache 2 does not. When cache 1 expires, return the value from cache 2 directly and start a process to update both caches.
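
As a small illustration of the first point, the sketch below adds random jitter to a base expiration time. The function name ttl_with_jitter and the numbers (a one-hour base with up to five minutes of jitter) are assumptions chosen for the example; the resulting value would be passed as the expiration time when writing the key to the cache.

```python
import random


def ttl_with_jitter(base_seconds, max_jitter_seconds, rng=random):
    """Return the base TTL plus a random offset, so keys do not expire together."""
    return base_seconds + rng.randint(0, max_jitter_seconds)


rng = random.Random(42)  # seeded only to make the demo repeatable
ttls = [ttl_with_jitter(3600, 300, rng) for _ in range(1000)]

# Expirations are now spread over a 5-minute window instead of one instant.
print(min(ttls), max(ttls), len(set(ttls)))
```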

4.3 Cache penetration problem

Problem: Cache penetration means that abnormal users (such as attackers) deliberately request data that does not exist in the cache, so every request falls through to the database and the database connections become overloaded.

Solution:

  • Use a mutex lock. When a cache entry is missing, do not hit the database directly; acquire the lock first and then query the database. If the lock cannot be acquired, sleep for a while and try again.

  • Adopt an asynchronous update strategy. Return immediately whether or not the key has a value. Keep an expiration time inside the value itself; if the entry has expired, start a thread that reads the database and updates the cache asynchronously. This requires cache warm-up (loading the cache before the project starts serving traffic).

  • Provide an interception mechanism that can quickly decide whether a request is valid. For example, use a Bloom filter to maintain the set of valid keys and quickly check whether the key carried in a request is valid. If it is not, return immediately.
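
The Bloom filter idea can be sketched in a few lines of Python. This is a minimal, illustrative version and not a tuned implementation: the bit array size, the number of hash functions, the use of SHA-256, and the helper lookup are all assumptions made for the example, and the cache layer is left out to keep the focus on the filter. A Bloom filter never misses a key that was added, but it may occasionally report a key that was never added (a false positive), so a few bogus requests can still reach the database.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, key):
        # Derive several bit positions from salted SHA-256 digests of the key.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def lookup(key, bloom, db, stats):
    """Reject keys the filter has never seen before they reach the database."""
    if not bloom.might_contain(key):
        stats["rejected"] += 1
        return None
    stats["db_queries"] += 1
    return db.get(key)


db = {"user:1": "Alice", "user:2": "Bob"}
bloom = BloomFilter()
for valid_key in db:
    bloom.add(valid_key)

stats = {"rejected": 0, "db_queries": 0}
hit = lookup("user:1", bloom, db, stats)
for i in range(100):                      # simulated malicious requests
    lookup(f"user:bogus-{i}", bloom, db, stats)

print(hit, stats)
```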

4.4 Cache Concurrency Competition Issue

Problem:

Cache concurrency competition mainly occurs when multiple threads set the same key at the same time, which leads to inconsistent data.

For example, suppose Redis stores a key named amount with the value 100, and two threads each add 100 to it at the same time and write it back. The correct result is 300. But both threads read 100, so each writes 200, and the final result is 200. This is the cache concurrency competition problem.

Solution:

  • If the operations have no ordering requirement, set up a distributed lock and let the threads compete for it; whoever grabs the lock first executes first. The lock can be implemented with ZooKeeper or with Redis itself.
  • You can use Redis' atomic INCR command (or INCRBY to add an arbitrary amount).
  • If the operations must run in order, set up a message queue, add the operations to it, and execute them strictly in queue order.
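
The amount example can be reproduced with a small in-process simulation in Python. The Counter class below is a hypothetical stand-in for a single Redis key: get and set model a separate read followed by a write, while incrby models an atomic increment in the spirit of INCRBY (here a lock provides the atomicity that the Redis server provides by executing one command at a time).

```python
import threading


class Counter:
    """Hypothetical stand-in for one key holding an integer."""

    def __init__(self, value):
        self.value = value
        self._lock = threading.Lock()

    def get(self):
        return self.value

    def set(self, value):
        self.value = value

    def incrby(self, amount):
        # Read, add, and write happen as one indivisible step.
        with self._lock:
            self.value += amount
            return self.value


# Lost update: both "threads" read 100 before either one writes.
unsafe = Counter(100)
seen_by_a = unsafe.get()
seen_by_b = unsafe.get()
unsafe.set(seen_by_a + 100)
unsafe.set(seen_by_b + 100)
lost_update_result = unsafe.get()

# Atomic increment: each thread's whole update is applied.
safe = Counter(100)
threads = [threading.Thread(target=safe.incrby, args=(100,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
atomic_result = safe.get()

print(lost_update_result, atomic_result)  # 200 300
```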

5. Redis expiration policy

As the amount of data in Redis grows, memory usage keeps increasing. We assume that keys are deleted as soon as they reach their expiration time, yet after that time has passed memory usage is still very high. Why?

Redis expires keys using a combination of periodic deletion and lazy deletion.

5.1 Periodic deletion

Periodic deletion is different from timed deletion:

  • Timed deletion means deleting each key strictly at its expiration time, which requires a timer that continuously watches all keys to decide whether they must be deleted. This consumes a great deal of CPU and lowers overall resource efficiency, so Redis uses periodic deletion instead.

  • The interval for periodic deletion is configurable; for example, Redis can check every 100ms. Even so, it cannot check every key on each pass, because that would still stall Redis. It only checks a random sample of keys, so some expired keys are not deleted on time. This is where lazy deletion comes in handy.

5.2 Lazy delete

A simple example: in middle school there was usually too much homework to finish. The teacher would announce that a paper would be reviewed in the next class and ask whether everyone had finished it. In fact, many of us had not, so we rushed to finish it just before that class.

Lazy deletion works the same way. The value should already be gone, but it is still there. When you try to get the key, Redis notices that the key has expired, deletes it on the spot, and returns 'no such value, already expired!'.
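
Periodic deletion and lazy deletion can be sketched together in Python. This is a toy model, not Redis' actual algorithm: the class ExpiringStore, the injected clock, and the sample size of 20 are assumptions made for illustration. The sweep only inspects a random sample of keys that have an expiration time, and get removes an expired key at the moment it is accessed.

```python
import random


class ExpiringStore:
    def __init__(self, clock):
        self.clock = clock      # callable returning the current time in seconds
        self.data = {}
        self.expires = {}       # key -> absolute expiration time

    def set(self, key, value, ttl=None):
        self.data[key] = value
        if ttl is None:
            self.expires.pop(key, None)
        else:
            self.expires[key] = self.clock() + ttl

    def _expired(self, key):
        deadline = self.expires.get(key)
        return deadline is not None and deadline <= self.clock()

    def _delete(self, key):
        self.data.pop(key, None)
        self.expires.pop(key, None)

    def get(self, key):
        # Lazy deletion: an expired key is removed only when someone asks for it.
        if self._expired(key):
            self._delete(key)
            return None
        return self.data.get(key)

    def periodic_sweep(self, sample_size=20, rng=random):
        # Periodic deletion: check a random sample, not every key.
        keys = list(self.expires)
        removed = 0
        for key in rng.sample(keys, min(sample_size, len(keys))):
            if self._expired(key):
                self._delete(key)
                removed += 1
        return removed


now = [0.0]                                  # fake clock, advanced by hand
store = ExpiringStore(clock=lambda: now[0])
for i in range(100):
    store.set(f"k{i}", i, ttl=10)
store.set("forever", "x")                    # no expiration time

now[0] = 11.0                                # every k* key is now expired
removed = store.periodic_sweep(sample_size=20, rng=random.Random(1))
left_after_sweep = len(store.data)           # expired keys are still in memory

survivor = next(iter(store.expires))         # an expired key the sweep missed
value = store.get(survivor)                  # lazy deletion removes it here
print(removed, left_after_sweep, value, len(store.data))
```

After the sweep, most expired keys are still held in memory, and each one disappears only when it is read or when a later sweep happens to sample it.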

Now that we have an expiration policy of periodic deletion + lazy deletion, can we sit back and relax? Not quite. If an expired key is missed by the periodic check and is never accessed, it stays in memory indefinitely. That is why we need a memory elimination mechanism.

5.3 Redis memory elimination mechanism

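Redis generally offers 6 memory elimination (eviction) policies:

  • noeviction: do not evict anything; writes that need more memory return an error.
  • allkeys-lru: evict the least recently used keys, considering all keys.
  • volatile-lru: evict the least recently used keys among those with an expiration time.
  • allkeys-random: evict random keys, considering all keys.
  • volatile-random: evict random keys among those with an expiration time.
  • volatile-ttl: evict the keys with the shortest remaining time to live among those with an expiration time.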

So how do we configure the memory elimination mechanism of Redis?

In redis.conf, we can set the policy with the maxmemory-policy directive (uncomment the line and choose a policy):

# maxmemory-policy allkeys-lru
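
To show what an LRU policy such as allkeys-lru means in practice, here is a minimal sketch of exact LRU eviction in Python. It is an illustration only: Redis approximates LRU by sampling keys rather than tracking an exact order, and it limits memory in bytes rather than by key count. The class LRUCache and its max_keys limit are assumptions made for the example.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, max_keys):
        self.max_keys = max_keys
        self.items = OrderedDict()   # least recently used first, most recent last

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # a read counts as a use
        return self.items[key]

    def set(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        while len(self.items) > self.max_keys:
            self.items.popitem(last=False)  # evict the least recently used key


cache = LRUCache(max_keys=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")         # "a" is now more recently used than "b"
cache.set("c", 3)      # over the limit, so "b" is evicted

print(list(cache.items))  # ['a', 'c']
```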

6. Summary

This article takes a first look at Redis and roughly sorts out its knowledge graph, which shows how many knowledge points there are to learn. We then analyzed the strengths and weaknesses of Redis: its efficient memory-based read and write speed and its rich data types, along with how to handle data consistency, cache penetration, and cache avalanches. Finally, we covered Redis' expiration policy and memory elimination mechanism.

I hope everyone now has a basic understanding of Redis. In the next article, we will analyze Redis data structures, how each data type is implemented, and the corresponding commands.

Author: Yang Heng

Source: CreditEase Institute of Technology
