The 50 most complete Redis interview questions in history

1. What is Redis?

Redis is an in-memory key-value database, similar in spirit to memcached. The whole data set is held in memory, and it is periodically flushed to disk asynchronously for persistence. Because operations run purely in memory, Redis performs very well: a single instance can handle more than 100,000 read and write operations per second, which makes it one of the fastest key-value stores available.

Performance is not the only strength of Redis. Its biggest attraction is that it supports multiple data structures. In addition, a single value can be as large as 512 MB, whereas memcached limits an item to 1 MB by default. This makes many useful things possible: for example, its List (a doubly linked list) can serve as a FIFO queue for a lightweight, high-performance message queue service, and its Set can back a high-performance tag system. Redis can also set an expire time on stored keys, so it can be used as an enhanced version of memcached.

The main disadvantage of Redis is that database capacity is limited by physical memory, so it cannot serve high-performance reads and writes over very large data sets. Suitable scenarios are therefore mainly high-performance operations on relatively small amounts of data.

2. What are the advantages of Redis compared to memcached?
(1) All memcached values are simple strings, while Redis, as its replacement, supports richer data types

(2) Redis is fast: it is at least comparable to memcached on simple reads and writes, and its richer commands can save round trips

(3) Redis can persist its data

3. Which data types does Redis support?

String, List, Set, Sorted Set, Hash

4. What physical resources does Redis mainly consume?

RAM.

5. What is the full name of Redis?

Remote Dictionary Server.

6. What data eviction policies does Redis have?

noeviction: returns an error when the memory limit has been reached and the client tries to run a command that would use more memory (most write commands; DEL and a few others are exceptions).

allkeys-lru: evicts the least recently used (LRU) keys to make room for newly added data.

volatile-lru: evicts the least recently used (LRU) keys, but only among keys that have an expire time set, to make room for newly added data.

allkeys-random: evicts random keys to make room for newly added data.

volatile-random: evicts random keys to make room for newly added data, but only among keys that have an expire time set.

volatile-ttl: evicts keys that have an expire time set, preferring those with the shortest remaining time to live (TTL), to make room for newly added data.
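
As a rough illustration of how these policies differ, the sketch below picks an eviction victim from a small in-memory table. The `Entry` class, the `pick_victim` function, and the field names are hypothetical and exist only for this example. It is a toy model: it scans every key and uses exact LRU, which is a simplification of what a real server does.

```python
import random
from dataclasses import dataclass
from typing import Dict, Optional

POLICIES = {
    "noeviction", "allkeys-lru", "volatile-lru",
    "allkeys-random", "volatile-random", "volatile-ttl",
}


@dataclass
class Entry:
    last_access: float           # time of the last read or write
    expire_at: Optional[float]   # absolute expiry time, or None if no expire is set


def pick_victim(table: Dict[str, Entry], policy: str,
                rng: Optional[random.Random] = None) -> Optional[str]:
    """Return the key this toy model would evict, or None if nothing is eligible."""
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy}")
    if policy == "noeviction":
        return None  # writes that need more memory fail instead of evicting
    scope, rule = policy.split("-", 1)
    if scope == "volatile":
        # Only keys with an expire time are candidates.
        candidates = {k: e for k, e in table.items() if e.expire_at is not None}
    else:
        candidates = dict(table)
    if not candidates:
        return None
    if rule == "lru":
        return min(candidates, key=lambda k: candidates[k].last_access)
    if rule == "ttl":
        return min(candidates, key=lambda k: candidates[k].expire_at)
    return (rng or random).choice(sorted(candidates))  # rule == "random"


table = {
    "session:1": Entry(last_access=10.0, expire_at=500.0),
    "session:2": Entry(last_access=30.0, expire_at=100.0),
    "config": Entry(last_access=5.0, expire_at=None),
}
for name in ("allkeys-lru", "volatile-lru", "volatile-ttl"):
    print(name, "->", pick_victim(table, name))
```

Note how `config`, which has no expire time, can only be chosen by an allkeys policy.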

7. Why is there no official Windows version of Redis?
Because the current Linux version is quite stable and has a large user base, there is no need to develop a Windows version; doing so would instead introduce compatibility problems.

8. What is the maximum storage capacity of a string value?

512 MB.

9. Why does Redis need to put all data in memory?

To achieve the fastest possible read and write speed, Redis keeps all data in memory and writes it to disk asynchronously. This gives Redis both speed and data persistence. If the data were not kept in memory, disk I/O speed would seriously limit Redis performance. As memory keeps getting cheaper, Redis is likely to become more and more popular.

If a maximum memory limit is set and no eviction policy is enabled, new data cannot be inserted once the data set reaches that limit.

10. How should a Redis cluster be set up? What are the options?

(1) twemproxy. It works roughly like a proxy, and using it is no different from using an ordinary Redis instance. You set up several Redis instances behind it and point the application at twemproxy wherever it used to connect to Redis. Acting as a proxy, twemproxy receives each request, uses a consistent hashing algorithm to forward it to a specific Redis instance, and returns the result to the client. It is easy to adopt (compared with plain Redis, only the connection port needs to change), which makes it a first choice for scaling out older projects. Problems: the single twemproxy instance and port can itself become a bottleneck, and when the number of Redis nodes changes and keys hash to different nodes, the data is not moved to the new node automatically.

(2) codis. A widely used cluster solution. It works much like twemproxy, but when the number of nodes changes it supports migrating data from the old nodes to the new hash nodes.

(3) Redis Cluster, built into Redis since version 3.0. Its distribution algorithm is not consistent hashing but hash slots, and it natively supports attaching slave nodes to each node. See the official documentation for details.

(4) Sharding in the business code layer. Start several unrelated Redis instances, hash each key in application code, and then read or write the data on the corresponding instance. This approach places high demands on the hashing layer: things to consider include a fallback algorithm when a node fails, scripted automatic recovery after data fluctuation, instance monitoring, and so on.
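
To give a feel for option (4), here is a minimal sketch of sharding in application code with a plain hash-modulo rule. The node addresses and the helper names `shard_for` and `moved_fraction` are made up for illustration, and MD5 is just one convenient stable hash. The sketch also measures how many keys would land on a different node if a fourth instance were added, which is the main weakness of the naive modulo rule.

```python
import hashlib
from typing import List


def shard_for(key: str, nodes: List[str]) -> str:
    """Map a key to one node by hashing it and taking the remainder."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def moved_fraction(keys: List[str], before: List[str], after: List[str]) -> float:
    """Fraction of keys whose target node differs between two node lists."""
    moved = sum(1 for k in keys if shard_for(k, before) != shard_for(k, after))
    return moved / len(keys)


# Hypothetical instance addresses.
nodes = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]
keys = [f"user:{i}" for i in range(10_000)]

print("user:42 is stored on", shard_for("user:42", nodes))
fraction = moved_fraction(keys, nodes, nodes + ["redis-d:6379"])
print(f"{fraction:.0%} of keys change node when a 4th instance is added")
```

With a uniform hash, roughly three quarters of the keys move when going from three to four nodes, which is why the hashing layer needs a plan for node changes.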

11. Under what circumstances will the Redis cluster solution cause the entire cluster to be unavailable?

Take a three-node cluster with nodes A, B, and C and no replicas. If node B fails, the hash slots it served (5501-11000) are no longer covered by any node, so by default the whole cluster considers itself unavailable and stops serving requests.

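12. MySQL holds 20 million rows, and Redis stores only 200,000 of them. How do you ensure that the data in Redis is all hot data?

When the in-memory data set in Redis grows to a certain size, the data eviction policy kicks in. Set a memory limit (maxmemory) and choose an LRU policy such as allkeys-lru, so that rarely accessed keys are evicted and the hot data stays in memory.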

13. What are the suitable scenarios for Redis?

(1) Session Cache

The most common use of Redis is as a session cache. Its advantage over other stores (such as Memcached) for this purpose is that Redis provides persistence. Even for a cache that does not strictly require consistency, most users would be unhappy if all their shopping cart information were lost. With persistence, that no longer has to happen.

Fortunately, as Redis has matured over the years, it is easy to find documentation on how to use Redis properly for session caching. Even the well-known e-commerce platform Magento provides a Redis plugin.

(2) Full page cache (FPC)

Beyond basic session tokens, Redis also makes a very simple FPC platform. Going back to the consistency issue: even if the Redis instance is restarted, users will not see a drop in page loading speed, because the cache is persisted to disk. This is a big improvement, similar to local FPC in PHP.

Taking Magento as an example again, Magento provides a plugin to use Redis as a full-page cache backend.

In addition, for WordPress users, Pantheon has a very good plugin, wp-redis, which helps pages you have already visited load as fast as possible.

(3) Queue

One advantage of Redis among in-memory storage engines is that it provides list and set operations, which makes it a good message queue platform. Using Redis as a queue feels similar to the push / pop operations on a list in a programming language such as Python.

A quick search for "Redis queues" on Google turns up a large number of open source projects that use Redis to build back-end tools for all kinds of queueing needs. For example, Celery has a backend that uses Redis as a broker.

(4) Leaderboard / Counter

Redis is very good at incrementing and decrementing numbers in memory, and Sets and Sorted Sets make these operations simple to use; Redis happens to provide exactly these two data structures. To get the top 10 users from a sorted set, which we call "user_scores", we just need to run:

ZREVRANGE user_scores 0 9

ZREVRANGE returns members from the highest score to the lowest (ZRANGE returns them in increasing order of score), and the indexes 0 and 9 are inclusive. If you want to return the users together with their scores, run:

ZREVRANGE user_scores 0 9 WITHSCORES

Agora Games is a good example: its leaderboard is implemented in Ruby and uses Redis to store the data.
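
To make the ranking order concrete, the following sketch mimics in plain Python what a highest-score-first query with scores (ZREVRANGE ... WITHSCORES) returns from a sorted set. The function name `zrevrange_withscores` and the sample data are invented for this example, and only non-negative indexes are handled.

```python
from typing import Dict, List, Tuple


def zrevrange_withscores(zset: Dict[str, float], start: int,
                         stop: int) -> List[Tuple[str, float]]:
    """Return (member, score) pairs from highest to lowest score.

    Indexes are zero-based and inclusive, as in the Redis command.
    Members with equal scores are ordered in reverse lexicographic order.
    """
    ordered = sorted(zset.items(), key=lambda item: (item[1], item[0]), reverse=True)
    return ordered[start:stop + 1]


user_scores = {"alice": 320, "bob": 150, "carol": 320, "dave": 90}
print(zrevrange_withscores(user_scores, 0, 2))
# [('carol', 320), ('alice', 320), ('bob', 150)]
```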

(5) Publish / Subscribe

Last (but certainly not least) is the publish / subscribe feature of Redis. There are many use cases for publish / subscribe. I have seen people use it for social network connections, as a trigger for scripts based on publish / subscribe events, and even to build chat systems with it. (No, really, you can check it.)

14. What are the Java clients supported by Redis? Which is the official recommendation?

Redisson, Jedis, lettuce, etc., the official recommendation is Redisson.

15. What is the relationship between Redis and Redisson?

Redisson is an advanced distributed coordination Redis client, which can help users easily implement some Java objects in a distributed environment (Bloom filter, BitSet, Set, SetMultimap, ScoredSortedSet, SortedSet, Map, ConcurrentMap, List, ListMultimap, Queue, BlockingQueue, Deque, BlockingDeque, Semaphore, Lock, ReadWriteLock, AtomicLong, CountDownLatch, Publish / Subscribe, HyperLogLog).

16. What are the advantages and disadvantages of Jedis compared with Redisson?

Jedis is a Java client for Redis whose API provides comprehensive support for Redis commands. Redisson implements distributed and scalable Java data structures. Compared with Jedis, its feature set is narrower: it does not support string operations, nor Redis features such as sorting, transactions, pipelining, and partitioning. The purpose of Redisson is to promote separation of concerns, so that users can focus on business logic rather than on Redis itself.

17. How does Redis set the password and verify the password?
Set password: config set requirepass 123456

Verify password: auth 123456

18. Talk about the concept of Redis hash slot?
Redis Cluster does not use consistent hashing; it introduces the concept of hash slots instead. A cluster has 16384 hash slots. For each key, the CRC16 checksum of the key is taken modulo 16384 to decide which slot the key belongs to, and each node in the cluster is responsible for a subset of the hash slots.
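
The slot calculation can be sketched in a few lines of Python. The function name `key_slot` is an illustrative choice. The sketch assumes the CRC16 in question is the XMODEM variant, which the standard library exposes as `binascii.crc_hqx` with an initial value of 0, and it deliberately ignores hash tags (the `{...}` syntax), so it only matches a real cluster for keys without braces.

```python
import binascii

HASH_SLOTS = 16384


def key_slot(key: str) -> int:
    """Return the hash slot for a key: CRC16(key) mod 16384 (hash tags ignored)."""
    # crc_hqx with initial value 0 computes CRC-16/XMODEM (polynomial 0x1021).
    return binascii.crc_hqx(key.encode("utf-8"), 0) % HASH_SLOTS


for key in ("foo", "hello", "user:1000"):
    print(key, "->", key_slot(key))
```

Because the slot depends only on the key, every client and every node computes the same answer without any coordination.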

19. What is the master-slave replication model of Redis cluster?
To keep the cluster available when some nodes fail or cannot communicate with the majority of nodes, Redis Cluster uses a master-slave replication model in which every hash slot has N copies: one master and N-1 replicas.

20. Will a Redis cluster lose write operations? Why?
Redis Cluster does not guarantee strong consistency. Because replication is asynchronous, in practice the cluster may, under certain conditions, lose writes that it has already acknowledged.

21. How are Redis clusters replicated?

Asynchronous replication

22. What is the maximum number of nodes in the Redis cluster?

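16,384, which is the number of hash slots. In practice the recommended maximum is on the order of 1,000 nodes.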

23. How to choose database in Redis cluster?
A Redis cluster currently cannot select a database; only database 0 is available.

24. How to test Redis connectivity?

Use the ping command. A reachable server replies with PONG.

25. What is pipelining used for in Redis?
A request / response server can be implemented so that it processes new requests even before the client has read the earlier responses. This allows a client to send multiple commands to the server without waiting for each reply, and then read all the replies in a single step.

This is pipelining, a technique that has been in wide use for decades. For example, many POP3 implementations already support it, which greatly speeds up downloading new mail from the server.

26. How to understand Redis transactions?
A transaction is a single isolated operation: all commands in a transaction are serialized and executed sequentially. While a transaction is executing, it is not interrupted by command requests from other clients.

A transaction is atomic in the sense that its commands are either all processed or not processed at all. Note that Redis does not roll back: if one command fails while the transaction executes, the remaining commands still run.

27. Which commands are related to Redis transactions?
MULTI, EXEC, DISCARD, WATCH, UNWATCH

28. How to set the expiration time and permanent validity of Redis key?
EXPIRE and PERSIST commands.

29. How does Redis do memory optimization?
Use hashes whenever possible. A small hash (that is, a hash that stores only a few fields) uses very little memory, so you should model your data as hashes as much as possible. For example, if you have a user object in your web system, do not set a separate key for the user's name, surname, email, and password; store all of the user's information in a single hash instead.

30. How does the Redis recycling process work?
A client runs a new command that adds new data.

Redis checks memory usage, and if it is greater than the maxmemory limit, it evicts keys according to the configured policy.

A new command is executed, and so on.

So we continuously cross the memory limit by going over it, and then return under it by evicting keys.

If a command results in a lot of memory being used (for example, a large set intersection stored into a new key), the memory limit can be exceeded by a noticeable amount for a while.
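
The cycle described above (write first, then check the limit, then evict) can be modeled with a small class. This is only a toy: `TinyLruCache` is a made-up name, the "memory limit" is counted in entries rather than bytes, and the LRU order is exact.

```python
from collections import OrderedDict
from typing import List, Optional


class TinyLruCache:
    """Toy model of the write / check / evict cycle with an exact LRU order."""

    def __init__(self, max_entries: int) -> None:
        self.max_entries = max_entries            # stands in for maxmemory
        self.data: "OrderedDict[str, str]" = OrderedDict()
        self.evicted: List[str] = []

    def get(self, key: str) -> Optional[str]:
        if key not in self.data:
            return None
        self.data.move_to_end(key)                # mark as most recently used
        return self.data[key]

    def set(self, key: str, value: str) -> None:
        self.data[key] = value                    # 1. the command adds data
        self.data.move_to_end(key)
        while len(self.data) > self.max_entries:  # 2. over the limit: evict
            old_key, _ = self.data.popitem(last=False)
            self.evicted.append(old_key)          # 3. back under the limit


cache = TinyLruCache(max_entries=2)
cache.set("a", "1")
cache.set("b", "2")
cache.get("a")        # "a" is now more recently used than "b"
cache.set("c", "3")   # goes over the limit, so "b" is evicted
print(list(cache.data), cache.evicted)
```

The limit is briefly exceeded inside `set` before eviction brings the cache back under it, mirroring the boundary-crossing behavior in the text.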

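31. What eviction algorithm does Redis use?
An LRU algorithm. Redis implements an approximation of LRU that samples a small number of keys instead of tracking the exact access order.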

32. How does Redis do a lot of data insertion?
Starting with Redis 2.6, redis-cli supports a mode called pipe mode (redis-cli --pipe), designed for inserting large amounts of data.
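
Pipe mode reads commands written in the Redis protocol from standard input. As a sketch of how such input might be produced, the helper below encodes commands as protocol arrays of bulk strings. The function name `encode_command` and the key and value names are illustrative only.

```python
def encode_command(*args: str) -> bytes:
    """Encode one command in the Redis protocol as an array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]          # number of arguments
    for arg in args:
        raw = arg.encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(raw), raw))  # byte length, then data
    return b"".join(parts)


# Build the payload for three SET commands.
payload = b"".join(
    encode_command("SET", f"key{i}", f"value{i}") for i in range(3)
)
print(payload.decode("utf-8"), end="")
```

Written to a file (say `data.txt`, a hypothetical name), the payload could then be sent with `cat data.txt | redis-cli --pipe`.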

33. Why partition Redis?
Partitioning lets Redis manage more memory, because it can use the combined memory of all machines. Without partitioning you are limited to the memory of a single machine. Partitioning also scales the computing power of Redis simply by adding machines, and its network bandwidth grows along with the number of machines and network adapters.

34. What Redis partitioning implementations do you know?
Client-side partitioning means that the client itself decides which Redis node a piece of data is written to or read from. Most clients implement client-side partitioning.

Proxy partitioning means that the client sends requests to a proxy, and the proxy decides which node to write to or read from. The proxy chooses which Redis instances to query according to the partitioning rules and returns the Redis responses to the client. Twemproxy is a proxy implementation for Redis and memcached.

Query routing means that the client sends a request to any Redis instance, and Redis forwards the request to the correct node. Redis Cluster implements a hybrid form of query routing: requests are not forwarded directly from one node to another; instead, the client is redirected to the correct node.

35. What are the disadvantages of Redis partition?
Operations involving multiple keys are usually not supported. For example, you cannot find the intersection of two sets, because they may be stored in different Redis instances (in fact, there are ways in this case, but you cannot directly use the intersection instruction).

If you operate multiple keys at the same time, you cannot use Redis transactions.

The granularity of partitioning is the key, so a data set stored under a single huge key, such as a very large sorted set, cannot be sharded.

With partitioning, data handling becomes more complicated. For example, to make a backup you must collect the RDB / AOF files from several Redis instances and hosts at the same time.

Adding or removing capacity dynamically can be very complicated. Redis Cluster supports rebalancing data in a way that is mostly transparent to users while nodes are added or removed at runtime, but other approaches such as client-side partitioning and proxy partitioning do not support this. A technique called pre-sharding helps with this problem.

36. How to expand the capacity of Redis persistent data and cache?
If Redis is used as a cache, use consistent hashing to achieve dynamic scaling.

If Redis is used as a persistent storage, a fixed keys-to-nodes mapping relationship must be used, and the number of nodes cannot be changed once determined. Otherwise (in the case where Redis nodes need to change dynamically), you must use a system that can rebalance data at runtime, and currently only Redis clusters can do this.
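
For the cache case, a consistent hash ring might look like the following sketch. The class name `HashRing`, the node names, the use of MD5, and the choice of 100 virtual points per node are all assumptions made for the example. The key property it demonstrates is that adding a node only moves keys onto the new node, so most cached data stays where it is.

```python
import bisect
import hashlib
from typing import Dict, List


def _hash(value: str) -> int:
    """Stable 64-bit hash derived from MD5."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent hash ring with virtual nodes."""

    def __init__(self, nodes: List[str], vnodes: int = 100) -> None:
        self._ring: Dict[int, str] = {}
        for node in nodes:
            for i in range(vnodes):
                # Each node owns several points on the ring to even out the load.
                self._ring[_hash(f"{node}#{i}")] = node
        self._points = sorted(self._ring)

    def node_for(self, key: str) -> str:
        # First point clockwise from the key's hash, wrapping around at the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._ring[self._points[idx]]


old = HashRing(["cache-a", "cache-b", "cache-c"])
new = HashRing(["cache-a", "cache-b", "cache-c", "cache-d"])
keys = [f"user:{i}" for i in range(10_000)]
moved = [k for k in keys if old.node_for(k) != new.node_for(k)]
print(f"{len(moved) / len(keys):.0%} of keys moved, all of them to cache-d:",
      all(new.node_for(k) == "cache-d" for k in moved))
```

The moved keys simply miss in the cache once and are reloaded, which is acceptable for a cache but not for persistent storage.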

37. Should Redis be distributed from the start, or only once the scale grows? Why?
Since Redis is so lightweight (a spare instance uses only about 1 MB of memory), the best way to prepare for future growth is to start with many instances from the beginning. Even if you only have one server, you can run Redis in a distributed way from day one by using partitioning and starting multiple instances on the same server.

Set up more Redis instances than you need at the beginning, such as 32 or 64. For most users this may seem like extra trouble, but it is a sacrifice worth making in the long run.

This way, when your data keeps growing and you need more Redis servers, all you have to do is move Redis instances from one server to another, without having to think about repartitioning. Once you add a second server, you move half of your Redis instances from the first machine to the second.

38. What is Twemproxy?
Twemproxy is a (caching) proxy system maintained by Twitter that proxies the Memcached ASCII protocol and the Redis protocol. It is a single-threaded program written in C and runs very fast. It is open source software under the Apache 2.0 license.
Twemproxy supports automatic partitioning. If one of the Redis nodes behind it becomes unavailable, that node is excluded automatically (this changes the original key-to-instance mapping, so you should only use this feature when Redis is used as a cache).
Twemproxy itself is not a single point of failure, because you can start multiple Twemproxy instances and let your clients connect to any of them.
Twemproxy is an intermediate layer between Redis clients and servers; handling partitioning at this layer should not be complicated and should be relatively reliable.

39. Which clients support consistent hashing?
Redis-rb, Predis, etc.

40. How is Redis different from other key-value stores?

Redis has more complex data structures and provides atomic operations on them, which is an evolutionary path different from other databases. The data types of Redis are based on basic data structures and are transparent to programmers without additional abstraction.

Redis runs in memory but can persist to disk, so it trades very high read and write speed for the limitation that the data set cannot be larger than the hardware memory. Another advantage of an in-memory database is that, compared with the same complex data structures on disk, they are very simple to manipulate in memory, so Redis can do many things with little internal complexity. At the same time, the on-disk formats are compact and are generated in an append-only fashion, because they do not need random access.

41. What is the memory usage of Redis?
To give you an example: 1 million key-value pairs (keys are the numbers 0 to 999999, each value is the string "hello world") use 100 MB on my 32-bit Mac notebook. Storing the same data linearly in a single string takes only about 16 MB, because small keys and values carry a large per-key overhead. Memcached gives a similar result, with slightly less overhead than Redis, because Redis also records type information, reference counts, and so on.

Of course, with large keys and values the ratio is much better.

A 64-bit system needs more memory than a 32-bit system for the same keys and values, especially when they are small, because pointers take 8 bytes on a 64-bit system. On the other hand, 64-bit systems support much more memory, so running a large Redis server more or less requires a 64-bit system.

42. What are the ways to reduce the memory usage of Redis?
If you can, use 32-bit Redis instances. Also make good use of collection types such as hashes, lists, sorted sets, and sets, because many small key-value pairs can usually be stored together in a much more compact way.

43. What commands are used to view Redis usage and status information?
info

44. What happens when Redis runs out of memory?
If the memory limit is reached, Redis write commands return an error (read commands continue to work normally). Alternatively, you can use Redis as a cache and configure an eviction policy, so that when Redis reaches the memory limit it evicts old content.

45. Redis is single-threaded, how to increase the utilization of multi-core CPU?
You can deploy multiple Redis instances on the same server and treat them as different servers. At some point a single server will not be enough anyway, so if you want to use multiple CPUs, consider sharding.

46. How many keys can a Redis instance store? How many elements can a List, Set, or Sorted Set hold at most?
In theory Redis can handle up to 2^32 keys, and in practice it has been tested with at least 250 million keys per instance. We are testing some larger values.

Any list, set, or sorted set can hold 2^32 elements.

In other words, the storage limit of Redis is the available memory value in the system.

47. Common performance problems and solutions of Redis?
(1) It is best for the Master not to do any persistence work, such as RDB memory snapshots or AOF log files

(2) If the data is important, enable AOF on one Slave to back up the data, with the policy set to sync once per second

(3) For replication speed and connection stability, the Master and Slaves should be on the same LAN

(4) Try to avoid adding slaves to a master that is already under heavy load

(5) Do not use a graph structure for master-slave replication; a singly linked list structure is more stable, namely: Master <- Slave1 <- Slave2 <- Slave3 ...

This structure makes it easy to deal with a single point of failure and to replace the Master with a Slave. If the Master goes down, Slave1 can immediately be promoted to Master, and everything else stays unchanged.

48. What kinds of persistence methods does Redis provide?
RDB persistence takes point-in-time snapshots of your data at specified intervals.

AOF persistence logs every write operation received by the server. When the server restarts, these commands are replayed to rebuild the original data. Commands are appended to the end of the AOF file in the same format as the Redis protocol. Redis can also rewrite the file in the background so that the AOF file does not grow too large.

If you only want your data to exist while the server is running, you can also disable persistence entirely.

You can also enable both persistence methods at the same time. In this case, when Redis restarts it loads the AOF file first to rebuild the original data, because the AOF file usually holds a more complete data set than the RDB file.

The most important thing is to understand the difference between RDB and AOF persistence.
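
The idea behind an append-only log, replaying it on restart and rewriting it to keep it small, can be shown with a toy model. Everything here is illustrative: the function names are invented, only SET and DEL are modeled, and the log is a Python list rather than a file in the Redis protocol format.

```python
from typing import Dict, List, Tuple

Command = Tuple[str, ...]


def apply_command(store: Dict[str, str], cmd: Command) -> None:
    """Apply one write command to the store (only SET and DEL are modeled)."""
    name = cmd[0].upper()
    if name == "SET":
        store[cmd[1]] = cmd[2]
    elif name == "DEL":
        store.pop(cmd[1], None)
    else:
        raise ValueError(f"unsupported command: {cmd[0]}")


def replay(log: List[Command]) -> Dict[str, str]:
    """Rebuild the data set by re-executing every logged command in order."""
    store: Dict[str, str] = {}
    for cmd in log:
        apply_command(store, cmd)
    return store


def rewrite(store: Dict[str, str]) -> List[Command]:
    """Compact the log: one SET per surviving key reproduces the same data."""
    return [("SET", key, value) for key, value in store.items()]


log: List[Command] = [
    ("SET", "a", "1"), ("SET", "a", "2"), ("SET", "b", "x"), ("DEL", "b"),
]
state = replay(log)
compact = rewrite(state)
print(state, len(log), "->", len(compact), "commands")
```

A snapshot in the RDB sense would correspond to saving `state` directly, while the log preserves every step that led to it.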

49. How to choose a suitable persistence method?
In general, if you want data safety comparable to what PostgreSQL provides, you should use both persistence methods. If you care a lot about your data but can still live with a few minutes of data loss, you can use RDB persistence alone.

Many users use only AOF persistence, but this is not recommended: regular RDB snapshots are very convenient for database backups, restoring a data set from RDB is faster than restoring it from AOF, and using RDB also guards against possible bugs in the AOF engine.

50. Will it take effect in real time if the configuration is modified without restarting Redis?
For a running instance, many configuration options can be changed with the CONFIG SET command without any kind of restart. Starting with Redis 2.2, you can switch between AOF and RDB snapshot persistence without restarting Redis. Run 'CONFIG GET *' for more information.

However, an occasional restart is still necessary, for example to upgrade Redis to a new version, or when you need to change configuration parameters that the CONFIG command does not currently support.

Origin blog.csdn.net/universsky2015/article/details/105242570