Configuring and using Redis, Redis data structures, and common caching problems

①. Cache: When a cache is in place, a query goes to the cache first. If the data is not found there, the database is queried, and the result is then written to the cache.

②. Why use a cache:

a. It reduces pressure on the database. (Database data lives on disk, while the cache is held in memory, and memory reads are much faster.) For example, suppose 1,000 requests arrive with the same parameters. Without a cache the database is accessed 1,000 times; with a cache it may be accessed only once.

b. It improves interface performance (when the database alone is not fast enough, the cache makes up the difference, because memory is faster than the hard disk).
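
The read path in ① and the 1,000-request example in ② can be sketched as follows. This is a minimal illustration, assuming a plain Python dict as the cache and a hypothetical fake_query_db function in place of a real database; neither name comes from the original notes.

```python
from typing import Callable, Dict, List


def make_cached_lookup(query_db: Callable[[str], str],
                       cache: Dict[str, str]) -> Callable[[str], str]:
    """Wrap a database query so results are served from the cache when possible."""
    def lookup(key: str) -> str:
        if key in cache:          # 1. look in the cache first
            return cache[key]
        value = query_db(key)     # 2. on a miss, fall back to the database
        cache[key] = value        # 3. store the result for later requests
        return value
    return lookup


db_calls: List[str] = []


def fake_query_db(key: str) -> str:
    """Hypothetical database query; records each call so the calls can be counted."""
    db_calls.append(key)
    return f"row-for-{key}"


lookup = make_cached_lookup(fake_query_db, cache={})
results = [lookup("user:1") for _ in range(1000)]   # 1,000 identical requests
```

After the 1,000 requests, db_calls holds a single entry: only the first request reached the database.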

③. Caches fall into three types:

a. Local cache: kept on the client, for example WeChat chat history (a very good fit for a local cache). Opening a chat window does not call an interface to fetch the history; it reads it from local storage. Pay attention to security when using a local cache: the chat history needs to be encrypted.

b. Server cache: kept in the JVM heap, for example a HashMap in key-value form.

c. Distributed cache: in a cluster, each server would otherwise keep its own separate cache; a distributed cache puts this data into one shared Redis.

2. Redis can do more than caching, but caching is its most common use. When an application restarts, the contents of its local cache disappear, but data held in Redis does not.

3. Redis is fast to query because it stores data in memory. Memory is lost on power-off or restart, so Redis provides persistence mechanisms:

①. Snapshot: data is periodically backed up to the hard disk as a snapshot. (This is relatively expensive, so it is not suitable for frequent backups.)

②. Log: every write is appended to a log on the hard disk, similar to MySQL's binlog. (Recovery is relatively slow because the log has to be read and replayed entry by entry, so it is not suitable for long-term backup.) Example: if key aa is set to bb, then cc, then dd, the log holds all three writes.

③. In production the two mechanisms are often combined: a snapshot is taken periodically (about once a minute on average in this example), and the writes made between snapshots are recorded in the log. When the next snapshot is taken, the cycle starts again.
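
To make the difference between log-only and combined recovery concrete, here is a toy model in Python. It is only a sketch: the recover function and the dict/list representation are assumptions for illustration and say nothing about the on-disk formats Redis actually uses.

```python
from typing import Dict, Iterable, Tuple


def recover(snapshot: Dict[str, str],
            log: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Rebuild the data set: start from the snapshot, then replay logged writes in order."""
    data = dict(snapshot)      # copy, so the snapshot itself is not modified
    for key, value in log:
        data[key] = value      # later writes overwrite earlier ones
    return data


# Log-only recovery: all three writes to key aa are replayed.
log_only = recover({}, [("aa", "bb"), ("aa", "cc"), ("aa", "dd")])

# Combined recovery: the snapshot already holds aa=cc, so one write is replayed.
combined = recover({"aa": "cc"}, [("aa", "dd")])
```

Both calls end with aa set to dd, but the combined form replays fewer entries, which is why pairing a snapshot with a short log recovers faster than replaying a long log.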

4. Installing Redis on the server (CentOS)

①. Connect to the server with FinalShell

②. Install Docker and Redis

Update the yum packages: yum -y update

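Remove any old Docker versions: yum remove docker docker-common docker-selinux docker-engine

Install the packages Docker depends on: yum install -y yum-utils device-mapper-persistent-data lvm2

Add the Docker yum repository (Aliyun mirror): yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo

Install Docker: yum -y install docker-ce-18.03.1.ce

Start Docker: systemctl start docker

Pull the Redis image: docker pull redis:latest

Run a Redis container named myredis1 with port 6379 published: docker run -d -p 6379:6379 --name="myredis1" redis

Open redis-cli inside the container: docker exec -it myredis1 redis-cli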

5. Integrating Redis with Spring Boot

①. Add the dependency

(Configure the IP address, port number, and password)

②. Implementation

6. Usage scenarios

①. Inventory deduction in large flash sales and peak traffic on an app's home page, either of which can easily overwhelm a traditional relational database (MySQL, Oracle, etc.)

②. Data that does not need to be persisted, such as SMS verification codes and like counts

③. Distributed locks

④. Distributed cache (session sharing)

7. Data structure

①. Redis stores data as key-value pairs. The key is a string, and the value is commonly one of the following five types.

a、String

A string can hold any data, up to 512 MB. Its internal structure is similar to Java's ArrayList: it pre-allocates redundant space to reduce frequent memory allocation (and so reduce CPU pressure).

struct SDS {
        // array capacity
        T capacity;
        // array length
        T len;
        // special flag
        byte flags;
        // array content
        byte[] buf;
}

That is, when a string is created, its capacity equals its length (len), with no redundant space. When the string is later modified and the capacity is insufficient, it is expanded: while the capacity is below 1 MB it is doubled (2 * capacity), and once it exceeds 1 MB it grows by 1 MB at a time.
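
The expansion rule can be modeled in a few lines of Python. This is a simplified sketch of the rule as described here, not a translation of Redis's C source; the function name next_capacity and its parameters are assumptions made for the example.

```python
ONE_MB = 1024 * 1024


def next_capacity(capacity: int, required: int) -> int:
    """Return the capacity after growing until `required` bytes fit.

    Below 1 MB the capacity doubles; from 1 MB upward it grows by 1 MB per step.
    """
    new_capacity = max(capacity, 1)   # avoid doubling zero forever
    while new_capacity < required:
        if new_capacity < ONE_MB:
            new_capacity *= 2
        else:
            new_capacity += ONE_MB
    return new_capacity


small = next_capacity(5, 6)                         # doubles: 5 -> 10
large = next_capacity(2 * ONE_MB, 2 * ONE_MB + 1)   # adds 1 MB: 2 MB -> 3 MB
```

Doubling keeps appends to small strings cheap, while the fixed 1 MB step stops very large strings from reserving twice their size.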

Common commands

set name zhencong -- store string key-value pairs

mset name zhencong age 18 -- store key-value pairs in batches

SETNX name zhencong -- set the value only if the key name does not exist (the basis of distributed locks)

get name -- get the value of the key

mget name age -- get the values of several keys in one call

DEL key -- delete key

expire key 60 -- set the expiration time in seconds

INCR key -- add 1 to the number stored in the key

DECR key -- decrement the number stored in the key by 1

INCRBY key 2 -- add 2 to the value stored in the key

DECRBY key 2 -- subtract 2 from the value stored in the key

Note: avoid operating on a large number of keys in a single operation, such as setting an expiration time on every key. Redis executes commands on a single thread, so a long-running operation blocks it and Redis appears to hang (it temporarily stops serving other clients).

Usage scenarios

i. Data that does not need to be persisted or that is updated frequently, such as verification codes and like counts.

ii. Object cache: Java objects can be cached through a serialization utility, for example by serializing an object to JSON and deserializing it when it is read back. Common uses include the MyBatis second-level cache and interface-level caching.

iii. Distributed locks using setnx (always set an expiration time on the lock, so that a lock that is never released does not cause a deadlock).

iv. Like counts, using incr and decr.

v. Distributed global ID: in a large system that shards databases and tables, MySQL's auto-increment ID no longer meets the need. With a small number of users, each ID can be fetched from Redis by incrementing a counter. With a large number of users, fetching every ID individually puts pressure on Redis; instead, take 1,000 IDs at a time, keep them in a local cache, and fetch another batch when they are used up.
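
The batched ID scheme in item v can be sketched as below. The class names SegmentIdAllocator and InMemoryCounter are hypothetical; InMemoryCounter merely stands in for a shared Redis counter (where the reservation would be an INCRBY on a counter key), and the sketch is not thread-safe.

```python
from typing import Callable


class SegmentIdAllocator:
    """Hands out IDs from a locally cached range, refilled `step` IDs at a time."""

    def __init__(self, reserve: Callable[[int], int], step: int = 1000) -> None:
        # `reserve(n)` must atomically add n to a shared counter and return
        # the new counter value.
        self._reserve = reserve
        self._step = step
        self._next = 1
        self._last = 0   # empty range: forces a reservation on first use

    def next_id(self) -> int:
        if self._next > self._last:                    # local batch used up
            self._last = self._reserve(self._step)     # reserve the next batch
            self._next = self._last - self._step + 1
        value = self._next
        self._next += 1
        return value


class InMemoryCounter:
    """Hypothetical stand-in for the shared counter, for demonstration only."""

    def __init__(self) -> None:
        self.value = 0
        self.calls = 0

    def incr_by(self, amount: int) -> int:
        self.calls += 1
        self.value += amount
        return self.value


counter = InMemoryCounter()
allocator = SegmentIdAllocator(counter.incr_by, step=1000)
ids = [allocator.next_id() for _ in range(2500)]
```

Issuing 2,500 IDs touches the shared counter only three times. The trade-off is that IDs left in a batch are lost if the process restarts, so the sequence may have gaps.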

b、Hash

A hash holds field-value pairs, similar to HashMap in Java. When the amount of data is small the ziplist encoding is used (the default); when it is large, hashtable is used. The thresholds for the conversion can be set in the configuration file.

hash-max-ziplist-entries 512 // when the hash has more than 512 field-value pairs (1,024 entries in total), hashtable encoding is used

hash-max-ziplist-value 64 // when any single field or value is longer than 64 bytes, hashtable encoding is used
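
How the two thresholds interact can be shown with a small decision function. This is a sketch of the rule as stated in the two settings above, not Redis's implementation; the function name hash_encoding and its parameters are assumptions for illustration.

```python
from typing import Dict


def hash_encoding(fields: Dict[str, str],
                  max_entries: int = 512,
                  max_value: int = 64) -> str:
    """Return the encoding the two thresholds would select for this hash."""
    if len(fields) > max_entries:                 # too many field-value pairs
        return "hashtable"
    for field, value in fields.items():
        # Lengths are measured in bytes, so encode before measuring.
        if len(field.encode()) > max_value or len(value.encode()) > max_value:
            return "hashtable"
    return "ziplist"


small_hash = hash_encoding({"name": "zhencong", "age": "18"})
long_value = hash_encoding({"bio": "x" * 65})
```

Exceeding either limit is enough to switch to hashtable, so one oversized value changes the encoding even when the hash has few fields.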

Common commands

hset hash name zhencong -- set the value of a field

hget hash name -- get the value of a field

hmset hash name zhencong age 18 -- set several fields in one call

hmget hash name age -- get several fields in one call

hgetall hash -- get all the fields and values of the key

hkeys hash -- get all the fields in the hash

hvals hash -- get all the values in the hash

Application Scenario

i. It can be used to store the data of objects in the system.

ii. It can also be used as a cache to solve the problem of data consistency (not recommended).

c、List

A Redis list is implemented as a quicklist (quick linked list), which is a linked list of multiple ziplists (compressed lists). A ziplist is used while the list is small: one contiguous block of memory is allocated. Only when the list grows too large is it changed to a quicklist. The advantage is that an ordinary linked list needs two pointers per node, to the previous and the next element, which wastes space when a node only stores a small value such as an int. The quicklist keeps insertion and deletion fast while avoiding most of that wasted space.

There are also disadvantages: when the list is changed, memory may have to be reallocated and data copied, which affects performance. The larger the list, the greater the cost of modifying elements, so in general we do not store too many elements in one list.

A Redis list keeps elements in insertion order. Elements can be added at the head (head insertion) or the tail (tail insertion); the list behaves as a doubly linked list. Operations at both ends are fast, while operations on nodes in the middle are relatively slow (because the list has to be traversed node by node through its pointers to reach them).

Common commands

rpush myList value1 -- add an element to the tail (right end) of the list

rpush myList value2 value3 -- add multiple elements to the tail (right end) of the list

lpop myList -- take out the head (leftmost) element of the list

lpush myList2 value1 -- add an element to the head (left end) of the list

Usage scenarios

Stacks and queues can be implemented. Note that push and pop are atomic, so use them directly against Redis. Do not read the whole list out, modify it in Java, and write it back: another client may change the list between the read and the write, so consistency cannot be guaranteed.
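
The queue and stack patterns can be illustrated with Python's deque as a local stand-in for a Redis list. The helper functions rpush, lpop, and rpop below are hypothetical models named after the Redis commands; they run in-process and are not Redis client calls.

```python
from collections import deque
from typing import Deque, Optional


def rpush(lst: Deque[str], *values: str) -> int:
    """Append values at the tail (right end); returns the new length."""
    lst.extend(values)
    return len(lst)


def lpop(lst: Deque[str]) -> Optional[str]:
    """Remove and return the head (leftmost) element, or None if empty."""
    return lst.popleft() if lst else None


def rpop(lst: Deque[str]) -> Optional[str]:
    """Remove and return the tail (rightmost) element, or None if empty."""
    return lst.pop() if lst else None


queue: Deque[str] = deque()
rpush(queue, "value1")
rpush(queue, "value2", "value3")
first_out = lpop(queue)    # rpush + lpop: first in, first out (queue)

stack: Deque[str] = deque()
rpush(stack, "value1", "value2", "value3")
last_in = rpop(stack)      # rpush + rpop: last in, first out (stack)
```

Pushing and popping at opposite ends gives a queue; pushing and popping at the same end gives a stack.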

d、Set

A Redis set is similar to a list, but it removes duplicates automatically (as Java's Set does).

When you need to store a collection without repeated data, choose a set. A set can also tell you whether a given element is in it.

A set is backed by a hash table whose values are null, so adding, removing, and membership checks are O(1): lookup time stays the same however much data the set holds.

Usage scenarios

Computing the intersection or union of multiple data sources
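
Python's built-in set has the same semantics, so it can show what these operations return. The follower names are made up for the example; the comments name the Redis command each line corresponds to (SADD, SISMEMBER, SINTER, SUNION).

```python
followers_of_a = {"tom", "amy", "li"}
followers_of_b = {"amy", "li", "zhao"}

followers_of_a.add("tom")                    # like SADD with a duplicate: no effect
is_member = "amy" in followers_of_a          # like SISMEMBER
common = followers_of_a & followers_of_b     # like SINTER: in both sets
everyone = followers_of_a | followers_of_b   # like SUNION: in either set
```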

e、SortedSet

Much like a set, a sorted set holds unique members, but it keeps them in order. Each member is associated with a score (weight) used for sorting. Members are unique, but scores may repeat. This makes it possible to implement a leaderboard with Redis and to quickly fetch the members within a score range.

Underneath, a sorted set combines a hash table and a skip list (a classic data structure that trades space for time). The hash table maps each member to its score, and the skip list makes it fast to fetch the members in a range.

The five types above are the data structures Redis uses most often; there are also some less common ones (nice to know).

Usage scenarios

Real-time leaderboard of the live broadcast system
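
A leaderboard built on member-score pairs can be modeled as below. This sketch reproduces the behavior only: it sorts on every query instead of maintaining a skip list, and the class name Leaderboard, its methods, and the anchor names are assumptions for illustration. The comments name the Redis commands with similar results.

```python
from typing import Dict, List, Tuple


class Leaderboard:
    def __init__(self) -> None:
        self._scores: Dict[str, float] = {}   # member -> score

    def add(self, member: str, score: float) -> None:
        """Insert a member or update its score (like ZADD)."""
        self._scores[member] = score

    def top(self, n: int) -> List[Tuple[str, float]]:
        """Highest scores first (like ZREVRANGE 0 n-1 WITHSCORES).

        Ties are broken by member name here, which is a simplification.
        """
        ranked = sorted(self._scores.items(), key=lambda item: (-item[1], item[0]))
        return ranked[:n]

    def between(self, low: float, high: float) -> List[str]:
        """Members with low <= score <= high, lowest first (like ZRANGEBYSCORE)."""
        hits = [(score, member) for member, score in self._scores.items()
                if low <= score <= high]
        return [member for _, member in sorted(hits)]


board = Leaderboard()
board.add("anchor_a", 120)
board.add("anchor_b", 300)
board.add("anchor_c", 80)
board.add("anchor_c", 200)   # same member again: the score is updated
top_two = board.top(2)
mid_range = board.between(100, 250)
```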

8. Advanced topics:

①、Geospatial

Geospatial is short for geographic spatial data: it represents locations as two-dimensional coordinates. Redis provides operations for storing longitude and latitude, querying them, range queries, distance queries, and computing a geohash of the coordinates.

Usage scenarios

Finding the nearest store
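
The nearest-store calculation can be sketched locally with the haversine great-circle formula. This is only an illustration of the idea: the function names, store names, and coordinates are made up, the Earth radius is an approximation, and scanning every store is far simpler than the indexed range queries Redis performs.

```python
import math
from typing import Dict, Tuple

EARTH_RADIUS_KM = 6371.0   # mean Earth radius; an approximation

Point = Tuple[float, float]   # (longitude, latitude) in degrees


def distance_km(a: Point, b: Point) -> float:
    """Great-circle distance between two points, using the haversine formula."""
    lon1, lat1, lon2, lat2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_store(user: Point, stores: Dict[str, Point]) -> str:
    """Return the name of the store closest to the user."""
    return min(stores, key=lambda name: distance_km(user, stores[name]))


stores = {"store_a": (116.40, 39.90), "store_b": (121.47, 31.23)}
closest = nearest_store((116.30, 39.95), stores)
```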

②, BloomFilter (Bloom filter)

A Bloom filter is a long binary vector plus a series of random mapping (hash) functions, used to quickly check whether an element is in a set. Its answers are not 100% accurate: it can report that an element is present when it is not (a false positive), although it never misses an element that was added. It is therefore not suitable where zero errors are required.

Advantages: i. It can check whether an element exists even with massive amounts of data.

ii. It needs little storage, because it stores bits derived from hash values rather than the data itself.

iii. Because the data itself is not stored, it can be used with sensitive (encrypted) data.

Disadvantages: Counting is not supported; inserting the same element several times has the same effect as inserting it once.

Usage scenarios: i. Solving the cache penetration problem;

ii. Checking whether a user has already read an article, to avoid pushing it again (as in Douyin).
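
A minimal Bloom filter fits in a few lines of Python. This is a sketch under stated assumptions, not a production design: the class name, the default sizes, and the choice of salting SHA-256 with an index to obtain several hash functions are all made up for the example.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    def __init__(self, size_bits: int = 8192, num_hashes: int = 4) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)   # the long binary vector

    def _positions(self, item: str) -> Iterator[int]:
        # Derive several bit positions by salting one hash function with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        """Set the bits for this item; the item itself is not stored."""
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        """False means definitely absent; True means possibly present."""
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


read_articles = BloomFilter()
for article_id in ("article:1", "article:2", "article:3"):
    read_articles.add(article_id)
```

An added item always sets all of its bits, so there are no false negatives. A false positive happens when other items have set every bit that an absent item maps to, and it becomes more likely as the vector fills up.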

9. Common Redis configuration items

port	Port number, default 6379
bind	The address(es) Redis binds to; clients can connect to Redis only through these addresses
timeout	How long a client connection may stay idle before it is closed. 0 means no limit. The default is 0, so idle connections are not closed by default.
dbfilename	The name of the local file the data is saved to. The default is dump.rdb.
dir	The directory where that local file is stored. The default is ./ (the directory Redis is started from)
rdbcompression	Whether to compress data when saving it to the local file. The default is yes. Redis uses LZF compression. The option can be turned off to save CPU time, but the local file will then be larger.
requirepass	Sets the Redis connection password
slaveof	In master-slave replication mode, if the current node is a slave, set this to the IP address and port of the master node; Redis then synchronizes data from the master automatically when it starts. If the node is already a slave of another master, its old data set is discarded and data is synchronized from the new master.
masterauth	In master-slave replication mode, when the master node is password-protected, the slave uses this option to set the password for connecting to the master. The command format for setting the password of the Master server node is:

10. Caching common problems:

①. What is cache penetration, what problems does it cause, and how is it solved?

a. Cache penetration: requests ask for data that exists in neither the cache nor the database. For example, my keys are numbers (123), but an attacker repeatedly requests strings (abc). The cache can never be hit, so every such request goes straight to the database, which defeats the purpose of the cache: reducing database pressure.

b. Solution: a Bloom filter
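
Where the filter sits in the read path can be sketched as follows. To keep the example short, an ordinary Python set stands in for the Bloom filter (so it has no false positives, unlike a real one); the function names and the fake database are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Set


def make_guarded_lookup(known_keys: Set[str],
                        cache: Dict[str, str],
                        query_db: Callable[[str], Optional[str]]
                        ) -> Callable[[str], Optional[str]]:
    """Check the filter before the cache and the database."""
    def lookup(key: str) -> Optional[str]:
        if key not in known_keys:   # filter says "definitely absent": stop here
            return None
        if key in cache:
            return cache[key]
        value = query_db(key)
        if value is not None:
            cache[key] = value
        return value
    return lookup


db_hits: List[str] = []


def fake_query_db(key: str) -> Optional[str]:
    """Hypothetical database query; records each call."""
    db_hits.append(key)
    return f"row-{key}"


guarded = make_guarded_lookup({"123"}, {}, fake_query_db)
attack = [guarded("abc") for _ in range(100)]   # key that does not exist
legit = [guarded("123"), guarded("123")]        # key that exists
```

The 100 requests for abc never reach the database, and the valid key 123 reaches it once before being served from the cache.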

②. What is cache breakdown, what problems does it cause, and how is it solved?

a. Cache breakdown: a hot key expires while it is under very high concurrency. For example, my official website's data is hot data; during college entrance examination registration the cached data expires, and all requests query the database directly, so the cache no longer serves its purpose.

b. Solution: do not set an expiration time on very high-frequency hot data, and run a scheduled task that periodically checks whether the cached entry still exists and rebuilds it if it does not. Not setting an expiration time only ensures that Redis will not expire the key; it does not prevent another service from deleting it, which is why the scheduled task is needed to restore the cache in that case.
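
One run of such a scheduled task might look like the sketch below. The function name refresh_hot_keys, the dict used as the cache, and the key names are assumptions for illustration; how the task is scheduled is left out.

```python
from typing import Callable, Dict, Iterable, List


def refresh_hot_keys(cache: Dict[str, str],
                     hot_keys: Iterable[str],
                     load_from_db: Callable[[str], str]) -> List[str]:
    """Reload every hot key that is missing from the cache; return the rebuilt keys."""
    rebuilt = []
    for key in hot_keys:
        if key not in cache:               # deleted by someone else
            cache[key] = load_from_db(key)
            rebuilt.append(key)
    return rebuilt


cache = {"site:home": "cached-home"}       # "site:exam" has been deleted
rebuilt = refresh_hot_keys(cache, ["site:home", "site:exam"],
                           lambda key: f"fresh-{key}")
```

Only the missing key is reloaded, so a run that finds every hot key in place makes no database calls.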

③. What is a cache avalanche, what problems does it cause, and how is it solved?

a. Cache avalanche: a large number of keys expire at the same time, so the requests for them all hit the database.

b. Solution: plan key expiration times sensibly so that they do not all fall at the same moment. For high-frequency data (you decide for yourself what counts as high-frequency), do not set an expiration time.
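
One common way to keep expiration times from coinciding, which the notes above do not spell out, is to add a random offset to each key's expiration time. The sketch below assumes a base of one hour and up to five minutes of jitter; the function name and both values are illustrative choices.

```python
import random


def ttl_with_jitter(base_seconds: int, max_jitter_seconds: int,
                    rng: random.Random) -> int:
    """Return the base expiration time plus a random offset in [0, max_jitter_seconds]."""
    return base_seconds + rng.randint(0, max_jitter_seconds)


rng = random.Random(42)   # seeded only so the example is repeatable
ttls = [ttl_with_jitter(3600, 300, rng) for _ in range(1000)]
```

Keys written in the same batch then expire across a five-minute window instead of all in the same second.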