Redis learning encyclopedia (notes)

Redis overview

Redis is an open-source, in-memory key-value database written in ANSI C. It offers a variety of data structures, is accessed over the network, and supports optional persistence.

Features:

  1. In-memory operation, hence high performance
  2. Support for distributed deployment, so capacity can in principle be scaled out by adding nodes
  3. Key-value storage system
  4. Written in ANSI C and historically released as open source under the BSD license; it supports network access, can run purely in memory or with log-style persistence, and provides client APIs for many languages

NoSQL ("Not Only SQL") generally refers to non-relational databases. With the rise of Web 2.0 sites, traditional relational databases struggled to cope with very large, highly concurrent, purely dynamic websites and exposed many problems that were hard to overcome.

Structured and unstructured data

  • Structured data is data that is logically expressed and implemented as a two-dimensional table, strictly follows format and length specifications, and is also called row data.
  • Unstructured data has an irregular or incomplete structure and no predefined data model, and is inconvenient to represent as two-dimensional tables. Examples include office documents (Word), text, pictures, HTML, various reports, video, and audio.

Four categories of NoSQL

KV type NoSQL (Redis)

As the name suggests, a KV-type NoSQL database stores data as key-value pairs. Its biggest advantage is high performance: benchmarking with the redis-benchmark tool that ships with Redis can reach a TPS on the order of 100,000.

  • Data is held in memory, so reads and writes are efficient
  • Looking up a value by key takes O(1) time, so queries are fast

Columnar NoSQL (HBase)

Columnar NoSQL is one of the most representative technologies of the big data era, with HBase as its best-known example.

  • A query reads only the columns it asks for, not all columns
  • The data of a column is stored together, so a single disk IO can read a whole column into memory

Document NoSQL (MongoDB)

Document-type NoSQL stores semi-structured data as documents, usually in JSON or XML format. Whereas a relational database stores each field of a record in its own column, MongoDB stores the whole record as a single JSON-like document.

Search-type NoSQL (ElasticSearch)

Traditional relational databases rely mainly on indexes for fast queries, but indexes are of little help for full-text search. First, LIKE queries cannot cover every fuzzy-matching requirement; second, they come with many restrictions, and careless use leads to slow queries. Search-type NoSQL was created to address the weak full-text search capability of relational databases, and ElasticSearch is its representative product.

The difference between relational and non-relational

Relational

Advantages:

  • Easy to maintain: everything uses the table structure, so the format is consistent;
  • Easy to use: SQL is a common language across relational databases;
  • Complex operations: SQL supports very complex queries within one table and across multiple tables.

Disadvantages:

  • Read and write performance is relatively poor, especially for high-throughput reads and writes of massive data;
  • The table structure is fixed, so flexibility is limited.

Non-relational

The most typical data structure of a relational database is the table: data is organized as two-dimensional tables and the relations between them. Non-relational databases are not tied to this structure.

Advantages:

  • Flexible format: data can be stored as key-value pairs, documents, pictures, and so on, which suits a wide range of scenarios, whereas relational databases only support basic types.
  • Speed: NoSQL databases can use disk or RAM as the storage medium, whereas relational databases mainly rely on disk;
  • High scalability;
  • Low cost: NoSQL databases are easy to deploy and are mostly open source.

Disadvantages:

  • No SQL support, so the cost of learning and using them is relatively high;
  • Transaction support is generally weak or absent;
  • Data structures are relatively complex, and support for complex queries is somewhat lacking.

Getting Started with Redis

Redis is a storage server organized as dictionaries. A Redis instance provides several dictionaries (databases) for storing data, and the client chooses which one to use. Redis provides 16 databases by default; the number can be changed with the databases setting in the configuration file redis/redis.conf, and Redis must be restarted for the change to take effect.

Does Redis use multi-thread or single-thread?

Because Redis operates on memory, the CPU is usually not its bottleneck; the bottleneck is most likely the amount of machine memory or the network bandwidth. Since a single thread is easy to implement and the CPU does not become the bottleneck, a single-threaded design is the natural choice. Redis uses IO multiplexing to keep throughput high when there are many connections.

IO multiplexing technology

[Figure: IO multiplexing model]

Redis data types

String

String is the most basic Redis type: one key maps to one value. Strings are binary safe, meaning a String can contain any data, such as a serialized object or an image. A String value can hold up to 512 MB.

Use cases:

  • Numeric values (the value can be a number as well as a string)
  • Counters
  • Counting across multiple units
  • Number of fans
  • Object cache storage
  • Distributed locks

List

A List is simply a list of strings kept in insertion order; elements can be added at the head (left) or the tail (right). Underneath it is a doubly linked list, so operations at either end are extremely fast, while accessing nodes in the middle by index performs poorly.

A List can hold up to 2^32 - 1 elements (more than 4 billion per list).

Use cases:

  • message queue
  • leaderboard
  • latest list
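
The message-queue use case can be pictured with a small in-memory sketch. It is not a Redis client: the class ListQueueSketch and its methods are hypothetical names that only imitate what LPUSH and RPOP do to a list, using a Java ArrayDeque.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// In-memory stand-in for a Redis List (assumption: illustration only, no Redis involved).
class ListQueueSketch {
    private final Deque<String> items = new ArrayDeque<>();

    // Analogous to LPUSH: insert at the head (left) and return the new length.
    int lpush(String value) {
        items.addFirst(value);
        return items.size();
    }

    // Analogous to RPOP: remove from the tail (right); null when the list is empty.
    String rpop() {
        return items.pollLast();
    }

    public static void main(String[] args) {
        ListQueueSketch queue = new ListQueueSketch();
        queue.lpush("job-1");
        queue.lpush("job-2");
        System.out.println(queue.rpop()); // job-1 (first in, first out)
        System.out.println(queue.rpop()); // job-2
    }
}
```

Pushing on one end and popping on the other only touches the two ends of the list, which is where the List type is fast.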

Set

A Set offers functionality similar to a List, but it deduplicates automatically. If you need to store a list of items without duplicates, Set is a good choice. A Set is an unordered collection of Strings. Underneath it is a hash table whose values are all null, so adding, deleting, and looking up members are O(1).

Use cases:

  • Blacklists and whitelists
  • Random display
  • Friends
  • People you follow
  • Fans
  • Groups of people with shared interests

Hash

A Hash is a collection of key-value pairs: a mapping table of String fields to String values. It is especially suitable for storing objects.

Hash structure optimization:

  • If the number of fields is small, the data is stored in a compact array-like structure
  • If the number of fields is large, the data is stored in a HashMap-style structure

Use cases:

  • shopping cart
  • storage object

ZSet

ZSet is very similar to Set: it is a collection of Strings without repeated elements. The difference is that each element of a ZSet is associated with a score, which is used to order the set from the lowest score to the highest. Elements are unique, but scores can repeat.

Use cases:

  • Delay queues
  • Leaderboards
  • Rate limiting
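
As a rough illustration of the leaderboard use case, the sketch below keeps unique members with a score and returns the highest-scored ones. LeaderboardSketch, zadd, and top are hypothetical names that loosely mirror ZADD and ZREVRANGE; the tie-break by member name is a simplification, and nothing here talks to Redis.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for a ZSet-backed leaderboard (assumption: illustration only).
class LeaderboardSketch {
    private final Map<String, Double> scores = new HashMap<>();

    // Analogous to ZADD: members are unique, so re-adding a member overwrites its score.
    void zadd(String member, double score) {
        scores.put(member, score);
    }

    // Analogous to ZREVRANGE 0 n-1: the n members with the highest scores.
    List<String> top(int n) {
        List<String> members = new ArrayList<>(scores.keySet());
        members.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a)); // highest first
            return byScore != 0 ? byScore : a.compareTo(b);            // simplified tie-break
        });
        return members.subList(0, Math.min(n, members.size()));
    }
}
```

A real ZSet keeps its members ordered as they are inserted, so it does not need to sort on every read as this sketch does.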

BitMaps

Bitmaps are not a data structure of their own; a bitmap is actually a String whose individual bits can be operated on. Using bits sensibly can make memory usage and development much more efficient.

Use cases:

  • active days
  • Check-in days
  • Login days
  • user sign in
  • Count active users
  • Count whether the user is online
  • Implement Bloom filter
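
The check-in and active-days use cases come down to setting one bit per day and counting the set bits. A minimal sketch, assuming one bitmap per user and using java.util.BitSet in place of a Redis String (CheckInSketch and its method names are hypothetical; the comments name the Redis commands they imitate):

```java
import java.util.BitSet;

// One bit per day of the year for a single user (assumption: illustration only).
class CheckInSketch {
    private final BitSet days = new BitSet(366);

    // Analogous to SETBIT key dayOfYear 1.
    void checkIn(int dayOfYear) {
        days.set(dayOfYear);
    }

    // Analogous to GETBIT key dayOfYear.
    boolean checkedIn(int dayOfYear) {
        return days.get(dayOfYear);
    }

    // Analogous to BITCOUNT key: the number of days with a check-in.
    int totalDays() {
        return days.cardinality();
    }
}
```

A full year of check-ins fits in 366 bits per user, which is why bitmaps are attractive for this kind of statistic.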

Geospatial

GEO (geographic information) stores the two-dimensional coordinates of an element, that is, its longitude and latitude on the map. On top of this type Redis provides common operations such as setting and querying coordinates, range queries, distance queries, and computing a geohash of the coordinates.

Use cases:

  • cinema nearby
  • nearby friends
  • The nearest hot pot restaurant

Redis configuration file

units

Size units are defined at the start of the file. Sizes are expressed in bytes (for example 1k, 1kb, 1m, 1mb, 1g, 1gb), not bits, and the unit suffixes are case insensitive.

INCLUDES

Redis normally reads a single configuration file. When several people develop and maintain an instance, settings can be split across multiple files and pulled in with include /path/to/local.conf, while the original redis.conf acts as the main entry point.

NETWORK

Parameters:

  1. bind: the IP address of the network interface the Redis server binds to. The default is 127.0.0.1, the local loopback address, in which case Redis can only be reached by local clients and not over remote connections. If bind is left empty, connections are accepted on all available network interfaces.
  2. port: the port Redis listens on; the default is 6379. Because each Redis process executes commands on a single thread, several Redis processes are often run on one machine, and each then needs its own port.
  3. timeout: the client idle timeout, in seconds. If the client issues no command within this period, the connection is closed. The default value is 0, which means connections are never closed for being idle.
  4. tcp-keepalive: in seconds. Redis periodically uses SO_KEEPALIVE to check whether the client is still alive, so that dead connections do not linger on the server. The officially suggested value is 300. If set to 0, no periodic check is performed.

GENERAL

Detailed explanation:

  1. daemonize: set to yes to start Redis as a daemon (in the background). The default value is no.
  2. pidfile: the PID file path. When Redis runs as a daemon, it writes its pid to the /var/redis/run/redis_6379.pid file by default.
  3. loglevel: the log level. The default value is notice, and there are 4 possible values:
    • debug (records a large amount of log information; suitable for development and testing)
    • verbose (more log information)
    • notice (a moderate amount of log information; used in production)
    • warning (only important, critical information is recorded)
  4. logfile: the log file path; by default logs are printed to the standard output of the terminal.
  5. databases: the number of databases. The default database is DB 0; on each connection the select command chooses a different database, where dbid is a value between 0 and databases - 1. The default value is 16, so Redis has 16 databases by default.

SNAPSHOTTING

Parameters:

  • save: configures the conditions that trigger Redis persistence, that is, when the in-memory data is saved to disk
  • save 900 1: save if at least 1 key changes within 900 seconds
  • save 300 10: save if at least 10 keys change within 300 seconds
  • save 60 10000: save if at least 10000 keys change within 60 seconds
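
The save rules can be read as "a snapshot is due as soon as any one rule is satisfied". A minimal sketch of that check, under the assumption that the server tracks the seconds since the last save and the number of changed keys; SavePointSketch and shouldSave are hypothetical names, and the comparison is a simplification of what Redis does internally:

```java
// Evaluates "save <seconds> <changes>" rules (assumption: simplified illustration).
class SavePointSketch {
    // Each rule is {seconds, changes}, mirroring one "save" line of the configuration.
    static final int[][] RULES = { {900, 1}, {300, 10}, {60, 10000} };

    // True when at least one rule has both its time window and its change count reached.
    static boolean shouldSave(int[][] rules, long secondsSinceLastSave, long changedKeys) {
        for (int[] rule : rules) {
            if (secondsSinceLastSave >= rule[0] && changedKeys >= rule[1]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(shouldSave(RULES, 61, 10000)); // true
        System.out.println(shouldSave(RULES, 61, 9));     // false
    }
}
```

Several rules together cover both busy periods (many changes in a short time) and quiet ones (few changes over a long time).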

REPLICATION

Parameters:

  • slave-serve-stale-data: the default value is yes. When a slave loses contact with the master, or while replication is still in progress, the slave can behave in one of two ways:
    1. If yes, the slave still answers client requests, but the data returned may be stale, or empty if this is the first synchronization.
    2. If no, the slave returns a "SYNC with master in progress" error for every command except INFO and SLAVEOF.
  • slave-read-only: whether a slave instance rejects write operations, that is, whether the slave is read-only. The default value is yes.
  • repl-diskless-sync: whether to use diskless replication for master-slave synchronization. The default value is no.
  • repl-diskless-sync-delay: when diskless replication is enabled, the server waits for a configurable period before sending the RDB data to the slaves over the socket.
  • repl-disable-tcp-nodelay: whether to disable TCP_NODELAY on the slave socket after synchronization. If yes, Redis uses fewer TCP packets and less bandwidth to send data to the slaves.

CLIENTS

Parameters:

maxclients: the maximum number of simultaneous client connections. Older configuration files describe the default as unlimited, with 0 meaning no limit; recent versions default to 10000. In any case the limit cannot exceed the maximum number of file descriptors the Redis process can open minus 32 (the Redis server uses some itself). When the number of client connections reaches the limit, Redis closes new connections and returns a "max number of clients reached" error to the client.

MEMORY MANAGEMENT

Parameters:

  • maxmemory: the maximum memory Redis may use; 0 means no limit. It is usually used together with the maxmemory-policy parameter described below.

  • maxmemory-policy: the eviction policy Redis applies once memory usage reaches the maxmemory limit. The options are:

    • 1) volatile-lru: evict keys that have an expiration time set, using the LRU (Least Recently Used) algorithm
    • 2) allkeys-lru: evict any key, using the LRU algorithm
    • 3) volatile-random: evict random keys among those with an expiration time set
    • 4) allkeys-random: evict random keys
    • 5) volatile-ttl: evict the keys that are closest to expiring (smallest TTL)
    • 6) noeviction: evict nothing and return an error on writes; this is the default
  • maxmemory-samples: the LRU and minimal-TTL algorithms are approximations rather than exact algorithms (to save memory), so you can choose how many keys are sampled for each check. Older versions sample 3 keys by default and recent versions 5; the number is set with maxmemory-samples.
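
To make the allkeys-lru idea concrete, here is a small exact LRU cache built on Java's LinkedHashMap. This is only an illustration of "evict the least recently used key when full": as noted above, Redis approximates LRU by sampling and limits memory rather than entry count, and the class name LruCacheSketch is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Exact LRU bounded by entry count (assumption: Redis itself uses sampled, approximate LRU).
class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCacheSketch(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true: reads move an entry to the "recent" end
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

With a capacity of 2, putting a and b, reading a, and then putting c evicts b, because b is the entry that was used least recently.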

APPEND ONLY MODE

Parameters:

  • appendonly: by default Redis persists with RDB, which is sufficient for many applications. However, depending on the save strategy, a crash can lose several minutes of data. Append Only File (AOF) is another persistence method that offers better durability: Redis appends every write it receives to the appendonly.aof file, and at startup it loads this file into memory in preference to the RDB file. The default value is no.
  • appendfilename: the AOF file name; the default is "appendonly.aof"
  • appendfsync: the AOF fsync policy. no: never call fsync and let the operating system decide when to flush data to disk (fastest). always: call fsync on every write, so data is always synchronized to disk. everysec: call fsync once per second, so up to 1 s of data may be lost.

LUA SCRIPTING

lua-time-limit: the maximum execution time of a Lua script, in ms. The default value is 5000.

REDIS CLUSTER

Parameters:

  1. cluster-enabled: the cluster switch; cluster mode is disabled by default.
  2. cluster-config-file: the name of the cluster configuration file.
  3. cluster-node-timeout: the node interconnection timeout threshold, in milliseconds; a typical value is 15000.
  4. cluster-slave-validity-factor: a typical value is 10. Together with cluster-node-timeout it decides whether a slave's data is too old for the slave to take part in a failover.

SECURITY

requirepass: Set the redis connection password.

Redis publish and subscribe

Redis publish-subscribe (pub/sub) is a message communication mode: the sender (pub) sends the message, and the subscriber (sub) receives the message.

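// Subscribe syntax
subscribe <channel name>
// Example:
127.0.0.1:6379> SUBSCRIBE channel-1
Reading messages... (press Ctrl-C to quit)
1) "subscribe"
2) "channel-1"
3) (integer) 1

// Publish command
publish channel-1 hello
// Example: (open another client and publish the message hello to channel-1)
127.0.0.1:6379> PUBLISH channel-1 hello
(integer) 1 // the returned 1 is the number of subscribers

// Back in the first client, the published message has arrived
127.0.0.1:6379> SUBSCRIBE channel-1
	Reading messages... (press Ctrl-C to quit)
	1) "subscribe"
	2) "channel-1"
	3) (integer) 1
	1) "message"
	2) "channel-1"
	3) "hello"

// Published messages are not persisted: a client that subscribes later does not receive hello; it only receives messages published after it subscribed.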
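
The "no persistence" behaviour is easy to see in a toy in-memory model: a message is handed only to the subscribers that exist at the moment of publishing. PubSubSketch and its methods are hypothetical names that imitate SUBSCRIBE and PUBLISH; this is a sketch of the semantics, not of how Redis implements them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Fire-and-forget publish/subscribe (assumption: illustration only, no Redis involved).
class PubSubSketch {
    private final Map<String, List<List<String>>> subscribers = new HashMap<>();

    // Analogous to SUBSCRIBE: returns the inbox that will receive future messages.
    List<String> subscribe(String channel) {
        List<String> inbox = new ArrayList<>();
        subscribers.computeIfAbsent(channel, c -> new ArrayList<>()).add(inbox);
        return inbox;
    }

    // Analogous to PUBLISH: deliver to the current subscribers and return how many there are.
    int publish(String channel, String message) {
        List<List<String>> inboxes = subscribers.getOrDefault(channel, new ArrayList<>());
        for (List<String> inbox : inboxes) {
            inbox.add(message);
        }
        return inboxes.size(); // nothing is stored for subscribers that arrive later
    }
}
```

Publishing before anyone has subscribed returns 0 and the message is gone, which matches the integer reply of PUBLISH being the number of subscribers.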

Redis slow query

The process of Redis command execution

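Executing a command involves four phases: (1) the client sends the command, (2) the command is queued, (3) the command is executed, and (4) the result is returned.

  • Slow queries occur in phase 3
  • A client timeout is not necessarily caused by a slow query, but slow queries are one possible cause of client timeouts
  • The slow query log is stored in an in-memory list inside Redis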

slow query log

For the slow query log, the Redis server measures the time before and after each command is executed and records the command when its execution time exceeds a threshold. Each entry records when the slow query occurred, how long it took, the command itself, and other details, which helps developers and operations staff locate slow queries in the system.

Get slow query log

Use the slowlog get command to obtain slow query logs. A number after slowlog get limits how many entries are returned; for example, slowlog get 3 returns 3 slow query logs.

Configure slow query log

  1. slowlog-log-slower-than: the execution-time threshold. A command whose execution time exceeds this threshold is recorded.
  2. slowlog-max-len: the maximum number of entries kept in the slow query log. Redis stores slow query logs in a list, and slowlog-max-len is the maximum length of that list.

View slow query log configuration

127.0.0.1:0>config get slow*
 1)  "slowlog-max-len"
 2)  "128"
 3)  "slowlog-log-slower-than"
 4)  "10000"

The threshold 10000 is in microseconds, that is, 10 milliseconds here; 128 is the maximum number of slow log entries kept.

Modify the Redis configuration file

For example, set slowlog-log-slower-than to 1000 and slowlog-max-len to 1200:

  • slowlog-log-slower-than 1000
  • slowlog-max-len 1200
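
The two settings can be pictured as a threshold check in front of a bounded list. The sketch below follows that description; SlowLogSketch and its methods are hypothetical names, the entries are plain strings rather than real slow log records, and the comparison follows the "exceeds the threshold" wording used above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Threshold plus bounded list (assumption: simplified model of the slow query log).
class SlowLogSketch {
    private final long thresholdMicros; // plays the role of slowlog-log-slower-than
    private final int maxLen;           // plays the role of slowlog-max-len
    private final Deque<String> entries = new ArrayDeque<>();

    SlowLogSketch(long thresholdMicros, int maxLen) {
        this.thresholdMicros = thresholdMicros;
        this.maxLen = maxLen;
    }

    // Record the command only if it was slow; drop the oldest entry when the list is full.
    void record(String command, long durationMicros) {
        if (durationMicros <= thresholdMicros) {
            return;
        }
        entries.addFirst(command);
        if (entries.size() > maxLen) {
            entries.removeLast();
        }
    }

    int size() { return entries.size(); }
    String newest() { return entries.peekFirst(); }
    String oldest() { return entries.peekLast(); }
}
```

Because old entries are dropped once the list is full, a longer list makes it less likely that an interesting slow query is lost before someone looks at the log.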

Practical advice

slowlog-max-len configuration suggestion

  • It is recommended to enlarge the slow query list in production. When recording slow queries, Redis truncates long commands, so the list does not take up much memory.
  • A longer slow query list reduces the chance that slow queries are evicted from it. In production it can be set to 1000 or more, for example.

slowlog-log-slower-than configuration suggestion

  • By default a command that takes more than 10 milliseconds is treated as a slow query; adjust the value according to the concurrency Redis has to handle.
  • Because Redis responds to commands on a single thread, if commands take more than 1 millisecond each, Redis can serve fewer than 1000 OPS. For high-OPS scenarios a threshold of 1 millisecond is therefore recommended.

pipeline

With a pipeline, n commands cost 1 network time + n command times, which greatly reduces the network overhead. This is what a pipeline does.

Single-command communication model

Elapsed time = 1 network time + 1 command time

Batch communication model (n commands, no pipeline)

Elapsed time for n commands = n network times + n command times
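
The two formulas can be compared directly with a little arithmetic. In the sketch below, PipelineCostSketch and the sample numbers are assumptions chosen only to show the shape of the difference; real timings depend on the network and the commands involved.

```java
// Cost model from the formulas above (assumption: uniform network and command times).
class PipelineCostSketch {
    // n commands sent one by one: n network times + n command times.
    static long withoutPipeline(int n, long networkTime, long commandTime) {
        return n * (networkTime + commandTime);
    }

    // n commands sent in one pipeline: 1 network time + n command times.
    static long withPipeline(int n, long networkTime, long commandTime) {
        return networkTime + n * commandTime;
    }

    public static void main(String[] args) {
        // 100 commands, network time 10, command time 1 (same unit for both).
        System.out.println(withoutPipeline(100, 10, 1)); // 1100
        System.out.println(withPipeline(100, 10, 1));    // 110
    }
}
```

The larger the network time is relative to the command time, the more a pipeline saves.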

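// Add the Jedis dependency:
<dependency>
	<groupId>redis.clients</groupId>
	<artifactId>jedis</artifactId>
	<version>2.9.0</version>
</dependency>

// Imports needed by both snippets
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;

// Without a pipeline
Jedis jedis = new Jedis("127.0.0.1", 6379);
for (int i = 0; i < 10000; i++) {
	// each HSET is sent on its own and waits for its reply
	jedis.hset("hashkey:" + i, "field" + i, "value" + i);
}

/**
 * Without a pipeline, the for loop executes one command per round trip;
 * the 10,000 insert commands may take around 50 s.
 */

// With a pipeline
Jedis jedis = new Jedis("127.0.0.1", 6379);
for (int i = 0; i < 100; i++) {
	// send the 10,000 commands as 100 batches of 100
	Pipeline pipeline = jedis.pipelined();
	for (int j = i * 100; j < (i + 1) * 100; j++) {
		pipeline.hset("hashkey:" + j, "field" + j, "value" + j);
	}
	// flush the batch and read all replies in one go
	pipeline.syncAndReturnAll();
}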

Redis persistence mechanism

All Redis data lives in memory, so without persistence configured everything is lost when Redis restarts. To survive a restart, persistence must be enabled: data is saved to disk and restored from disk when Redis starts again. In other words, for Redis the persistence mechanism means storing in-memory data as files on disk, so that the data can be recovered from those files after a restart or a server failure.

The point of Redis persistence is failure recovery. For example, a Redis instance deployed as a cache may also hold some important data.

Redis provides two different forms of persistence

RDB persistence is a full backup and is time-consuming, so Redis also provides AOF (Append Only File) persistence as a more efficient alternative. In brief, the AOF log stores the sequence of commands the server received, and it records only the commands that modify memory.

  • RDB (Redis DataBase)
  • AOF (Append Only File)

RDB persistence mechanism

RDB writes a snapshot of the in-memory data set to disk at specified intervals; on recovery the snapshot file is read straight back into memory. The snapshot is a compressed binary file.

configure dump.rdb

The name of the RDB file is configured in redis.conf; the default is dump.rdb. The directory where the file is saved can also be changed; by default it is the directory from which Redis was started on the command line.

Trigger mechanism - three main ways
RDB configuration

Snapshot default configuration:

  • save 3600 1: save if at least 1 key changes within 3600 seconds (one hour).
  • save 300 100: save if at least 100 keys change within 300 seconds (five minutes).
  • save 60 10000: save if at least 10000 keys change within 60 seconds.

Configure new save rules

Add a new snapshot policy to redis.conf: trigger a snapshot if 5 keys change within 30 seconds. After the configuration is modified, the Redis service must be restarted.

  • save 3600 1
  • save 300 100
  • save 60 10000
  • save 30 5

flushall

Executing the flushall command also triggers an RDB save (the resulting file is empty, since all data has just been cleared).

save and bgsave

There are two commands to manually trigger Redis for RDB persistence:

  1. save: blocks the Redis server. While save runs, Redis cannot process any other command until the RDB process completes. It is not recommended.

  2. bgsave: Redis takes the snapshot asynchronously in the background and can still respond to client requests while the snapshot is being taken.

advanced configuration

stop-writes-on-bgsave-error

The default value is yes. When Redis cannot write the snapshot to disk, it stops accepting write operations.

rdbcompression

The default value is yes. It controls whether snapshots stored on disk are compressed; if so, Redis compresses them with the LZF algorithm. If you do not want to spend CPU on compression you can turn this off, but the snapshot stored on disk will be larger.

rdbchecksum

The default value is yes. After storing a snapshot, Redis can verify the data with the CRC64 algorithm, at a cost of about 10% in performance. Turn it off if you want maximum performance.

Data recovery

Just put the rdb file in the Redis startup directory, and when Redis starts, it will automatically load dump.rdb and restore the data.

Advantages and disadvantages

Advantages:

  • Suitable for large-scale data recovery
  • A good fit when the requirements for data integrity and consistency are not high
  • Saves disk space
  • Fast recovery

Disadvantages:

  • Backups are taken at intervals, so if Redis goes down unexpectedly, all modifications made after the last snapshot are lost.

AOF persistence mechanism

AOF records each write operation in the form of a log: every write command Redis executes is appended to it. AOF is not enabled by default. The file name can be configured in redis.conf; the default is appendonly.aof.

The AOF file is saved in the same directory as the RDB file. If AOF and RDB are both enabled, Redis loads its data from the AOF by default.
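
The idea of logging every write and replaying the log at startup can be sketched in a few lines. AofSketch is a hypothetical, in-memory model: the "log" is a Java list rather than a file, only SET and DEL are modelled, and real AOF concerns such as fsync and rewriting are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Append-only log plus replay (assumption: illustration only, not the AOF file format).
class AofSketch {
    final Map<String, String> memory = new HashMap<>();
    final List<String[]> log = new ArrayList<>();

    // Apply a write and append it to the log; reads are never logged.
    void set(String key, String value) {
        memory.put(key, value);
        log.add(new String[] {"SET", key, value});
    }

    void del(String key) {
        memory.remove(key);
        log.add(new String[] {"DEL", key});
    }

    // Rebuild the data set by replaying the logged commands in order.
    static Map<String, String> replay(List<String[]> log) {
        Map<String, String> restored = new HashMap<>();
        for (String[] entry : log) {
            if (entry[0].equals("SET")) {
                restored.put(entry[1], entry[2]);
            } else if (entry[0].equals("DEL")) {
                restored.remove(entry[1]);
            }
        }
        return restored;
    }
}
```

Replaying the log always reproduces the final in-memory state, but the log keeps growing with every write, which is why an append-only file tends to be larger than a snapshot of the same data.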

AOF start/repair/recovery

Enable AOF: change the default appendonly no to appendonly yes.

AOF synchronization frequency setting

Parameters:

  1. appendfsync always: always synchronize. Every Redis write is recorded in the log immediately; performance is worse but data integrity is better.

  2. appendfsync everysec: synchronize every second. Writes are flushed to the log once per second; if the server goes down, the data of that second may be lost.

  3. appendfsync no: Redis does not synchronize on its own initiative and leaves the timing of synchronization to the operating system.

Advantages and disadvantages

Advantages:

  • The backup mechanism is more robust and the probability of data loss is lower.
  • The log is readable text, so the AOF file can be inspected and edited to recover from mistaken operations.

Disadvantages:

  • Occupies more disk space than RDB.
  • Restoring from it is slower.
  • If every write is synchronized to disk, there is some performance pressure.

Choose Persistence

Don't use only RDB

RDB snapshot files are generated every 5 minutes or longer, so you have to accept that if the Redis process goes down, the data of the last 5 minutes or more is lost.

Don't use only AOF either

  1. Restoring from an AOF cold backup is slower than restoring from an RDB cold backup.

  2. RDB simply generates a full data snapshot each time, which is more robust and avoids bugs in a more complex backup and recovery mechanism such as AOF.

Use AOF and RDB together

Use AOF to ensure that data is not lost and as the first choice for data recovery; use RDB for cold backups of varying age. When the AOF files are lost or damaged and unusable, RDB can still be used for fast data recovery.

Advantages of RDB:

1. Smaller size: for the same amount of data, an RDB file is smaller than an AOF file, because RDB is a compact format.

2. Faster recovery: RDB is a snapshot of the data, essentially a copy, so it can be loaded without replaying writes into memory.

3. Higher performance: to save an RDB the parent process only has to fork a child process; the parent performs no other IO, which protects the performance of the server.

Disadvantages:

1. Data loss on failure: RDB is a full dump, usually taken by a shell script every 30 minutes, every hour, or every day (the built-in save strategy can also be used), and in practice no more often than about every 5 minutes. When the service dies, at least 5 minutes of data is lost.

2. Poorer durability: compared with AOF's incremental appends, every RDB save is a full copy. Even though a forked child process does the work, disk usage cannot be ignored when the data set is large, and under heavy traffic the fork takes longer and puts pressure on the CPU, so snapshots cannot be taken frequently.

Advantages of AOF:

1. Data guarantee: the fsync policy is configurable. The usual default is everysec, and it can also be set to fsync on every write; with everysec, at most one second of data is lost if the service dies.

2. Automatic shrinking: when the AOF file reaches a certain size, Redis rewrites it in the background without affecting the main process. Once the rewrite completes, new writes go to the new AOF file and the old one is deleted. Compared with RDB this point hardly counts as an advantage, but the official website lists it as one.

Disadvantages:

1. Relatively poor performance: its mode of operation inevitably costs Redis some performance.

2. Relatively larger files: even with rewriting, a log of operations is still quite different from the resulting data, and the file is larger.

Redis transaction

A Redis transaction is essentially a collection of commands. A transaction executes multiple commands at once, and all commands in it are serialized: during execution the queued commands run in order, and command requests submitted by other clients are never inserted into the sequence.

In short, a Redis transaction is the one-time, sequential, exclusive execution of a series of queued commands.

Redis transactions have no concept of isolation level

Commands are placed in a queue and are not actually executed until the EXEC command is sent. The question of whether queries inside the transaction can see its updates while queries outside cannot therefore does not arise.

Redis does not guarantee atomicity

In Redis a single command executes atomically, but a transaction does not guarantee atomicity and there is no rollback. If a command in the transaction fails at execution time, the remaining commands are still executed.
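
The "no rollback" behaviour can be shown with a toy command queue: every queued command runs in order, and a failing command only produces an error reply. TransactionSketch, enqueue, exec, and incr are hypothetical names modelled loosely on MULTI, EXEC, and INCR; the store is a plain Java map, not Redis.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Queue-then-execute without rollback (assumption: illustration only).
class TransactionSketch {
    final Map<String, String> store = new HashMap<>();
    private final List<Function<Map<String, String>, String>> queue = new ArrayList<>();

    // Between MULTI and EXEC, commands are only queued, not executed.
    void enqueue(Function<Map<String, String>, String> command) {
        queue.add(command);
    }

    // EXEC: run every queued command in order; a failure is reported but nothing is undone.
    List<String> exec() {
        List<String> replies = new ArrayList<>();
        for (Function<Map<String, String>, String> command : queue) {
            try {
                replies.add(command.apply(store));
            } catch (RuntimeException e) {
                replies.add("ERR " + e.getMessage());
            }
        }
        queue.clear();
        return replies;
    }

    // Analogous to INCR: fails when the stored value is not an integer.
    static String incr(Map<String, String> store, String key) {
        long next = Long.parseLong(store.getOrDefault(key, "0")) + 1;
        store.put(key, Long.toString(next));
        return Long.toString(next);
    }
}
```

Queuing a SET of k1 to a non-numeric value, then an INCR of k1, then a SET of k2 yields the replies OK, an error, OK: the failed INCR neither undoes the first SET nor prevents the last one.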

Three phases of Redis transaction

  1. Begin the transaction (MULTI)
  2. Queue the commands
  3. Execute the transaction (EXEC)

Redis transaction related commands

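watch key1 key2 ... : watch one or more keys; if a watched key is modified by another command
				before the transaction executes, the transaction is aborted (similar to an optimistic lock)
multi : 	mark the start of a transaction block (subsequent commands are queued)
exec : 		execute all commands in the transaction block (once exec runs, all watches are cancelled)
discard : 	cancel the transaction and drop all commands in the block
unwatch :	cancel the watch on all keys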
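
The optimistic-lock behaviour of watch can be approximated with a version number per key: the transaction only applies if the version seen at watch time is still current. WatchSketch and its methods are hypothetical names, and the per-key version counter is an assumption made for this sketch, not a description of how Redis tracks watched keys.

```java
import java.util.HashMap;
import java.util.Map;

// Check-and-set with a version counter (assumption: illustration of optimistic locking only).
class WatchSketch {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> versions = new HashMap<>();

    // Every write bumps the key's version.
    void set(String key, String value) {
        values.put(key, value);
        versions.merge(key, 1L, Long::sum);
    }

    String get(String key) {
        return values.get(key);
    }

    // Analogous to WATCH: remember the key's version at watch time.
    long watch(String key) {
        return versions.getOrDefault(key, 0L);
    }

    // Analogous to MULTI / SET / EXEC: apply the write only if the key is unchanged since WATCH.
    boolean execSet(String key, long watchedVersion, String newValue) {
        if (versions.getOrDefault(key, 0L) != watchedVersion) {
            return false; // another write got in first, so the transaction is aborted
        }
        set(key, newValue);
        return true;
    }
}
```

When execSet returns false, the caller typically watches the key again, re-reads the value, and retries.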

Atomicity, Consistency, Isolation, and Durability

  1. Atomicity: all operations in a transaction either complete or do not happen at all; a transaction never stops partway. If an error occurs during execution, the transaction is rolled back to the state before it started, as if it had never been executed.
  2. Consistency: the integrity of the database is preserved before the transaction begins and after it ends. Written data must fully comply with all preset rules, including data accuracy and linkage, so that the database can carry on with its subsequent work.
  3. Isolation: the ability of the database to let multiple concurrent transactions read, write, and modify its data at the same time. Isolation prevents the inconsistency caused by interleaved execution when transactions run concurrently. It is divided into levels: read uncommitted, read committed, repeatable read, and serializable.
  4. Durability: once a transaction has finished, its changes to the data are permanent and are not lost even if the system fails.

master-slave replication

img

Why use master-slave replication?

Machine failure: Deploy a Redis, when the server fails, it needs to be migrated to another server and data synchronization must be ensured.

Capacity bottleneck: When there is a need to expand the redis memory, 16G->64G, a single machine generally cannot meet it.

Solution

Replicate the data to other nodes. This gives redundant backups of the data and keeps both the data and the service highly available.

Master-slave replication means copying the data of one Redis server to other Redis servers. The former is called the master node (master), and the latter are called slave nodes (slave). Replication is one-way: data flows only from the master node to the slave nodes.

What master-slave replication provides

  1. Data redundancy: Replication provides a hot backup of the data, a form of redundancy in addition to persistence.
  2. Fault recovery: When the master node has a problem, a slave node can take over serving requests for rapid recovery (in effect, service redundancy).
  3. Load balancing: Combined with read-write separation, the master node serves writes and the slave nodes serve reads (the application connects to the master to write Redis data and to a slave to read it), which spreads the server load.
  4. High availability: Master-slave replication is also the foundation on which Sentinel and Cluster are built.

Sentinel mode: an automated version of manually promoting a slave to master. It monitors whether the master has failed and, if it has, promotes one of the slaves to master based on the number of votes.

Master-slave replication principle

Master-slave replication is divided into three stages and six processes

three phases

  1. Connection establishment phase
  2. Data synchronization phase
  3. Command propagation phase


six processes

  1. Save master node (master) information
  2. Establish the master-slave connection: The slave node (slave) maintains the replication logic through a scheduled task that runs every second. When the task finds a new master node, it tries to establish a network connection to it. The slave node creates a socket (for example, on local port 51234) to receive the replication commands sent by the master node.
  3. Send a ping command: After the connection is established, the slave node sends a ping request as the first communication
    • Check whether the network socket between master and slave is usable
    • Check whether the master node can currently accept commands
  4. Authentication: If the master node sets the requirepass parameter, password verification is required, and the slave node must set masterauth to the same password to pass it. If verification fails, replication terminates and the slave node restarts the replication process.
  5. Synchronize the data set: Once the replication connection is communicating normally, for a first-time replication the master node sends all the data it holds to the slave node. This is the longest step.
    • When master and slave first connect, a full synchronization is performed; after that, synchronization is incremental. The slave can request a full synchronization at any time. Redis always tries incremental synchronization first and falls back to a full synchronization by the slave only if that fails.
  6. Continuous command replication
    • Once the master node has synchronized its current data to the slave node, replication is established. From then on, the master node keeps sending write commands to the slave nodes to keep master and slave data consistent.

Sentinel monitoring


Without Sentinel, when the master goes down, a new master node has to be chosen and switched to manually.

master-slave switching

When the master server goes down, a slave server must be promoted to master. Doing this by hand is troublesome and leaves the service unavailable for a period of time. This is why Sentinel mode exists.

sentinel mode

Sentinel is a special mode of Redis, started with its own sentinel command. A Sentinel runs as an independent process. It monitors multiple running Redis instances by sending them commands and waiting for their responses.


Sentinel role

  • Cluster monitoring: checks whether the Redis master and slave processes are working normally
  • Message notification: if a Redis instance fails, Sentinel sends an alert to the administrator
  • Failover: if the master node goes down, Sentinel automatically promotes a slave node to take over
  • Configuration center: after a failover, Sentinel notifies clients of the new master address

working principle

monitoring stage

  1. Sentinel 1 sends info to the master and the slaves to obtain their full information.
  2. Sentinel 2 sends info to the master, learns from the response that Sentinel 1 already exists, and connects to the slaves.
  3. Sentinel 2 subscribes to Sentinel 1.

notification phase

Sentinel continuously initiates notifications to the master and slave to collect information.

failover phase

In the notification phase, if a Sentinel's message gets no response from the master, that Sentinel marks the master as subjectively down (SRI_S_DOWN) and sends this status to the other Sentinels. The other Sentinels do not take this on trust: each checks the master itself and shares its result. When enough Sentinels agree that the master is down (the configured quorum, commonly more than half), the master is marked as objectively down (SRI_O_DOWN).

voting method

Each Sentinel votes for the candidate whose election request it receives first.

failover

  1. The master and slave nodes in a Sentinel system are no different from ordinary master and slave nodes. Fault detection and failover are controlled and carried out by the Sentinels.
  2. A Sentinel node is essentially a Redis process running in a special mode.
  3. Each Sentinel node only needs to be configured with the master node to monitor; it then discovers the other Sentinel nodes and the slave nodes automatically.
  4. During Sentinel startup and during failover, the configuration files of the nodes are rewritten.

Sentinel mode disadvantages

  • When the master goes down, the Sentinels elect a new master. Redis cannot be accessed during the election, so there is a brief interruption.
  • In Sentinel mode only the master node accepts writes; the slave nodes can only serve reads. Although a single Redis node can handle on the order of 100,000 QPS, all write pressure still falls on the one master during traffic peaks such as holiday promotions.
  • The memory of a single Redis node cannot be set too large. With too much data, master-slave synchronization becomes very slow and node startup takes a long time.

Cluster mode

Redis offers three cluster modes (master-slave mode, Sentinel mode, and Cluster mode). Redis Cluster is a distributed service made up of multiple master-slave node groups. It provides replication, high availability, and sharding.


advantage

  • There are multiple masters, which limits the impact of a brief outage of any one of them
  • There are multiple masters, which allows higher concurrency
  • Data is sharded across nodes, so more data can be stored

Redis Cluster requires at least three master nodes

principle


Redis Cluster divides all data into 16384 slots (slots), and each node is responsible for a part of the slots. The slot information is stored in each node, only the master node will be allocated slots, and the slave nodes will not be allocated slots.

Slot positioning algorithm: k1=127001

By default, Cluster hashes the key with the CRC16 algorithm to get an integer, then takes that integer modulo 16384 to get the slot:

HASH_SLOT = CRC16(key)%16384
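
The formula above can be sketched in plain Java. This is a minimal illustration, not the Redis source: the class and method names (SlotCalculator, crc16, slotFor) are made up for the example, and it assumes the CRC16 variant is CRC-16/XMODEM (polynomial 0x1021, initial value 0), the variant named in the Redis Cluster specification. Hash tags ({...} in key names) are deliberately ignored here.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative key-to-slot mapping; names are hypothetical, hash tags are ignored. */
class SlotCalculator {
    static final int SLOT_COUNT = 16384;

    /** CRC-16/XMODEM: polynomial 0x1021, initial value 0, no bit reflection. */
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                if ((crc & 0x8000) != 0) {
                    crc = (crc << 1) ^ 0x1021;
                } else {
                    crc <<= 1;
                }
                crc &= 0xFFFF; // keep the register at 16 bits
            }
        }
        return crc;
    }

    /** HASH_SLOT = CRC16(key) % 16384 */
    static int slotFor(String key) {
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % SLOT_COUNT;
    }

    public static void main(String[] args) {
        // "123456789" is the standard CRC check input; its CRC is 0x31C3 = 12739
        System.out.println(slotFor("123456789"));
        System.out.println(slotFor("k1"));
    }
}
```

Because the slot depends only on the key, every client and node computes the same slot for the same key without any coordination.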

Redis split-brain

Split-brain in a Redis cluster means that, because of a network problem, the Redis master node ends up in a different network partition from the Redis slave nodes and the Sentinel cluster. Since the Sentinel cluster can no longer see the master, it promotes a slave node to master.


At this point there are two different master nodes, like one brain split into two. If clients keep writing to the original master, the new master never receives that data. When the network problem is resolved, the Sentinel cluster demotes the original master to a slave, which then resynchronizes from the new master, so the writes it accepted during the partition are lost.

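# redis.conf
min-replicas-to-write 1
min-replicas-max-lag 5

# The first parameter requires at least 1 connected slave node
# The second parameter requires that replication lag not exceed 5 seconds
# With both set, the original master rejects client writes if split-brain occurs,
# which avoids losing a large amount of data.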

Redis cache warm up

A newly started system has no cached data, and while the cache is being rebuilt both system performance and database load suffer. It is therefore best to load hot data into the cache before the system goes online. This preloading is called cache warming. Cache warming avoids the outage that can follow when the database has to serve all traffic directly with no cache in front of it.

cache cold start

At a cold start the cache is empty. If the service is opened to external traffic in this state, MySQL may go down as concurrency rises.


Solution

  1. Load some data into Redis in advance, and only then start serving requests.
  2. If the data set is very large, do not try to write all of it to Redis: that would take too long, and Redis cannot hold that much data.
  3. Instead, collect access statistics for the day to identify the frequently accessed hot data, and write that hot data into Redis.
  4. When there is a lot of hot data, use multiple services to read and write it in parallel, so that cache warming is distributed.

cache penetration

Cache penetration refers to requests for data that exists in neither the cache nor the database, yet users keep requesting it, such as data with an id of "-1" or with an extremely large id. Such requests are likely to come from an attacker, and the attack puts excessive pressure on the database.


Operation process: The requested data does not exist in the database, so it is never cached. Every request therefore misses the cache, goes on to query the database, and gets null back (two useless lookups per request). In this way the requests bypass the cache and hit the database directly (a cache hit rate problem).

Solution

  1. Cache empty values: If the query returns no data (whatever the reason), cache the empty result anyway, with a short expiration time of no more than 5 minutes.
  2. Bloom filter: The usual way to decide whether an element is in a set is to store all the elements and compare against them. A Bloom filter answers the same question without storing the elements, so requests for keys that it reports as absent can be rejected before they reach the cache or the database.
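
The first option can be sketched without Redis at all. The example below is an in-memory stand-in, assuming a HashMap in place of Redis and a function in place of the database; the class name NullCachingLookup, the 30-minute TTL for real values, and the databaseCalls counter are all assumptions made for illustration. Only the 5-minute limit for empty results comes from the text above. The class is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** In-memory sketch of "cache empty values"; a Map stands in for Redis. */
class NullCachingLookup {
    private static final class Entry {
        final String value;   // null means "known to be absent"
        final long expiresAt; // epoch milliseconds

        Entry(String value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    static final long HIT_TTL_MS = 30 * 60 * 1000L; // assumed TTL for real values
    static final long MISS_TTL_MS = 5 * 60 * 1000L; // empty results: at most 5 minutes

    private final Map<String, Entry> cache = new HashMap<>();
    private final Function<String, String> database; // returns null when the row is missing
    int databaseCalls = 0;

    NullCachingLookup(Function<String, String> database) {
        this.database = database;
    }

    String get(String key) {
        long now = System.currentTimeMillis();
        Entry entry = cache.get(key);
        if (entry != null && entry.expiresAt > now) {
            return entry.value; // may be a cached "absent" marker (null)
        }
        databaseCalls++;
        String value = database.apply(key);
        long ttl = (value == null) ? MISS_TTL_MS : HIT_TTL_MS;
        cache.put(key, new Entry(value, now + ttl));
        return value;
    }

    public static void main(String[] args) {
        NullCachingLookup lookup =
                new NullCachingLookup(key -> "1".equals(key) ? "product-1" : null);
        lookup.get("-1");
        lookup.get("-1"); // served from the cached empty result
        System.out.println("database calls: " + lookup.databaseCalls);
    }
}
```

With Redis the same idea would typically store a short placeholder string under the key with a short TTL, since Redis cannot store a null value.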

bloom filter

A Bloom filter is an ingenious probabilistic data structure, characterized by efficient insertion and lookup. Underneath, it is a bit array. It can report false positives (an element that was never added may appear to be present), but never false negatives.

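// Add the hutool dependency
<dependency> 
	<groupId>cn.hutool</groupId> 
	<artifactId>hutool-all</artifactId> 
	<version>5.7.17</version> 
</dependency>

// Java implementation
// Initialization: the constructor argument 10 determines the size of the Bloom filter's BitMap

import cn.hutool.bloomfilter.BitMapBloomFilter;

BitMapBloomFilter filter = new BitMapBloomFilter(10);
filter.add("123");
filter.add("abc");
filter.add("ddd");
// true: an element that was added is always reported as present
boolean abc = filter.contains("abc");
System.out.println(abc);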
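
To show what happens inside the bit array, here is a minimal from-scratch sketch. It is not how hutool implements BitMapBloomFilter; the class name SimpleBloomFilter, the two toy hash functions, and the sizes used in main are assumptions chosen for readability rather than for a good false-positive rate.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Toy Bloom filter: k bit positions per item, derived from two simple hashes. */
class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    /** i-th bit position for an item, computed as (h1 + i * h2) mod size. */
    private int position(String item, int i) {
        int h1 = 0;
        int h2 = 0;
        for (byte b : item.getBytes(StandardCharsets.UTF_8)) {
            h1 = 31 * h1 + b;
            h2 = 131 * h2 + b + 7;
        }
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String item) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(position(item, i));
        }
    }

    /** false means definitely absent; true means possibly present. */
    boolean mightContain(String item) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(item, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SimpleBloomFilter filter = new SimpleBloomFilter(1024, 3);
        filter.add("123");
        filter.add("abc");
        filter.add("ddd");
        System.out.println(filter.mightContain("abc")); // true: added items are never missed
    }
}
```

Adding an item only sets bits and never clears them, which is why an added item can never be reported as absent, while unrelated items can collide on bits that are already set.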

cache breakdown

Cache breakdown concerns a single hot key: when its cache entry expires, a large number of requests arrive at the same moment. Because the entry has expired, all of them go to the database, causing a sudden spike in requests and pressure on the database (the data is not in the cache but is in the database).

solution

  1. Mutex lock: Among multiple concurrent requests, only the first thread obtains the lock and queries the database. The other threads cannot obtain the lock, so they block and wait. After the first thread writes the data into the cache, the other threads read it from the cache directly.
  2. Hot data does not expire: Set the cache entry to never expire, and let a scheduled task load the data asynchronously and refresh the cache.
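// Assumes jedis is a connected redis.clients.jedis.Jedis field and db is a
// placeholder for the real data-access object; neither is defined here.
public String get(String key) throws InterruptedException {
		String value = jedis.get(key);
		if (value != null) {
			return value;
		}
		// Cache miss or expired entry: try to take the mutex for this key
		String mutexKey = key + "mutex";
		Long setnx = jedis.setnx(mutexKey, "1");
		// 1 means the lock was acquired
		if (setnx == 1) {
			// 3-minute timeout, so that a failed delete cannot block the next cache load.
			// expire takes seconds (pexpire would take milliseconds).
			// SETNX followed by EXPIRE is not atomic; a single SET with the NX and EX
			// options closes that gap.
			jedis.expire(mutexKey, 3 * 60);
			try {
				// Query the database
				value = db.get(key);
				// Save to the cache
				if (value != null) {
					jedis.setex(key, 3 * 60, value);
				}
			} finally {
				jedis.del(mutexKey);
			}
			return value;
		}
		// Another thread holds the lock and is loading from the database
		// into the cache. Wait briefly, then read the cache again.
		Thread.sleep(50);
		return get(key);
	}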

cache avalanche

Cache avalanche means that many cache entries were set with the same expiration time, so they all expire at the same moment. All requests are then forwarded to the DB, and the sudden pressure is heavy enough to bring it down like an avalanche.

Figure: normal case, data is read from the Redis cache

Figure: cache invalidation

Solution

  1. Spread out the expiration times: Add a random offset to each cache expiration time so that the keys expire at different moments instead of all at once.
  2. Hot data does not expire: Same as for cache breakdown. The refresh interval and the handling of database exceptions need careful thought.
  3. Mutex lock: Same as for cache breakdown, with locking per key. For a given key, only one thread computes the result; the other threads wait for it and then read it from the cache.
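 // Assumes jedis is a connected redis.clients.jedis.Jedis field and db is a
 // placeholder for the real data-access object; neither is defined here.
 public Object GetProductListNew(String cacheKey) {
		// Cache TTL in seconds
		int cacheTime = 30;
		// Fetch the cached value for the key
		String cacheValue = jedis.get(cacheKey);
		// Cache entry still valid: return it
		if (cacheValue != null) {
			return cacheValue;
		}
		// Lock per key. intern() makes equal key strings share one lock object;
		// this only coordinates threads inside a single JVM.
		synchronized (cacheKey.intern()) {
			// Re-check: another thread may have filled the cache while we waited
			cacheValue = jedis.get(cacheKey);
			if (cacheValue != null) {
				return cacheValue;
			}
			// This is normally a SQL query
			cacheValue = db.get(cacheKey);
			// Add the result to the cache
			if (cacheValue != null) {
				jedis.setex(cacheKey, cacheTime, cacheValue);
			}
			return cacheValue;
		}
	}
// Locking and queuing only relieves pressure on the database; it does not improve system throughput.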
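
The first option in the list above, spreading out expiration times, can be sketched as a small helper. The class name JitteredTtl, the 30-minute base TTL, and the 5-minute jitter window are assumptions chosen for illustration; the helper only computes the TTL, which would then be passed to whatever call sets the key's expiration.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Computes cache TTLs with a random offset so keys do not all expire together. */
class JitteredTtl {
    /** Returns baseSeconds plus a random offset in [0, maxJitterSeconds]. */
    static int withJitter(int baseSeconds, int maxJitterSeconds) {
        return baseSeconds + ThreadLocalRandom.current().nextInt(maxJitterSeconds + 1);
    }

    public static void main(String[] args) {
        for (String key : new String[] {"product:1", "product:2", "product:3"}) {
            // Assumed values: 30-minute base TTL, up to 5 minutes of jitter
            int ttl = withJitter(30 * 60, 5 * 60);
            System.out.println(key + " expires in " + ttl + " s");
        }
    }
}
```

A wider jitter window spreads the reload work over a longer period, at the cost of less predictable entry lifetimes.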


Origin blog.csdn.net/qq_43545600/article/details/126877695