Redis overview
Redis is an open source key-value database written in ANSI C. It is memory-based with optional persistence, supports network access, and provides a variety of data structures.
Features:
- Memory-based operation, high performance
- Supports distributed deployment and, in theory, unlimited horizontal expansion
- Key-value storage system
- Open source, written in ANSI C, released under the BSD license, with client APIs in many languages
NoSQL (Not Only SQL) generally refers to non-relational databases. With the rise of Web 2.0 sites, traditional relational databases struggled to cope with very large, highly concurrent, purely dynamic websites, and exposed many problems that were hard to overcome.
Structured and unstructured data
- Structured data is data that is logically expressed as a two-dimensional table and strictly follows format and length specifications. It is also called row data.
- Unstructured data has an irregular or incomplete structure and no predefined data model, so it is inconvenient to represent in two-dimensional tables. Examples include office documents (Word), text, pictures, HTML, reports, video and audio.
Four categories of NoSQL
Key-value NoSQL (Redis)
As the name suggests, a key-value NoSQL database stores data as key-value pairs. Its biggest advantage is high performance: benchmarked with the redis-benchmark tool that ships with Redis, throughput can reach the order of 100,000 operations per second.
- Data is held in memory, so reads and writes are efficient
- Looking up a value by key takes O(1) time, so queries are fast
Columnar NoSQL (HBase)
Columnar NoSQL, represented by HBase, is one of the most representative technologies of the big data era.
- A query reads only the specified columns, not all columns
- The data of a column is stored together, so one disk I/O can read a whole column into memory
Document NoSQL (MongoDB)
Document NoSQL stores semi-structured data as documents, usually in JSON or XML format. A relational database stores each field in its own column, whereas MongoDB stores a record as a single JSON-style document.
Search NoSQL (Elasticsearch)
Traditional relational databases rely mainly on indexes for fast queries, but indexes are of little help for full-text search. First, LIKE queries cannot satisfy all fuzzy-matching requirements; second, they are heavily restricted and can easily cause slow queries when misused. Search NoSQL was created to address the weak full-text search capability of relational databases, and Elasticsearch is its representative product.
The difference between relational and non-relational
Relational
The most typical data structure of a relational database is the table: data is organized as two-dimensional tables and the relationships between them.
Advantages:
- Easy to maintain: everything uses the table structure, so the format is consistent;
- Easy to use: SQL is a common language;
- Complex operations: SQL supports very complex queries within one table and across multiple tables.
Disadvantages:
- Read and write performance is relatively poor, especially for high-throughput reads and writes of massive data;
- The table structure is fixed, so flexibility is limited.
Non-relational
Advantages:
- Flexible format: stored data can be key-value pairs, documents, pictures and so on, which suits a wide range of scenarios, while relational databases only support basic types.
- Speed: NoSQL can use either disk or RAM as the storage medium, while relational databases mainly rely on disk;
- High scalability;
- Low cost: NoSQL databases are easy to deploy and are mostly open source.
Disadvantages:
- No SQL support, so the cost of learning and use is relatively high;
- Limited or no transaction support;
- The data structures are relatively complex, and support for complex queries is weaker.
Getting Started with Redis
Redis is a dictionary-structured storage server. A Redis instance provides multiple dictionaries (databases) for storing data, and the client can specify which one to use. Redis supports 16 databases by default. You can change this number through the databases parameter in the configuration file redis/redis.conf; restart Redis for the change to take effect.
Does Redis use multiple threads or a single thread?
Because Redis operates on memory, the CPU is usually not its bottleneck; the bottleneck is most likely the size of the machine's memory or the network bandwidth. Since a single-threaded design is easy to implement and the CPU does not become a bottleneck, adopting a single-threaded solution is logical. Redis uses I/O multiplexing to keep throughput high when there are many connections.
IO multiplexing technology
Redis data types
String
String is the most basic Redis type: one key corresponds to one value. String is binary safe, meaning it can contain any data, such as a serialized object or an image. A String value can hold up to 512 MB.
Use cases:
- Numeric values (the value can be a number as well as a string)
- Counters
- Counts aggregated across multiple units
- Number of followers
- Object cache storage
- Distributed locks
List
A List is simply a list of strings sorted in insertion order; elements can be added at the head (left) or the tail (right). The underlying structure is a doubly linked list, so operations at both ends are extremely fast, while accessing nodes in the middle by index is slow.
A List can contain up to 2^32 - 1 elements (more than 4 billion elements per list).
Use cases:
- Message queue
- Leaderboard
- Latest-items list
Set
Set is similar to List, but it removes duplicates automatically. If you need to store a collection of data and do not want duplicates, Set is a good choice. Set is an unordered collection of Strings. Its underlying structure is a hash table whose values are null, so adding, deleting and looking up elements all take O(1) time.
Use cases:
- Blacklists and whitelists
- Random display
- Friends
- People a user follows
- Followers
- Groups of users with common interests
Hash
Hash is a collection of key-value pairs: a mapping from String fields to String values. Hash is especially suitable for storing objects.
Hash structure optimization:
- If the number of fields is small, the data is stored in a compact array-like structure
- If the number of fields is large, the data is stored in a hash table (HashMap-like) structure
Use cases:
- Shopping cart
- Object storage
ZSet
ZSet (sorted set) is very similar to Set: it is a collection of Strings without duplicate elements. The difference is that each ZSet element is associated with a score, which is used to sort the set from the lowest score to the highest. Elements are unique, but scores may repeat.
Use cases:
- Delay queue
- Leaderboard
- Rate limiting
Bitmaps
Bitmaps are not a data structure of their own; a bitmap is actually a String whose individual bits can be operated on. Sensible use of bits can improve both memory usage and development efficiency.
Use cases:
- Active days
- Check-in days
- Login days
- User sign-in
- Counting active users
- Tracking whether a user is online
- Implementing a Bloom filter
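The sign-in use case can be sketched without a Redis server. The following is only an illustration, assuming one bit per day: the class name SignInBitmap and its methods are hypothetical, and java.util.BitSet stands in for the Redis string that SETBIT, GETBIT and BITCOUNT would operate on.

```java
import java.util.BitSet;

// Illustrative stand-in for a Redis bitmap key; this is not a Redis client.
final class SignInBitmap {
    private final BitSet bits = new BitSet();

    // Comparable to SETBIT key dayOffset 1: mark the user as signed in on that day.
    void signIn(int dayOffset) {
        bits.set(dayOffset);
    }

    // Comparable to GETBIT key dayOffset.
    boolean signedIn(int dayOffset) {
        return bits.get(dayOffset);
    }

    // Comparable to BITCOUNT key: the number of days whose bit is set.
    int totalDays() {
        return bits.cardinality();
    }

    public static void main(String[] args) {
        SignInBitmap month = new SignInBitmap();
        month.signIn(0);
        month.signIn(1);
        month.signIn(4);
        System.out.println(month.signedIn(1)); // true
        System.out.println(month.signedIn(2)); // false
        System.out.println(month.totalDays()); // 3
    }
}
```

One bit per day means a whole year of sign-in history for a user fits in 46 bytes, which is why bitmaps are attractive for this kind of counting.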
Geospatial
GEO (geographic information) stores the two-dimensional coordinates of an element, that is, its longitude and latitude on the map. On top of this type, Redis provides common operations such as setting and querying coordinates, range queries, distance queries and geohash computation.
Use cases:
- Nearby cinemas
- Nearby friends
- The nearest hot pot restaurant
Redis configuration file
UNITS
Defines the size units used in the file. The basic units are declared at the beginning; only bytes are supported (not bits), and unit names are case insensitive.
INCLUDES
Redis normally has a single configuration file. When several people develop and maintain an instance, several configuration files may be needed. Additional files can be pulled in here with include /path/to/local.conf, while the original redis.conf acts as the main entry point.
NETWORK
Parameters:
- bind: the IP address of the network interface the Redis server binds to. The default is 127.0.0.1, the local loopback address, so Redis can only be reached by local clients and not remotely. If bind is empty, connections from all available network interfaces are accepted.
- port: the port Redis listens on; the default is 6379. Because Redis is single-threaded, several Redis processes are often run on one machine, and each then needs a different port.
- timeout: the client idle timeout in seconds. If a client issues no command within this period, the connection is closed. The default value is 0, which means connections are never closed for being idle.
- tcp-keepalive: in seconds. Redis periodically uses SO_KEEPALIVE to check whether the client is still healthy, so that the server does not wait forever on a dead connection. The officially suggested value is 300 seconds. If set to 0, no periodic check is done.
GENERAL
Detailed explanation:
- daemonize: set to yes to start Redis as a daemon (in the background). The default value is no.
- pidfile: the PID file path. When Redis runs as a daemon, it writes its PID to /var/redis/run/redis_6379.pid by default.
- loglevel: the log level. The default value is notice, and there are 4 possible values:
- debug (records a large amount of information, suitable for development and testing)
- verbose (a lot of information)
- notice (a moderate amount of information, used in production)
- warning (only important and critical information is recorded)
- logfile: the log file path; by default the log is written to standard output, that is, the terminal window
- databases: the number of databases. The default database is DB 0; use the SELECT command on a connection to choose another database, where dbid is between 0 and databases - 1. The default value is 16.
SNAPSHOTTING
Parameters:
- save: configures the conditions that trigger RDB persistence, that is, when the in-memory data is saved to disk
- save 900 1: save if at least 1 key changes within 900 seconds
- save 300 10: save if at least 10 keys change within 300 seconds
- save 60 10000: save if at least 10000 keys change within 60 seconds
REPLICATION
Parameters:
- slave-serve-stale-data: the default value is yes. When a slave loses contact with the master, or while replication is still in progress, the slave can behave in two ways:
- If yes, the slave still responds to client requests, but the returned data may be stale, or empty during the first synchronization
- If no, the slave returns a "SYNC with master in progress" error for every command except INFO and SLAVEOF
- slave-read-only: whether a slave instance accepts write operations, that is, whether it is read-only. The default value is yes.
- repl-diskless-sync: whether to use diskless replication for master-slave synchronization. The default value is no.
- repl-diskless-sync-delay: when diskless replication is enabled, the server waits for a configurable period before sending the RDB data to the slaves over the socket.
- repl-disable-tcp-nodelay: whether to disable TCP_NODELAY on the slave socket after synchronization. If yes, Redis uses fewer TCP packets and less bandwidth to send data to the slave, at the cost of higher replication latency.
CLIENTS
Parameters:
- maxclients: the maximum number of concurrent client connections. The upper bound is the maximum number of file descriptors the Redis process can open minus 32 (the server itself uses some). In the configuration described here, setting maxclients to 0 means no limit, which is also the default; recent Redis versions default to 10000 instead. When the limit is reached, Redis closes new connections and returns a "max number of clients reached" error to the client.
MEMORY MANAGEMENT
Parameters:
- maxmemory: the maximum memory Redis may use; 0 means no limit. It is usually used together with the maxmemory-policy parameter described below.
- maxmemory-policy: the eviction policy Redis applies when memory usage reaches maxmemory. The options are:
- 1) volatile-lru: evict keys that have an expiration time set, using the LRU (Least Recently Used) algorithm
- 2) allkeys-lru: evict any key using the LRU algorithm
- 3) volatile-random: evict random keys among those with an expiration time set
- 4) allkeys-random: evict random keys
- 5) volatile-ttl: evict the keys closest to expiring (smallest TTL)
- 6) noeviction: evict nothing and return an error on writes; this is the default
- maxmemory-samples: the LRU and smallest-TTL algorithms are approximations rather than exact algorithms (to save memory). Redis checks a sample of keys, 3 by default according to this description, and you can set the sample size through maxmemory-samples.
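To make the allkeys-lru idea concrete, here is a minimal sketch of exact LRU eviction built on java.util.LinkedHashMap. It is a simplification under stated assumptions: the class LruCacheSketch and its maxEntries limit are hypothetical, the limit counts entries rather than bytes, and Redis itself approximates LRU by sampling keys instead of keeping a fully ordered list.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Exact LRU with a fixed entry limit; a simplified stand-in for maxmemory + allkeys-lru.
final class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries; // hypothetical limit that plays the role of maxmemory

    LruCacheSketch(int maxEntries) {
        // accessOrder = true: iteration runs from least to most recently used
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after every insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCacheSketch<String, String> cache = new LruCacheSketch<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // "a" becomes the most recently used key
        cache.put("c", "3"); // the limit is exceeded, so "b" is evicted
        System.out.println(cache.keySet()); // [a, c]
    }
}
```

With noeviction, the equivalent behavior would be to reject the third put instead of evicting a key.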
APPEND ONLY MODE
Parameters:
- appendonly: by default Redis persists data with RDB, which is sufficient for many applications. However, because RDB saves according to the save rules, a crash may lose several minutes of data. Append Only File (AOF) is another persistence method that offers better durability: Redis appends every write it receives to the appendonly.aof file, and at startup it loads this file into memory first, ignoring the RDB file. The default value is no.
- appendfilename: the AOF file name; the default is "appendonly.aof"
- appendfsync: the AOF fsync policy. no means Redis does not call fsync and leaves flushing to the operating system, which is fastest; always means fsync on every write, so each write reaches the disk; everysec means fsync once per second, so up to 1 second of data may be lost.
LUA SCRIPTING
lua-time-limit: The maximum execution time of a lua script, in ms. The default value is 5000.
REDIS CLUSTER
Parameters:
- cluster-enabled: the cluster switch; cluster mode is disabled by default.
- cluster-config-file: the name of the cluster configuration file.
- cluster-node-timeout: the node timeout threshold in milliseconds; a typical value is 15000.
- cluster-slave-validity-factor: a typical value is 10.
SECURITY
requirepass: Set the redis connection password.
Redis publish and subscribe
Redis publish-subscribe (pub/sub) is a message communication mode: the sender (pub) sends the message, and the subscriber (sub) receives the message.
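// Subscribe syntax
SUBSCRIBE channel-name
// Example:
127.0.0.1:6379> SUBSCRIBE channel-1
Reading messages... (press Ctrl-C to quit)
1) "subscribe"
2) "channel-1"
3) (integer) 1
// Publish command
PUBLISH channel-1 hello
// Example: open another client and publish the message hello to channel-1
127.0.0.1:6379> PUBLISH channel-1 hello
(integer) 1 // the returned 1 is the number of subscribers that received the message
// The first client now shows the published message
127.0.0.1:6379> SUBSCRIBE channel-1
Reading messages... (press Ctrl-C to quit)
1) "subscribe"
2) "channel-1"
3) (integer) 1
1) "message"
2) "channel-1"
3) "hello"
// Published messages are not persisted: a client that subscribes after hello was published will not receive it and only receives messages published after it subscribed.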
Redis slow query
The process of Redis command execution
- Slow queries occur in phase 3, when the command is executed
- A client timeout is not necessarily caused by a slow query, but slow queries are one possible cause of client timeouts
- The slow query log is stored in an in-memory list inside Redis
Slow query log
Before and after executing each command, the Redis server measures its execution time and records the command in the slow query log when the time exceeds a threshold. Each entry records when the slow query occurred, how long it took, the command itself and other information, which helps developers and operators locate slow queries in the system.
Get slow query log
Use the SLOWLOG GET command to obtain slow query logs. A number after SLOWLOG GET limits how many entries are returned. For example, to obtain 3 slow query logs:
127.0.0.1:6379> SLOWLOG GET 3
Configure slow query log
- slowlog-log-slower-than: the execution-time threshold. A command whose execution time exceeds this threshold is recorded.
- slowlog-max-len: the maximum number of entries kept in the slow query log. Redis stores the log in a list, and slowlog-max-len is the maximum length of that list.
View slow query log
127.0.0.1:0>config get slow*
1) "slowlog-max-len"
2) "128"
3) "slowlog-log-slower-than"
4) "10000"
Here 10000 is the threshold in microseconds, that is, 10 milliseconds, and 128 is the maximum number of slow log entries kept.
Modify the Redis configuration file
For example, set slowlog-log-slower-than to 1000 and slowlog-max-len to 1200:
- slowlog-log-slower-than 1000
- slowlog-max-len 1200
Practical advice
slowlog-max-len configuration suggestions
- In production it is recommended to enlarge the slow query list. Redis truncates long commands when recording slow queries, so the list does not take up much memory.
- A larger list reduces the chance that slow query entries are evicted before they are examined. For example, it can be set to 1000 or more in production.
slowlog-log-slower-than configuration suggestions
- By default, a command that takes more than 10 milliseconds is treated as a slow query; adjust the value according to the concurrency Redis has to handle.
- Because Redis responds to commands with a single thread, if each command takes more than 1 millisecond, Redis can serve fewer than 1000 OPS. For high-OPS scenarios it is therefore recommended to set the threshold to 1 millisecond.
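How the two settings interact can be sketched as a bounded in-memory list. This is a rough model, not Redis internals: the class SlowLogSketch and its method names are hypothetical, thresholdMicros plays the role of slowlog-log-slower-than and maxLen the role of slowlog-max-len.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// In-memory model of the slow query log; names are illustrative only.
final class SlowLogSketch {
    private final long thresholdMicros; // plays the role of slowlog-log-slower-than
    private final int maxLen;           // plays the role of slowlog-max-len
    private final Deque<String> entries = new ArrayDeque<>();

    SlowLogSketch(long thresholdMicros, int maxLen) {
        this.thresholdMicros = thresholdMicros;
        this.maxLen = maxLen;
    }

    // Record a command only if its execution time exceeded the threshold.
    void record(String command, long durationMicros) {
        if (durationMicros <= thresholdMicros) {
            return;
        }
        entries.addFirst(command + " (" + durationMicros + " us)");
        if (entries.size() > maxLen) {
            entries.removeLast(); // the oldest entry is dropped once the list is full
        }
    }

    // Newest entries first, limited to count (similar in spirit to SLOWLOG GET count).
    List<String> get(int count) {
        List<String> result = new ArrayList<>();
        for (String entry : entries) {
            if (result.size() == count) {
                break;
            }
            result.add(entry);
        }
        return result;
    }

    public static void main(String[] args) {
        SlowLogSketch log = new SlowLogSketch(10000, 2);
        log.record("GET a", 50);        // below the threshold, not recorded
        log.record("KEYS *", 20000);
        log.record("SORT big", 30000);
        log.record("SMEMBERS s", 40000); // pushes out "KEYS *"
        System.out.println(log.get(10));
    }
}
```

The sketch shows why a small slowlog-max-len loses history: with a length of 2, the first slow command has already disappeared by the time the third one is recorded.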
Pipeline
With a pipeline, time for 1 pipeline (n commands) = 1 network time + n command times, which greatly reduces the network overhead. This is what pipelining means.
Single-command network communication model
Time for 1 command = 1 network time + 1 command time
Batch network communication model
Time for n commands = n network times + n command times
// Add the Jedis dependency:
<dependency>
<groupId>redis.clients</groupId>
<artifactId>jedis</artifactId>
<version>2.9.0</version>
</dependency>
// Without pipeline
// Requires: import redis.clients.jedis.Jedis;
Jedis jedis = new Jedis("127.0.0.1", 6379);
for (int i = 0; i < 10000; i++) {
    jedis.hset("hashkey:" + i, "field" + i, "value" + i);
}
/**
 * Without a pipeline, the loop sends one command per round trip;
 * 10,000 insert commands may take around 50 seconds.
 */
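// With pipeline
// Requires: import redis.clients.jedis.Jedis; import redis.clients.jedis.Pipeline;
Jedis jedis = new Jedis("127.0.0.1", 6379);
// Send 10,000 commands as 100 pipelines of 100 commands each
for (int i = 0; i < 100; i++) {
    Pipeline pipeline = jedis.pipelined();
    for (int j = i * 100; j < (i + 1) * 100; j++) {
        pipeline.hset("hashkey:" + j, "field" + j, "value" + j);
    }
    // Flush the batch to the server and read all replies
    pipeline.syncAndReturnAll();
}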
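The two timing models from this section can be compared numerically. The sketch below only evaluates the formulas; the class PipelineTiming is hypothetical, and the 1000 microsecond round trip and 10 microsecond command time are assumed figures chosen for illustration, not measurements.

```java
// Evaluates the two timing formulas; all figures are assumptions, not benchmarks.
final class PipelineTiming {
    // n commands sent one at a time: every command pays one network round trip.
    static long withoutPipelineMicros(int n, long networkMicros, long commandMicros) {
        return n * (networkMicros + commandMicros);
    }

    // n commands in one pipeline: a single round trip plus n command executions.
    static long withPipelineMicros(int n, long networkMicros, long commandMicros) {
        return networkMicros + n * commandMicros;
    }

    public static void main(String[] args) {
        // Assumed: 1000 us per round trip, 10 us per command, 10,000 commands.
        System.out.println(withoutPipelineMicros(10000, 1000, 10)); // 10100000
        System.out.println(withPipelineMicros(10000, 1000, 10));    // 101000
    }
}
```

Under these assumptions the network term dominates the unpipelined case, which is why batching helps far more than speeding up individual commands.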
Redis persistence mechanism
Since all Redis data is stored in memory, without persistence all data is lost when Redis restarts. To keep data across restarts, enable persistence: the in-memory data is saved to disk as files, and when Redis restarts or the server fails, the data can be restored from those files.
The significance of Redis persistence lies in failure recovery. For example, a Redis instance deployed as a cache may also hold some important data.
Redis provides two different forms of persistence
RDB persistence is a full backup, which is time-consuming, so Redis also provides a more efficient AOF (Append Only File) persistence option. Its working principle in brief: the AOF log stores the sequence of commands received by the Redis server, and only commands that modify memory are recorded.
- RDB(Redis DataBase)
- AOF(Append Only File)
RDB persistence mechanism
RDB writes a snapshot of the in-memory dataset to disk at specified time intervals, and reads the snapshot directly back into memory when restoring. The snapshot is a compressed binary file.
Configure dump.rdb
The file name used by RDB is configured in redis.conf; the default is dump.rdb. The storage location of the RDB file can also be changed. By default it is the directory from which Redis was started on the command line.
Trigger mechanism - three main ways
RDB configuration
Snapshot default configuration:
- save 3600 1: If at least one key value changes within 3600 seconds (one hour), save it.
- save 300 100: If at least 100 key values change within 300 seconds (five minutes), save.
- save 60 10000: If at least 10000 key values change within 60 seconds, save.
Configure new save rules
Add a new snapshot rule to redis.conf: if 5 keys change within 30 seconds, a snapshot is triggered. After modifying the configuration, restart the Redis service.
- save 3600 1
- save 300 100
- save 60 10000
- save 30 5
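One way to read the four rules above is as a list of (seconds, changes) pairs, any one of which can trigger a snapshot. The sketch below is a simplified interpretation, not Redis source code: the class SaveRuleSketch is hypothetical, and exact boundary handling in Redis may differ.

```java
// Simplified reading of "save <seconds> <changes>" rules; not Redis's actual code.
final class SaveRuleSketch {
    // Each rule is {seconds, changes}, mirroring the four save lines above.
    static final int[][] RULES = { {3600, 1}, {300, 100}, {60, 10000}, {30, 5} };

    // A snapshot is due when any rule has both its time window elapsed
    // and at least its number of key changes since the last save.
    static boolean snapshotDue(long secondsSinceLastSave, long changesSinceLastSave) {
        for (int[] rule : RULES) {
            if (secondsSinceLastSave >= rule[0] && changesSinceLastSave >= rule[1]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(snapshotDue(30, 5));     // true: matches "save 30 5"
        System.out.println(snapshotDue(60, 4));     // false: too few changes for every rule
        System.out.println(snapshotDue(29, 10000)); // false: no time window has elapsed yet
    }
}
```

Adding the save 30 5 rule makes snapshots much more frequent under light write load, which shortens the window of possible data loss at the cost of more fork and disk activity.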
flushall
Executing the flushall command also triggers an RDB save.
save and bgsave
There are two commands to trigger RDB persistence manually:
1. save: blocks the current Redis server. While save runs, Redis cannot process other commands until the RDB process completes, so it is not recommended.
2. bgsave: Redis performs the snapshot asynchronously in the background and can still respond to client requests while the snapshot is taken.
Advanced configuration
stop-writes-on-bgsave-error
The default value is yes. When Redis cannot write the snapshot to disk, it stops accepting write operations.
rdbcompression
The default value is yes. Controls whether snapshots stored on disk are compressed. If enabled, Redis compresses them with the LZF algorithm. If you do not want to spend CPU on compression you can turn it off, but the snapshot on disk will be larger.
rdbchecksum
The default value is yes. After storing the snapshot, Redis can verify the data with the CRC64 algorithm, but this costs about 10% in performance. Turn it off if you want maximum performance.
Data recovery
Just put the rdb file in the Redis startup directory, and when Redis starts, it will automatically load dump.rdb and restore the data.
Advantages and disadvantages
Advantages
Suitable for large-scale data recovery.
A good fit when the requirements for data integrity and consistency are not high.
Saves disk space.
Fast recovery.
Disadvantages
Backups are taken at intervals, so if Redis goes down unexpectedly, all modifications made after the last snapshot are lost.
AOF persistence mechanism
AOF records every write operation in a log, that is, all write commands executed by Redis. AOF is not enabled by default. The file name is configured in redis.conf; the default is appendonly.aof.
The AOF file is saved in the same path as the RDB file. If AOF and RDB are both enabled, Redis loads its data from the AOF file by default.
AOF start/repair/recovery
Enable AOF: change the default appendonly no to appendonly yes
AOF synchronization frequency setting
Parameters:
1. appendfsync always: always synchronize. Every Redis write is written to the log immediately; performance is poorer but data integrity is better.
2. appendfsync everysec: synchronize once per second. If Redis goes down, the data of the last second may be lost.
3. appendfsync no: Redis does not synchronize actively and leaves the timing to the operating system.
Advantages and disadvantages
Advantages
The backup mechanism is more robust and the probability of data loss is lower.
The log is readable text, so accidental operations can be handled by editing the AOF file.
Disadvantages
Occupies more disk space than RDB.
Restoring from it is slower.
If every write is synchronized, there is some performance pressure.
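The append-and-replay idea behind AOF can be shown with a toy model. This is a sketch under simplifying assumptions: the class AofSketch is hypothetical, the log is an in-memory list rather than the appendonly.aof file, and only SET and DEL are modeled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy append-only log: only write commands are logged, and state is rebuilt by replay.
final class AofSketch {
    private final Map<String, String> data = new HashMap<>();
    private final List<String[]> log = new ArrayList<>(); // stands in for appendonly.aof

    void set(String key, String value) {
        data.put(key, value);
        log.add(new String[] {"SET", key, value});
    }

    void del(String key) {
        data.remove(key);
        log.add(new String[] {"DEL", key});
    }

    // Reads do not modify memory, so nothing is appended to the log.
    String get(String key) {
        return data.get(key);
    }

    int logLength() {
        return log.size();
    }

    // Rebuild a fresh dataset by re-executing every logged command in order.
    Map<String, String> replay() {
        Map<String, String> restored = new HashMap<>();
        for (String[] cmd : log) {
            if (cmd[0].equals("SET")) {
                restored.put(cmd[1], cmd[2]);
            } else if (cmd[0].equals("DEL")) {
                restored.remove(cmd[1]);
            }
        }
        return restored;
    }

    public static void main(String[] args) {
        AofSketch db = new AofSketch();
        db.set("a", "1");
        db.set("a", "2");
        db.get("a");
        db.set("b", "3");
        db.del("b");
        System.out.println(db.logLength()); // 4 logged writes
        System.out.println(db.replay());    // {a=2}
    }
}
```

Four log entries are needed to describe a dataset that ends up holding one key, which illustrates both why AOF files are larger than RDB snapshots and why replaying them is slower.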
Choosing a persistence mechanism
Don't use only RDB
RDB snapshot files are generated every 5 minutes or longer, so you have to accept that if the Redis process goes down, the data of the last 5 minutes or more is lost.
Don't use only AOF either
1. Restoring from an AOF cold backup is slower than restoring from an RDB cold backup.
2. RDB simply generates a full data snapshot each time, which is more robust and avoids the bugs that a complex backup and recovery mechanism such as AOF can have.
Use AOF and RDB together
Use AOF to ensure that data is not lost and as the first choice for data recovery. Use RDB for cold backups at various intervals, so that when the AOF files are lost or corrupted and unusable, RDB can still be used for fast recovery.
Advantages of RDB:
1. Smaller size: for the same data, the RDB file is smaller than the AOF file because RDB is a compact format
2. Faster recovery: RDB is a snapshot of the data, essentially a copy, so it can be loaded without replaying writes
3. Higher performance: to save an RDB file the parent process only needs to fork a child process and performs no other disk I/O itself, which protects server performance.
Disadvantages:
1. Data loss on failure: RDB is a full snapshot, so backups are usually taken every 30 minutes, every hour or every day, for example with a shell script (the built-in save rules can also be used). Backups are rarely taken more often than every 5 minutes, so when the service dies, at least several minutes of data may be lost.
2. Heavy backups: compared with AOF's incremental appends, every RDB backup is a full copy. Even though a forked child process does the work, the disk cost cannot be ignored when the dataset is large, and under heavy traffic the fork itself takes longer and puts pressure on the CPU, so durability is comparatively poor.
Advantages of AOF:
1. Data guarantee: you can choose the fsync policy. The usual setting is everysec, so even if the service dies you lose at most about one second of data; you can also fsync on every write.
2. Automatic shrinking: when the AOF file reaches a certain size, Redis rewrites it in the background without affecting the main process. Once the rewrite completes, new writes go to the new AOF file and the old one is deleted. Compared with RDB this is arguably not an advantage, although the official website lists it as one.
Disadvantages:
1. Relatively poor performance: its way of working inevitably costs Redis some performance
2. Relatively larger size: even after rewriting, a log of operations is still quite different from the resulting data, so the file is larger.
Redis transaction
The essence of a Redis transaction is a set of commands. A transaction executes multiple commands at once, and all commands in it are serialized: they are executed in queue order, and command requests from other clients are not inserted into the transaction's command sequence.
A Redis transaction is the one-time, sequential and exclusive execution of a series of commands in a queue.
Redis transactions have no concept of isolation level
Before EXEC is sent, commands are only placed in the queue and are not actually executed. The question of whether queries inside a transaction can see its updates while queries outside cannot therefore does not arise.
Redis does not guarantee atomicity
In Redis a single command is atomic, but a transaction is not guaranteed to be atomic and there is no rollback. If a command in the transaction fails during execution, the remaining commands are still executed.
Three phases of a Redis transaction
- Begin the transaction
- Queue commands
- Execute the transaction
Redis transaction related commands
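watch key1 key2 ... : watch one or more keys; if a watched key is modified by another command before the transaction executes, the transaction is aborted (similar to optimistic locking)
multi : mark the start of a transaction block (subsequent commands are queued)
exec : execute all commands in the transaction block (once exec runs, all previously set watches are cancelled)
discard : cancel the transaction and drop all commands in the transaction block
unwatch : cancel the watch on all keys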
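The queueing and no-rollback behavior described in this section can be modeled in a few lines. This is a toy model, not a Redis client and not how Redis is implemented: the class TxSketch is hypothetical, only SET and INCR are modeled, and WATCH is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of MULTI/EXEC queueing; illustrative names only.
final class TxSketch {
    final Map<String, String> data = new HashMap<>();
    private final List<String[]> queue = new ArrayList<>();

    // After MULTI, commands are only queued and not executed.
    void enqueue(String... command) {
        queue.add(command);
    }

    // EXEC runs every queued command in order. A failing command is reported
    // in the replies but does not stop or undo the others (no rollback).
    List<String> exec() {
        List<String> replies = new ArrayList<>();
        for (String[] command : queue) {
            try {
                replies.add(apply(command));
            } catch (NumberFormatException e) {
                replies.add("ERR value is not an integer or out of range");
            }
        }
        queue.clear();
        return replies;
    }

    private String apply(String[] c) {
        if (c[0].equals("SET")) {
            data.put(c[1], c[2]);
            return "OK";
        }
        // INCR: fails at execution time if the stored value is not a number.
        long next = Long.parseLong(data.getOrDefault(c[1], "0")) + 1;
        data.put(c[1], Long.toString(next));
        return Long.toString(next);
    }

    public static void main(String[] args) {
        TxSketch tx = new TxSketch();
        tx.enqueue("SET", "k1", "abc");
        tx.enqueue("INCR", "k1");      // will fail when executed
        tx.enqueue("SET", "k2", "v2"); // still runs after the failure
        System.out.println(tx.exec());
        System.out.println(tx.data.get("k2")); // v2
    }
}
```

The second command fails only when it runs, and the third command is still applied, which is the sense in which a Redis transaction is not atomic.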
Atomicity, Consistency, Isolation, and Durability
- Atomicity: all operations in a transaction either complete or do not happen at all; the transaction never stops partway. If an error occurs during execution, the transaction is rolled back to the state before it started, as if it had never been executed.
- Consistency: the integrity of the database is not violated before the transaction begins or after it ends. Written data must comply with all preset rules, including data accuracy and relational integrity, and the database must afterwards be able to carry out its scheduled work on its own.
- Isolation: the ability of the database to let multiple concurrent transactions read, write and modify its data at the same time. Isolation prevents the inconsistency that interleaved execution of concurrent transactions could cause. There are several isolation levels: read uncommitted, read committed, repeatable read and serializable.
- Durability: once a transaction completes, its changes to the data are permanent and are not lost even if the system fails.
Master-slave replication
Why use master-slave replication?
Machine failure: with a single Redis deployment, a server failure means migrating to another server while making sure the data stays synchronized.
Capacity bottleneck: when Redis memory needs to grow, for example from 16 GB to 64 GB, a single machine generally cannot meet the need.
Solution
Replicate the data to other nodes, giving redundant backups of the data and high availability of both data and service.
Master-slave replication means copying the data of one Redis server to other Redis servers. The former is called the master node and the latter the slave nodes. Data is replicated in one direction only, from the master node to the slave nodes.
What master-slave replication provides
- Data redundancy: a hot backup of the data, a form of redundancy in addition to persistence.
- Fault recovery: when the master node has a problem, a slave node can provide service, allowing rapid recovery (in effect a kind of service redundancy).
- Load balancing: combined with read-write separation, the master node serves writes and the slave nodes serve reads (the application connects to the master to write and to a slave to read), sharing the server load.
- High availability: master-slave replication is also the foundation on which Sentinel and Cluster are built.
Sentinel mode: an automated version of manually promoting a slave to master. It monitors whether the master has failed; if so, one of the slaves is elected as the new master by vote.
Master-slave replication principle
Master-slave replication is divided into three phases and six steps
three phases
- Connection establishment phase
- Data synchronization phase
- Command propagation phase
six steps
- Save master node (master) information
- Master-slave connection: The slave node (slave) maintains the replication logic through a scheduled task that runs every second. When the task finds a new master node, it tries to establish a network connection with it. The slave node opens a socket (for example, on port 51234) dedicated to receiving the replication commands sent by the master node.
- Send ping command: After the connection is established, the slave node sends a ping request as the first communication. The ping checks whether the network socket between master and slave is usable and whether the master node can currently accept commands.
- Authentication: If the master node sets the requirepass parameter, password verification is required, and the slave node must set masterauth to the same password to pass. If verification fails, replication terminates and the slave node restarts the replication process.
- Synchronize data sets: Once the replication connection is working, the master node sends all the data it holds to the slave node on first replication. This is the longest step. When master and slave first connect, full synchronization is performed; afterwards, synchronization is incremental. The slave can initiate full synchronization at any time. Redis always tries incremental synchronization first and falls back to full synchronization if that fails.
- Continuous command replication: Once the master node has synchronized its current data to the slave node, replication setup is complete. From then on, the master node keeps sending write commands to the slave nodes to keep master and slave data consistent.
Sentinel monitoring
With plain master-slave replication, when the master goes down a new master node has to be selected and switched to manually.
master-slave switching
When the master server goes down, a slave server must be promoted to master. Doing this by hand is troublesome and leaves the service unavailable for a period of time. This is why sentinel mode exists.
sentinel mode
Sentinel mode is a special mode. Redis provides the sentinel command, and a sentinel runs as an independent process. It monitors multiple running Redis instances by sending them commands and waiting for the Redis servers to respond.
Sentinel role
- Cluster monitoring: responsible for monitoring whether the redis master and slave processes are working normally
- Message notification: If a Redis instance fails, the sentinel sends an alarm notification to the administrator
- Failover: If the master node goes down, service is automatically transferred to a slave node
- Configuration center: If a failover occurs, clients are notified of the new master address
working principle
monitoring stage
- Sentinel 1 sends info to the master and the slaves to obtain their full state.
- Sentinel 2 sends info to the master, learns from the reply that sentinel 1 already exists, and connects to the slaves.
- Sentinel 2 subscribes to sentinel 1.
notification phase
Sentinel continuously initiates notifications to the master and slave to collect information.
failover phase
If a notification sent by a sentinel during the notification phase gets no response from the master, that sentinel marks the master as SRI_S_DOWN (subjectively down) and sends the master's status to the other sentinels. The other sentinels do not take this on trust: each checks the master itself and shares its result with the rest. When more than half of the sentinels consider the master down, they mark it as SRI_O_DOWN (objectively down).
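The marking logic above can be modeled in a few lines. This is a minimal sketch, not Sentinel's actual code: the class name, the boolean array of per-sentinel reports, and the `quorum` parameter are assumptions made for illustration. A master flagged by one sentinel is only subjectively down; it counts as objectively down once enough sentinels agree.

```java
final class SentinelQuorum {
    // Counts how many sentinels currently report the master as down.
    static int countDownReports(boolean[] reports) {
        int count = 0;
        for (boolean down : reports) {
            if (down) {
                count++;
            }
        }
        return count;
    }

    // The master is treated as objectively down once at least
    // `quorum` sentinels (hypothetical parameter) report it as down.
    static boolean isObjectivelyDown(boolean[] reports, int quorum) {
        return countDownReports(reports) >= quorum;
    }
}
```

With three sentinels and a majority rule, `quorum` would be 2, so a single sentinel losing contact with the master is not enough to trigger a failover.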
voting method
Each sentinel votes for the first candidate whose election request it receives.
failover
- The master and slave nodes in a sentinel system are no different from ordinary master and slave nodes. Fault detection and failover are controlled and carried out by the sentinels.
- Sentinel nodes are essentially Redis nodes.
- Each sentinel node only needs to be configured with the master node to monitor; it then discovers the other sentinel nodes and the slave nodes automatically.
- During sentinel startup and during failover, the configuration files of the nodes are rewritten.
Sentinel mode disadvantages
- When the master goes down, the sentinels elect a new master. Redis cannot be accessed during the election, so there is a momentary interruption.
- In sentinel mode only the master node accepts writes; the slave nodes can only serve reads. Although a single Redis node supports up to about 100,000 QPS, all write pressure lands on the master during peak periods such as holidays.
- The memory of a single Redis node cannot be set too large. With too much data, master-slave synchronization becomes very slow and node startup takes a long time.
Cluster mode
Redis can be deployed in three clustering modes: master-slave, Sentinel, and Cluster. Redis Cluster is a distributed service cluster composed of multiple master-slave node groups. It provides replication, high availability, and sharding.
advantage
- There are multiple masters, which reduces the impact of momentary interruptions
- There are multiple masters, which provides higher concurrency
- Data is sharded across nodes, so more data can be stored
Redis Cluster requires at least three master nodes
principle
Redis Cluster divides all data into 16384 hash slots, and each node is responsible for a subset of them. Every node stores the slot information. Only master nodes are assigned slots; slave nodes are not.
Slot positioning algorithm: k1=127001
By default, Cluster hashes the key with the CRC16 algorithm to get an integer, then takes that integer modulo 16384 to get the slot.
HASH_SLOT = CRC16(key)%16384
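The formula above can be sketched in Java as follows. The class and method names are illustrative assumptions; the CRC16 variant is the XMODEM one (polynomial 0x1021, initial value 0), which is the variant Redis Cluster uses. The sketch hashes the whole key and deliberately leaves out hash tags (the `{...}` syntax), which the text does not cover.

```java
import java.nio.charset.StandardCharsets;

final class HashSlot {
    static final int SLOT_COUNT = 16384;

    // CRC16 (XMODEM): polynomial 0x1021, initial value 0, no bit reflection.
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                if ((crc & 0x8000) != 0) {
                    crc = (crc << 1) ^ 0x1021;
                } else {
                    crc <<= 1;
                }
                crc &= 0xFFFF; // keep the value within 16 bits
            }
        }
        return crc;
    }

    // HASH_SLOT = CRC16(key) % 16384
    static int slotFor(String key) {
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % SLOT_COUNT;
    }
}
```

For the standard CRC check string "123456789" the CRC16 value is 0x31C3, so `HashSlot.slotFor("123456789")` returns 12739.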
Redis split-brain
Split brain in a Redis deployment means that, because of a network problem, the Redis master node ends up in a different network partition from the Redis slave nodes and the Sentinel cluster. Since the Sentinel cluster can no longer see the master, it promotes a slave node to master.
At this point there are two different master nodes, like one brain split into two. If clients keep writing to the original master, the new master cannot receive that data. When the network problem is resolved, the Sentinel cluster demotes the original master to a slave, which then synchronizes data from the new master, so the writes it accepted during the split are lost.
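# redis.conf
min-replicas-to-write 1
min-replicas-max-lag 5
# The first parameter requires at least 1 connected slave node
# The second parameter requires that replication and synchronization lag not exceed 5 seconds
# With both parameters configured, the original master rejects client writes during a split brain, which avoids losing a large amount of data.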
Redis cache warm up
A newly started system has no cached data, and both system performance and database load suffer while the cache is being rebuilt. It is therefore best to load hotspot data into the cache before the system goes online. This preloading is called cache warming. Cache warming prevents the outage that can follow when the database is exposed to traffic with no cache in front of it.
cache cold start
On a cold start the cache is empty. If the service is opened to external traffic directly in that state, MySQL may go down once concurrency rises.
Solution
- Load some data into Redis in advance, then start serving.
- If the data set is very large, do not write all of it to Redis: loading it would take a long time, and Redis cannot hold that much data.
- Track the hot data with high access frequency in real time, based on the actual access pattern of the day, and write that hot data into Redis.
- When there is a lot of hot data, use multiple services to read and write it in parallel, so that cache warming is parallel and distributed.
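Selecting the hot data to preload, as described in the list above, can be sketched as a top-N pick over access counts. This is a minimal sketch under stated assumptions: the `HotKeys` class is hypothetical, and the access counts are assumed to have been gathered already (for example from access logs), which is outside what the code shows.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class HotKeys {
    // Returns the n most frequently accessed keys, most frequent first.
    // Ties are broken by key so the result is deterministic.
    static List<String> topN(Map<String, Long> accessCounts, int n) {
        return accessCounts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())
                        .thenComparing(Map.Entry.comparingByKey()))
                .limit(n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

The returned keys are the ones a warm-up job would load from the database and write to Redis before the system takes traffic.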
cache penetration
Cache penetration refers to requests for data that exists in neither the cache nor the database, which users nevertheless keep sending, such as data with an id of "-1" or with an extremely large id. Such requests are likely to come from an attacker, and the attack puts excessive pressure on the database.
**Operation process:** The data the user asks for does not exist in the database, so it is never cached. Every query therefore misses the cache, goes to the database, and returns null (two useless lookups per request). The requests bypass the cache and hit the database directly (a cache hit rate problem).
Solution
- Cache empty values: If the query returns an empty result (whether or not the data exists), cache the empty result anyway, with a short expiration time of no more than 5 minutes.
- Bloom filter: The usual way to check whether an element is in a set is to store every element and compare against them. A Bloom filter answers the same question with a bit array, so requests for keys that are definitely absent can be rejected before they reach the database.
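The "cache empty values" approach above can be sketched with an in-memory map standing in for Redis. Everything named here is an assumption for illustration: the `NullCachingLoader` class, the `<null>` marker string, the 30-minute TTL for real values, and the `db` function that returns null for a missing row. Only the 5-minute cap for empty results comes from the text.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class NullCachingLoader {
    static final long EMPTY_TTL_MS = 5 * 60 * 1000L;   // empty results: at most 5 minutes
    static final long VALUE_TTL_MS = 30 * 60 * 1000L;  // real values: assumed TTL
    private static final String EMPTY_MARKER = "<null>";

    private final Map<String, String> cache = new HashMap<>();   // stands in for Redis
    private final Map<String, Long> expiresAt = new HashMap<>();
    private final Function<String, String> db;  // returns null when the row does not exist
    int dbCalls = 0;                             // exposed so the effect is observable

    NullCachingLoader(Function<String, String> db) {
        this.db = db;
    }

    String get(String key, long nowMillis) {
        Long deadline = expiresAt.get(key);
        if (deadline != null && deadline > nowMillis) {
            String cached = cache.get(key);
            // A cached marker means "known to be missing": skip the database.
            return EMPTY_MARKER.equals(cached) ? null : cached;
        }
        dbCalls++;
        String value = db.apply(key);
        boolean missing = (value == null);
        cache.put(key, missing ? EMPTY_MARKER : value);
        expiresAt.put(key, nowMillis + (missing ? EMPTY_TTL_MS : VALUE_TTL_MS));
        return value;
    }
}
```

Repeated lookups of a nonexistent id within the empty-value TTL are answered from the cache, so only the first one reaches the database.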
bloom filter
A Bloom filter is a probabilistic data structure characterized by efficient insertion and query. Its underlying storage is a bit array. It can report false positives (an element reported as present that was never added), but never false negatives.
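<!-- Add the hutool dependency -->
<dependency>
    <groupId>cn.hutool</groupId>
    <artifactId>hutool-all</artifactId>
    <version>5.7.17</version>
</dependency>
// Java implementation
import cn.hutool.bloomfilter.BitMapBloomFilter;

// Initialization: the constructor argument 10 determines the size of the Bloom filter's BitMap
BitMapBloomFilter filter = new BitMapBloomFilter(10);
filter.add("123");
filter.add("abc");
filter.add("ddd");
// Check whether "abc" may be present
boolean abc = filter.contains("abc");
System.out.println(abc);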
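To show what the bit array underneath looks like, here is a minimal sketch of a Bloom filter built on `java.util.BitSet`. It is not hutool's implementation: the class name, the double-hashing scheme based on `String.hashCode()`, and the constructor parameters are all assumptions chosen to keep the example short.

```java
import java.util.BitSet;

final class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;       // number of bits in the array
    private final int hashCount;  // number of bit positions set per element

    SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Derives the i-th bit position from two base hashes (double hashing).
    private int position(String item, int i) {
        int h1 = item.hashCode();
        int h2 = (h1 >>> 16) | 1; // forced odd so successive positions differ
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String item) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(position(item, i));
        }
    }

    // false means definitely absent; true means possibly present.
    boolean mightContain(String item) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(item, i))) {
                return false;
            }
        }
        return true;
    }
}
```

An added element always tests as possibly present, while a never-added element usually tests as absent. The occasional false positive is the price of not storing the elements themselves.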
cache breakdown
Cache breakdown concerns a single hot key: when its cache entry expires, a large number of requests arrive at the same moment. Because the entry has expired, all of them go to the database, causing a sudden spike in database requests and load (the data is not in the cache but is in the database).
solution
- Mutex lock: Among the concurrent requests, only the first thread obtains the lock and queries the database. The other threads fail to obtain the lock and wait. After the first thread writes the data into the cache, the other threads read it from the cache directly.
- Hot data does not expire: set the cache entry to never expire, and have a scheduled task load the data asynchronously and refresh the cache.
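import redis.clients.jedis.params.SetParams;

public String get(String key) throws InterruptedException {
    String value = jedis.get(key);
    if (value != null) {
        return value; // cache hit
    }
    // Cache miss or expired entry: try to take a per-key mutex.
    // NX and EX are set in one atomic command, so the lock expires after
    // 3 minutes even if the delete below fails, and the next miss can still load from the DB.
    String lockKey = key + "mutex";
    String locked = jedis.set(lockKey, "1", SetParams.setParams().nx().ex(3 * 60));
    if ("OK".equals(locked)) {
        try {
            // Database query; db is a placeholder for your data-access object
            value = db.get(key);
            if (value != null) {
                jedis.setex(key, 3 * 60, value); // save to cache
            }
        } finally {
            jedis.del(lockKey);
        }
        return value;
    }
    // Another thread holds the lock and is loading from the DB and filling the cache.
    // Wait briefly, then read the cache again.
    Thread.sleep(50);
    return get(key);
}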
cache avalanche
Cache avalanche means that many cache entries were given the same expiration time, so they all expire at the same moment. All requests are then forwarded to the DB, and the sudden load is heavy enough to bring it down like an avalanche.
[Figure: data is read from Redis normally]
[Figure: cache invalidation]
Solution
- Spread out the expiration times: add a random offset to each cache expiration time so that keys expire at different moments rather than all at once.
- Hot data does not expire: This is the same approach as for cache breakdown. The refresh interval and the handling of database exceptions also need to be considered.
- Mutex lock: This is the same approach as for cache breakdown, with locking per key. For a given key, only one thread is allowed to compute the result and cache it; the other threads wait and then read the cached result directly.
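public Object GetProductListNew(String cacheKey) {
    int cacheTime = 30; // cache TTL in seconds
    // Read the cached value for the key
    String cacheValue = jedis.get(cacheKey);
    // Cache still valid: return it
    if (cacheValue != null) {
        return cacheValue;
    }
    // Lock per key. intern() makes equal strings share one monitor inside this JVM;
    // locking on the parameter itself would not exclude other String instances.
    synchronized (cacheKey.intern()) {
        // Re-check: another thread may have filled the cache while we waited
        cacheValue = jedis.get(cacheKey);
        if (cacheValue != null) {
            return cacheValue;
        }
        // A SQL query normally goes here, e.g. cacheValue = db.get(cacheKey);
        cacheValue = ""; // placeholder for the query result
        // Add to cache with the TTL
        jedis.setex(cacheKey, cacheTime, cacheValue);
    }
    return cacheValue;
}
// Locking and queuing only relieves pressure on the database; it does not improve system throughput.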
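The first solution in the list above, adding a random offset to the expiration time, can be sketched as a small helper. The class name, method name, and the example numbers are assumptions for illustration.

```java
import java.util.concurrent.ThreadLocalRandom;

final class TtlJitter {
    // Returns baseSeconds plus a random offset in [0, maxJitterSeconds],
    // so keys written at the same moment do not all expire together.
    static int withJitter(int baseSeconds, int maxJitterSeconds) {
        return baseSeconds + ThreadLocalRandom.current().nextInt(maxJitterSeconds + 1);
    }
}
```

A caller would pass the result as the TTL when writing a key, for example `jedis.setex(key, TtlJitter.withJitter(1800, 300), value)`, which gives each key a lifetime between 30 and 35 minutes.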