Redis_NoSQL entry study notes

Introduction to NoSQL

NoSQL stands for "Not Only SQL" and refers to non-relational databases. These data stores do not require a fixed schema and can be scaled out horizontally without extra operations.

Features:

  1. Easy to expand: There are many types of NoSQL databases, but a common feature is that they drop the relational features of relational databases. Because there are no relationships between data items, the data is very easy to partition and scale out, which brings scalability at the architectural level.

  2. Large data volume and high performance: NoSQL databases have very high read and write performance, especially with large data volumes. This comes from their non-relational nature and the simple structure of the database. MySQL typically uses a query cache, which is invalidated every time the table is updated. A NoSQL cache is record-level, that is, fine-grained, so NoSQL performs much better at this level.

  3. Diverse and flexible data models: NoSQL does not require fields to be defined in advance for the data to be stored, and can store custom data formats at any time. In a relational database, adding and deleting fields is troublesome, and on a table with a very large amount of data, adding a field is a nightmare.

Four categories of NoSQL

  1. KV (key value)
  2. column store database
  3. document database
  4. graph database

CAP and BASE theory of distributed foundation

Centralized and Distributed

1. Centralized
A centralized system has a central node composed of one or more computers. Data is stored on this central node, all business units of the system are deployed on it, and all functions of the system are handled by it.

2. Distributed
A distributed system is a system in which hardware or software components are spread across different networked computers and communicate and coordinate with each other only through message passing.

Distributed System Design Theory CAP

ACID is the theory of database transaction integrity, CAP is the theory of distributed system design, and BASE is the extension of AP scheme in CAP theory.

CAP principle, namely:

  1. C: Consistency (strong consistency)

    In a distributed system, if all users can read the latest value after the update operation for a data item is successfully executed, then such a system is considered to have strong consistency.

  2. A: Availability

    Availability means that the service provided by the system must always be usable: every user request must return a result within a bounded time. "Bounded time" means the system must return the processing result within a specified limit, and a request that exceeds this limit is treated as if the system were unavailable. "Return a result" means the system must give a definite response, success or failure, after processing the request, rather than an ambiguous one.

  3. P: Partition tolerance
    When a distributed system encounters a network partition failure, it can still provide external services that meet the consistency or availability requirements, unless the entire network environment fails.

Summary

In a distributed environment, P (partition tolerance) must always be chosen: the network itself can never be 100% reliable, so partitions are inevitable. In other words, partition tolerance is one of the most basic requirements of a distributed system.

In CAP theory, consistency, availability, and partition tolerance cannot all be satisfied at the same time, and partition tolerance is a basic requirement. Therefore, when designing the architecture, you can only choose between AP and CP, that is, make a trade-off between consistency and availability.

In addition, the granularity of CAP theory is the data, not the system, that is, for different data of the same system, AP and CP may be used respectively.
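The CP/AP trade-off can be made concrete with a toy model. The following is a minimal sketch, not a real database: the `TwoNodeStore` and `Replica` classes and the `partitioned` flag are hypothetical names invented for this illustration.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}


class TwoNodeStore:
    """Hypothetical two-replica store, used only to illustrate CP versus AP."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False  # True while a and b cannot reach each other

    def write(self, key, value):
        """Write through replica a; return True if the write was accepted."""
        if self.partitioned:
            if self.mode == "CP":
                return False          # refuse: both replicas cannot be kept consistent
            self.a.data[key] = value  # AP: accept locally, replica b becomes stale
            return True
        self.a.data[key] = value
        self.b.data[key] = value
        return True

    def read_from_b(self, key):
        return self.b.data.get(key)


cp = TwoNodeStore("CP")
ap = TwoNodeStore("AP")
for store in (cp, ap):
    store.write("x", 1)
    store.partitioned = True  # simulate a network partition

cp_accepted = cp.write("x", 2)  # CP gives up availability
ap_accepted = ap.write("x", 2)  # AP gives up consistency
print(cp_accepted, cp.a.data["x"], cp.read_from_b("x"))
print(ap_accepted, ap.a.data["x"], ap.read_from_b("x"))
```

In the CP store the write is rejected and both replicas still agree; in the AP store the write succeeds but a read from replica b returns the old value.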

Reference blog: 1. CAP and BASE theory of distributed foundation

BASE theory

BASE theory is the result of the trade-off between consistency and availability in CAP theory. It describes a distributed system by three characteristics:

  1. BA: Basically Available
    Basically available means that when an unpredictable failure occurs in a distributed system, part of the availability may be lost, but the system as a whole does not become unavailable; it is enough to keep the core services running. For example, when some services fail in practice, we can let the response time grow, apply rate limiting to reduce load, or even degrade a service so that it temporarily stops serving requests.

  2. S: Soft State
    Soft state means that data in the system may exist in an intermediate state, and that this intermediate state does not affect the availability of the system. In other words, data synchronization between some nodes is allowed to lag, as long as the lag does not affect availability.

  3. E: Eventual Consistency
    Eventual consistency emphasizes that all replicas of the data reach a consistent state after a period of synchronization. The system only needs to guarantee that the data becomes consistent eventually, not that it is strongly consistent at all times. It must also ensure that the window of inconsistency does not damage the system's data.

Reference blog: [Technical Miscellany] How to correctly understand the CAP and BASE theories?

Redis

Introduction

Redis (REmote DIctionary Server) is completely open source and free, written in C, and originally released under the BSD license. It is a high-performance, distributed, in-memory key/value database that also supports persistence. It is one of the most popular NoSQL databases and is also known as a data structure server.

common commands

switch database

select db_number

View the number of current database keys

DBSIZE

List all keys in the current database

keys *

View the type of the specified key

type key

Delete all keys in the library

FLUSHDB

Delete the specified key in the library

DEL key

Set an expiration time for a key; once the key expires, it is deleted.

EXPIRE key seconds

TTL (time to live): check the remaining lifetime of a key. Returns -2 if the key has expired (it no longer exists) and -1 if no expiration time is set.

TTL key
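The return values of TTL can be sketched with a small in-memory model. This is an illustration under assumptions, not how Redis is implemented: the `ExpiringStore` class is hypothetical, and the current time is passed in as `now` so that the example is deterministic.

```python
class ExpiringStore:
    """Toy model of EXPIRE/TTL; `now` is supplied by the caller, in seconds."""

    def __init__(self):
        self.values = {}
        self.expire_at = {}

    def _purge(self, key, now):
        # Drop the key if its deadline has passed.
        deadline = self.expire_at.get(key)
        if deadline is not None and now >= deadline:
            del self.values[key]
            del self.expire_at[key]

    def set(self, key, value):
        self.values[key] = value
        self.expire_at.pop(key, None)  # a plain SET removes any previous expiration

    def expire(self, key, seconds, now):
        self._purge(key, now)
        if key not in self.values:
            return 0
        self.expire_at[key] = now + seconds
        return 1

    def ttl(self, key, now):
        self._purge(key, now)
        if key not in self.values:
            return -2  # key does not exist (never set, deleted, or expired)
        if key not in self.expire_at:
            return -1  # key exists but has no expiration
        return self.expire_at[key] - now


store = ExpiringStore()
store.set("name", "cy")
ttl_without_expiry = store.ttl("name", now=0)   # -1
store.expire("name", 10, now=0)
ttl_midway = store.ttl("name", now=4)           # 6
ttl_after_expiry = store.ttl("name", now=10)    # -2
ttl_missing = store.ttl("other", now=0)         # -2
```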

Five Data Types

One, String (string)

String is the most basic type in Redis.

The string type is binary safe, which means a Redis string can contain any kind of data, such as a JPG image or a serialized object.

Some commonly used commands:

SET name cy
GET name
SET a 1
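INCR a # add 1
INCRBY a step # add step
GETRANGE str L R # return the substring of str from index L to index R (both inclusive)
SETEX str time value # set the value together with an expiration time (seconds)
MSET str1 v1 str2 v2 # set several strings at once
MGET str1 str2 # get the values of several strings at once
SETNX a 1 # set only if the key does not exist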
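GETRANGE differs from Python slicing in that both indexes are inclusive and negative indexes count from the end. The helper below is a sketch of that index handling; the function name `getrange` is made up for this example, and it works on Python characters, whereas Redis operates on bytes.

```python
def getrange(value: str, start: int, end: int) -> str:
    """Return value[start..end] with both ends inclusive, GETRANGE-style."""
    n = len(value)
    if start < 0 and end < 0 and start > end:
        return ""
    if start < 0:
        start = max(n + start, 0)  # negative index counts from the end
    if end < 0:
        end = max(n + end, 0)
    end = min(end, n - 1)          # out-of-range end is clamped
    if n == 0 or start > end:
        return ""
    return value[start:end + 1]


text = "hello world"
first_word = getrange(text, 0, 4)    # "hello"
last_word = getrange(text, -5, -1)   # "world"
whole = getrange(text, 0, -1)        # "hello world"
clamped = getrange(text, 6, 100)     # "world"
```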

Two, List (list)

Redis lists are simply lists of strings, sorted by insertion order. You can add an element to the head (left) or tail (right) of the list.

Its underlying implementation is a linked list.

Some commonly used commands:

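LPUSH listname v1 v2 v3 # insert the values one by one at the left; create the list if it does not exist
RPUSH listname v1 v2 v3 # insert the values one by one at the right; create the list if it does not exist
LRANGE listname L R # return the values in the given range; L=0 and R=-1 return the whole list
LPOP listname # remove and return the leftmost value
RPOP listname # remove and return the rightmost value
LINDEX listname id # return the element at index id
LLEN listname # return the length of the list

If all values are removed, the corresponding key disappears. The time complexity of each operation matches that of a linked list: pushing and popping at either end is O(1), while accessing an element by index is O(n).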
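Because LPUSH inserts its arguments one by one at the head, the values end up in reverse order. A minimal sketch using `collections.deque` as a stand-in for the list (the helper names `lpush`, `rpush`, and `lrange` are chosen for this example):

```python
from collections import deque


def lpush(lst, *values):
    for v in values:      # each value is pushed to the head in turn
        lst.appendleft(v)
    return len(lst)


def rpush(lst, *values):
    for v in values:
        lst.append(v)
    return len(lst)


def lrange(lst, start, stop):
    """Inclusive range with negative indexes, LRANGE-style."""
    items = list(lst)
    n = len(items)
    if start < 0:
        start = max(start + n, 0)
    if stop < 0:
        stop += n
    if stop < 0 or start > stop:
        return []
    return items[start:stop + 1]


mylist = deque()
length_after_lpush = lpush(mylist, "v1", "v2", "v3")   # 3
after_lpush = lrange(mylist, 0, -1)                    # ["v3", "v2", "v1"]
rpush(mylist, "v4")
after_rpush = lrange(mylist, 0, -1)                    # ["v3", "v2", "v1", "v4"]
left = mylist.popleft()                                # LPOP -> "v3"
right = mylist.pop()                                   # RPOP -> "v4"
```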

Three, Set (collection)

Redis's Set is an unordered collection of string types.

The collection is implemented through a hash table, so the complexity of adding, deleting, and searching is O(1).

Some commonly used commands:

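SADD st 1 2 3 3 # insert values into the set; create the set if it does not exist
SMEMBERS key # list all values in the set
SCARD key # return the number of elements in the set
SREM key value # remove the given value from the set
SDIFF key1 key2 # difference
SINTER key1 key2 # intersection
SUNION key1 key2 # union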
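These commands map closely onto Python's built-in `set`, which is also hash-based. A small sketch (the `sadd` helper is a name chosen for this example):

```python
def sadd(s, *members):
    """Add members and return how many were actually new, like SADD."""
    before = len(s)
    s.update(members)
    return len(s) - before


st = set()
added = sadd(st, "1", "2", "3", "3")   # the duplicate "3" is stored once
cardinality = len(st)                  # SCARD

key1 = {"a", "b", "c"}
key2 = {"b", "c", "d"}
difference = key1 - key2     # SDIFF key1 key2: members of key1 that are not in key2
intersection = key1 & key2   # SINTER key1 key2
union = key1 | key2          # SUNION key1 key2
```

Note that the difference is not symmetric: SDIFF key2 key1 would give the members of key2 that are not in key1.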

Four, Hash (hash)
A Redis hash is a collection of key-value (key=>value) pairs.

It is a mapping table from string fields to string values, which makes it especially suitable for storing objects.

Some commonly used commands:

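HSET hs1 k v # insert a value into the hash; create the hash if it does not exist
HMSET hs1 k1 v1 k2 v2 # insert several values into the hash
HMGET hs1 k1 k2 k3 # get the values of several keys in the hash
HGETALL hs1 # get all keys and values of the hash, returned as k1 v1 k2 v2 ...
HKEYS hs1 # list all keys in the hash
HVALS hs1 # list all values in the hash
HLEN hs1 # return the number of key-value pairs in the hash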
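Storing an object in a hash can be pictured with a Python `dict`. This is only a sketch of the command semantics; the helper names are invented here, and the field order returned by a real HGETALL is not guaranteed to match insertion order as it does in this model.

```python
def hset(h, *pairs):
    """Set field/value pairs and return how many fields were newly created."""
    added = 0
    for field, value in zip(pairs[0::2], pairs[1::2]):
        if field not in h:
            added += 1
        h[field] = value
    return added


def hmget(h, *fields):
    # Missing fields come back as None, the counterpart of a nil reply.
    return [h.get(f) for f in fields]


def hgetall(h):
    flat = []
    for field, value in h.items():
        flat.extend([field, value])
    return flat


user = {}
created = hset(user, "name", "cy", "age", "20")   # 2 new fields
updated = hset(user, "age", "21")                 # 0 new fields, value overwritten
values = hmget(user, "name", "age", "email")      # ["cy", "21", None]
everything = hgetall(user)                        # ["name", "cy", "age", "21"]
field_count = len(user)                           # HLEN
```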

Five, zset (sorted set: ordered set)

Like set, zset is a collection of string elements in which duplicate members are not allowed.
The difference is that each element is associated with a score of type double, and Redis uses the scores to sort the members of the set in ascending order.

The members of zset are unique, but the score (score) can be repeated.

Some commonly used commands:

ZADD zst1 score1 v1 score2 v2 score3 v3 # insert values into the zset; create it if it does not exist
ZRANGE zst1 L R # return the L-th to R-th values in ascending order of score
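The ordering rule can be sketched with a dictionary from member to score that is sorted on demand. This is a simplification for illustration (the helper names are made up, and real sorted sets keep their order incrementally instead of re-sorting); members with equal scores are ordered lexicographically.

```python
def zadd(z, *score_member_pairs):
    """Add or update members; return how many members were newly added."""
    added = 0
    scores = score_member_pairs[0::2]
    members = score_member_pairs[1::2]
    for score, member in zip(scores, members):
        if member not in z:
            added += 1
        z[member] = float(score)   # an existing member just gets a new score
    return added


def zrange(z, start, stop):
    """Members from rank start to rank stop (inclusive), ascending by score."""
    ordered = sorted(z, key=lambda m: (z[m], m))
    n = len(ordered)
    if start < 0:
        start = max(start + n, 0)
    if stop < 0:
        stop += n
    if stop < 0 or start > stop:
        return []
    return ordered[start:stop + 1]


zst = {}
added = zadd(zst, 90, "alice", 75, "bob", 90, "carol")   # 3
initial_order = zrange(zst, 0, -1)    # ["bob", "alice", "carol"]
re_added = zadd(zst, 60, "alice")     # 0: the member is unique, only its score changes
new_order = zrange(zst, 0, -1)        # ["alice", "bob", "carol"]
lowest_two = zrange(zst, 0, 1)        # ["alice", "bob"]
```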

RDB persistence

In the words of the official documentation, RDB persistence performs point-in-time snapshots of your data set at specified intervals. A snapshot is a memory dump of the data in the Redis database, stored as a binary file (default name: dump.rdb, configurable) that contains all the data in the database at the time the file was generated. It can be used for backup, transfer, and recovery of Redis data.

configuration parameters

The triggering method and running behavior of RDB snapshots are affected by configuration parameters. Open the configuration file redis.conf (windows version is in redis.windows.conf) and view the "SNAPSHOTTING" chapter to understand the parameters and functions of RDB snapshots.

  1. save <seconds> <changes>
    The save parameter defines when Redis triggers an automatic snapshot: seconds is the length of the time window (in seconds), and changes is the number of writes within that window. save m n means that a snapshot is taken when at least n writes have occurred within m seconds. Multiple save lines can be configured to cover different conditions. To turn off automatic RDB snapshots, use save "". Some example configurations:

    save 900 1: take a snapshot if at least 1 key changed within 900 seconds (15 minutes)
    
    save 300 10: take a snapshot if at least 10 keys changed within 300 seconds (5 minutes)
    
    save 60 10000: take a snapshot if at least 10000 keys changed within 60 seconds (1 minute)
    
    save "": disable RDB persistence
    
  2. dbfilename: The name of the snapshot file, the default is dump.rdb.

  3. dir: The directory in which the snapshot file is saved. The default is ./, that is, the working directory from which the Redis server was started.

  4. stop-writes-on-bgsave-error: The default value is yes. When RDB is enabled and the latest background save failed, Redis stops accepting writes. This makes the user aware that data is not being persisted to disk properly; otherwise nobody might notice that a disaster has happened. Once a background save succeeds again, Redis automatically accepts writes again.

  5. rdbcompression: The default value is yes. Whether to use the LZF algorithm to compress string objects when backing up data. It is enabled by default, which saves storage space, but consumes part of the CPU when generating backup files.

  6. rdbchecksum: Whether to use a CRC64 checksum when saving and loading RDB files; enabled by default. The checksum makes RDB files more resistant to corruption, at a performance cost of about 10% when saving and loading. If the RDB file was created without a checksum, the checksum field is set to 0, which tells Redis to skip the check.
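How several save rules combine can be sketched as a simple predicate: a snapshot is due as soon as any one rule is satisfied. The function and parameter names below are assumptions made for this illustration, not Redis identifiers.

```python
# (seconds, changes) pairs matching: save 900 1 / save 300 10 / save 60 10000
SAVE_RULES = [(900, 1), (300, 10), (60, 10000)]


def should_snapshot(seconds_since_last_save, changes_since_last_save, rules=SAVE_RULES):
    """Return True if at least one save rule is satisfied."""
    return any(
        seconds_since_last_save > seconds and changes_since_last_save >= changes
        for seconds, changes in rules
    )


busy = should_snapshot(61, 10000)           # True: the 60/10000 rule fires
quiet_early = should_snapshot(120, 50)      # False: too few writes for 60s, too early for 300s
quiet_later = should_snapshot(301, 50)      # True: the 300/10 rule fires
idle = should_snapshot(901, 0)              # False: nothing changed, nothing to save
disabled = should_snapshot(901, 5, rules=[])  # False: save "" leaves no rules
```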

Persistence process

Redis has two functions that perform RDB persistence: rdbSave and rdbSaveBackground (in the source file rdb.c). The differences between the two are, briefly:

rdbSave: Executed synchronously; persistence starts as soon as the function is called. Because Redis uses a single-threaded model, the server is blocked during persistence and cannot serve clients.
rdbSaveBackground: Executed in the background (asynchronously). The function forks a child process, the actual persistence work runs in the child, and the main process continues to serve clients.
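The fork-based approach can be tried out in a few lines. The sketch below assumes a Unix-like system (os.fork is not available on Windows) and uses JSON as a stand-in for the RDB format; `background_save` is a hypothetical helper, not a Redis function. The child process sees the data as it was at the moment of the fork, so writes made by the parent afterwards do not leak into the snapshot.

```python
import json
import os
import tempfile


def background_save(data, path):
    """Fork a child that dumps `data` to `path`; return the child's pid."""
    pid = os.fork()
    if pid == 0:
        # Child: works on its own copy of the parent's memory.
        try:
            with open(path, "w") as f:
                json.dump(data, f)
        finally:
            os._exit(0)  # leave immediately, without running parent cleanup code
    return pid           # parent: continue serving right away


data = {"a": 1}
path = os.path.join(tempfile.mkdtemp(), "dump.json")

pid = background_save(data, path)
data["a"] = 2            # the parent keeps accepting writes during the save
os.waitpid(pid, 0)       # wait here only so the example can inspect the file

with open(path) as f:
    snapshot = json.load(f)
print(snapshot, data)
```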

Reference: https://segmentfault.com/a/1190000039208707

Persistence of AOF

AOF is not enabled by default in Redis. To enable it, open the configuration file redis.conf and set the parameter: appendonly yes

AOF ( append only file ) persistence records each write command in an independent log, and re-executes the commands in the AOF file when Redis restarts to achieve the purpose of data recovery. The main function of AOF is to solve the real-time nature of data persistence.


As shown in the figure above, the implementation of AOF persistence can be divided into command append (append), file writing (write), file synchronization (sync), file rewriting (rewrite), and loading on restart (load). The process is as follows:

  • All write commands will be appended to the AOF buffer.
  • The AOF buffer is synchronized to the hard disk according to the corresponding strategy.
  • As the AOF file becomes larger and larger, it is necessary to rewrite the AOF file periodically to achieve the purpose of compression.
  • When Redis restarts, the AOF file can be loaded for data recovery.
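The append-then-replay idea behind the steps above can be sketched as follows. `TinyAof` is a hypothetical toy that understands only three commands, and a Python list stands in for the AOF file.

```python
class TinyAof:
    """Toy key-value store that logs every write command and can replay the log."""

    def __init__(self):
        self.data = {}
        self.log = []  # stands in for the AOF file on disk

    @staticmethod
    def _apply(data, command):
        name, key, *args = command
        if name == "SET":
            data[key] = args[0]
        elif name == "INCRBY":
            data[key] = str(int(data.get(key, "0")) + int(args[0]))
        elif name == "DEL":
            data.pop(key, None)
        else:
            raise ValueError(f"unsupported command: {name}")

    def execute(self, *command):
        self._apply(self.data, command)  # run the command first
        self.log.append(command)         # then append it to the log

    def replay(self):
        """Rebuild the data set from the log, as done when loading on restart."""
        restored = {}
        for command in self.log:
            self._apply(restored, command)
        return restored


db = TinyAof()
db.execute("SET", "a", "1")
db.execute("INCRBY", "a", "3")
db.execute("SET", "b", "x")
db.execute("DEL", "b")

restored = db.replay()
print(restored)  # same content as db.data
```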

File writing and synchronization

Each time before Redis finishes an event loop iteration, it calls the flushAppendOnlyFile function to decide whether the contents of the AOF buffer need to be written and synchronized to the AOF file.

The behavior of flushAppendOnlyFile is determined by the appendfsync option in redis.conf. This option has three possible values: always, everysec, and no.

  1. always: In every event loop iteration, Redis writes the whole AOF buffer to the AOF file and synchronizes the file to disk. This makes always the slowest of the three values, but also the safest: on a crash, AOF persistence loses only the commands from a single event loop iteration.

  2. everysec: In every event loop iteration, Redis writes the whole AOF buffer to the AOF file, and a background thread synchronizes the file once per second. This mode is fast enough in practice, and a crash loses only about one second of commands.

  3. no: In every event loop iteration, Redis writes the whole AOF buffer to the AOF file, but when the file is synchronized is left to the operating system. This mode is the fastest, but the synchronization interval is longer, so more data may be lost in the event of a failure.
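The difference between writing and synchronizing can be sketched with os.write and os.fsync. The `AofWriter` class is a hypothetical model: the clock is passed in as `now` to keep the example deterministic, and for everysec the fsync is done inline here, whereas the text above describes a background thread.

```python
import os
import tempfile


class AofWriter:
    """Toy model of the three appendfsync policies."""

    def __init__(self, path, appendfsync="everysec"):
        if appendfsync not in ("always", "everysec", "no"):
            raise ValueError("unknown appendfsync value")
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.policy = appendfsync
        self.buffer = []
        self.unsynced = False
        self.last_fsync = 0.0
        self.fsync_calls = 0

    def append(self, line):
        self.buffer.append(line)  # command append: goes to the in-memory buffer

    def flush(self, now):
        """Called once per event loop iteration."""
        if self.buffer:
            os.write(self.fd, "".join(self.buffer).encode())  # write: to the OS cache
            self.buffer.clear()
            self.unsynced = True
        if not self.unsynced:
            return
        due = self.policy == "always" or (
            self.policy == "everysec" and now - self.last_fsync >= 1.0
        )
        if due:
            os.fsync(self.fd)     # sync: force the data to disk
            self.last_fsync = now
            self.unsynced = False
            self.fsync_calls += 1

    def close(self):
        os.close(self.fd)


def count_fsyncs(policy):
    path = os.path.join(tempfile.mkdtemp(), "appendonly.aof")
    writer = AofWriter(path, policy)
    for i, now in enumerate([0.1, 0.2, 0.3, 1.2]):  # four loop iterations
        writer.append(f"SET k{i} v{i}\n")
        writer.flush(now)
    writer.close()
    return writer.fsync_calls


fsyncs = {policy: count_fsyncs(policy) for policy in ("always", "everysec", "no")}
print(fsyncs)
```

With four iterations spread over 1.2 seconds, always synchronizes on every iteration, everysec once, and no never calls fsync itself.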

Rewriting of AOF files

Because AOF keeps appending to a log file, the file grows larger and larger. Its size can be reduced by rewriting it into an equivalent, shorter sequence of commands.

In the redis.conf file, the following can be found:

auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

A rewrite is triggered automatically when the AOF file is larger than auto-aof-rewrite-min-size and has grown by at least auto-aof-rewrite-percentage (100% here, that is, doubled) relative to its size after the latest rewrite.
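The trigger condition can be written as a small predicate. This is a sketch under the two settings shown above; the function and parameter names are chosen for this example.

```python
MB = 1024 * 1024


def should_rewrite(current_size, base_size, percentage=100, min_size=64 * MB):
    """base_size is the AOF size right after the latest rewrite (or at startup)."""
    if percentage == 0 or current_size <= min_size:
        return False  # automatic rewrite disabled, or the file is still small
    base = base_size if base_size else 1
    growth = current_size * 100 // base - 100  # growth over the base, in percent
    return growth >= percentage


doubled = should_rewrite(80 * MB, 40 * MB)        # True: grew by exactly 100%
grown_a_bit = should_rewrite(70 * MB, 40 * MB)    # False: grew by only 75%
small_file = should_rewrite(50 * MB, 10 * MB)     # False: below the 64mb minimum
```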

Note: The AOF rewrite works from the current contents of the database. The program does not read, write, or otherwise use the existing AOF file.
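Rewriting from the current state instead of the old log can be sketched like this. The `rewrite_commands` helper is hypothetical and handles only strings and lists.

```python
def rewrite_commands(data):
    """Emit one command per key that recreates the current value."""
    commands = []
    for key, value in data.items():
        if isinstance(value, list):
            commands.append(("RPUSH", key, *value))
        else:
            commands.append(("SET", key, value))
    return commands


# Six commands were needed to reach this state:
#   RPUSH nums 1, RPUSH nums 2, RPUSH nums 3, LPOP nums, SET a 1, SET a 2
current_state = {"nums": ["2", "3"], "a": "2"}

rewritten = rewrite_commands(current_state)
print(rewritten)  # [('RPUSH', 'nums', '2', '3'), ('SET', 'a', '2')]
```

The rewritten log needs two commands where the history needed six, and it was produced without looking at the old log at all.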

References: AOF

Transactions

Redis transaction commands:

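discard # cancel the transaction and drop all commands in the transaction block
exec # execute all commands in the transaction block
multi # mark the start of a transaction block
unwatch # cancel the watch on all keys
watch key # watch one or more keys; if a watched key is modified by another command before the transaction executes, the transaction is aborted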

Five situations:

  1. normal execution

    127.0.0.1:6379> multi
    OK
    127.0.0.1:6379> set a 1
    QUEUED
    127.0.0.1:6379> incrby a 3
    QUEUED
    127.0.0.1:6379> get a
    QUEUED
    127.0.0.1:6379> exec
    1) OK
    2) (integer) 4
    3) "4"
    
  2. Abandon the transaction: discard

    127.0.0.1:6379> multi
    OK
    127.0.0.1:6379> set a 100
    QUEUED
    127.0.0.1:6379> discard
    OK
    
  3. All or nothing: if a command is rejected while it is being queued (for example, an unknown command or a syntax error), the entire transaction is not executed

    127.0.0.1:6379> multi
    OK
    127.0.0.1:6379> set a 10
    QUEUED
    127.0.0.1:6379> ssset b 10
    (error) ERR unknown command 'ssset'
    127.0.0.1:6379> exec
    (error) EXECABORT Transaction discarded because of previous errors.
    127.0.0.1:6379>
    
  4. Each error on its own: if a command fails at run time, only that command fails; the other commands are still executed

    127.0.0.1:6379> multi
    OK
    127.0.0.1:6379> set name mike
    QUEUED
    127.0.0.1:6379> incr name
    QUEUED
    127.0.0.1:6379> set name2 john
    QUEUED
    127.0.0.1:6379> exec
    1) OK
    2) (error) ERR value is not an integer or out of range
    3) OK
    127.0.0.1:6379> get name
    "mike"
    127.0.0.1:6379> get name2
    "john"
    
  5. watch monitoring

    127.0.0.1:6379> set balance 100
    OK
    127.0.0.1:6379> set debt 0
    OK
    127.0.0.1:6379> watch balance debt
    OK
    127.0.0.1:6379> set balance 199999
    OK
    127.0.0.1:6379> multi
    OK
    127.0.0.1:6379> decrby balance 10
    QUEUED
    127.0.0.1:6379> incrby debt 10
    QUEUED
    127.0.0.1:6379> exec
    (nil)
    

summary:

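  1. The watch command works like an optimistic lock: the transaction only runs if none of the watched keys changed in the meantime.
  2. All watches are cleared after exec, whether or not the transaction was executed.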
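The optimistic-locking behaviour of watch can be sketched with a per-key version counter. The `TinyRedis` and `Client` classes and the version counter are assumptions made for this illustration; they do not describe how Redis tracks watched keys internally.

```python
class TinyRedis:
    def __init__(self):
        self.data = {}
        self.versions = {}  # bumped on every write to a key

    def set(self, key, value):
        self.data[key] = value
        self.versions[key] = self.versions.get(key, 0) + 1

    def get(self, key):
        return self.data.get(key)


class Client:
    def __init__(self, server):
        self.server = server
        self.watched = {}
        self.queue = None

    def watch(self, *keys):
        for key in keys:  # remember the version seen at watch time
            self.watched[key] = self.server.versions.get(key, 0)

    def multi(self):
        self.queue = []

    def queue_set(self, key, value):
        self.queue.append((key, value))

    def execute(self):
        queue, self.queue = self.queue, None
        watched, self.watched = self.watched, {}  # exec always clears the watches
        for key, seen in watched.items():
            if self.server.versions.get(key, 0) != seen:
                return None  # a watched key changed: abort, like a nil reply
        for key, value in queue:
            self.server.set(key, value)
        return ["OK"] * len(queue)


server = TinyRedis()
server.set("balance", 100)
alice = Client(server)

alice.watch("balance")
server.set("balance", 199999)  # another client changes the watched key
alice.multi()
alice.queue_set("balance", 90)
first_attempt = alice.execute()  # aborted, balance is untouched

alice.watch("balance")  # retry: watch again and re-read the value
alice.multi()
alice.queue_set("balance", server.get("balance") - 10)
second_attempt = alice.execute()
print(first_attempt, second_attempt, server.get("balance"))
```

The usual pattern with an optimistic lock is the retry shown at the end: when exec is aborted, watch the key again, re-read it, and rebuild the transaction.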

master-slave replication

Establish a master-slave relationship:

slaveof <masterip> <masterport>

View server roles:

info replication

End the connection to the host:

slaveof no one 

By default, a slave is read-only: it serves reads but rejects writes.

replication principle

After successfully connecting to the master, the slave sends a sync command. On receiving it, the master starts a background save process and, in the meantime, collects all the commands it receives that modify the data set. When the background save finishes, the master transfers the entire data file to the slave to complete a full synchronization.

Full replication: After the slave receives the database file, it saves it and loads it into memory.
Incremental replication: The master keeps passing the newly collected modification commands to the slave, one by one, to keep it in sync.
The first connection to the master uses full replication, followed by incremental replication. However, whenever the slave reconnects to the master, a full synchronization (full replication) is performed automatically.
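The two phases can be sketched with a toy master and slave. The class and method names below are invented for this illustration; the network, the RDB file, and failure handling are all left out.

```python
class Slave:
    def __init__(self):
        self.data = {}
        self.full_syncs = 0

    def load_snapshot(self, snapshot):
        self.data = snapshot  # full replication: replace everything
        self.full_syncs += 1

    def apply(self, key, value):
        self.data[key] = value  # incremental replication: one command at a time


class Master:
    def __init__(self):
        self.data = {}
        self.slaves = []

    def attach(self, slave):
        slave.load_snapshot(dict(self.data))  # send a copy of the whole data set
        self.slaves.append(slave)

    def write(self, key, value):
        self.data[key] = value
        for slave in self.slaves:  # propagate each new write command
            slave.apply(key, value)


master = Master()
master.write("a", "1")  # written before the slave connects

slave = Slave()
master.attach(slave)    # first connection: full synchronization
master.write("b", "2")  # afterwards: incremental propagation

print(slave.data, slave.full_syncs)
```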

sentinel mode

As the name suggests, a sentinel stands guard over the Redis cluster and reacts when it detects a problem. Its functions include:

  1. Monitoring whether the master and slaves are running normally
  2. Automatically promoting a slave to master when the master fails

Reference materials: Master the three cluster modes of Redis master-slave replication, sentinel and Cluster in one article

Origin blog.csdn.net/hesorchen/article/details/122805518