Summary of Knowledge System (7) MySQL and Redis


MySQL

Basic Concepts

1. What are the three paradigms of the database?

  • The first normal form: Emphasizes the atomicity of the column, that is, each column of the database table is an indivisible atomic data item.
  • Second normal form: The attributes of the entity are completely dependent on the primary key.
  • Third Normal Form: Any non-key attribute does not depend on other non-key attributes.

2. What is the process of MySQL execution?


  1. Connect. The client establishes a TCP connection to the MySQL connector, which performs permission verification and allocates connection resources for the request.
  2. Query cache. MySQL first looks the statement up in the query cache, where cached entries are stored as key-value pairs: the key is the SQL statement and the value is its query result. On a cache hit, MySQL skips parsing and returns the value to the client directly; on a miss, execution continues and the result is stored in the cache afterwards. Because the hit rate of this cache is low in practice, the query cache stage was removed entirely in MySQL 8.0.
  3. Parse SQL. The parser performs lexical analysis on the SQL statement to identify keywords and tokens, then performs syntax analysis, building a syntax tree and checking whether the SQL is legal.
  4. Execute SQL. The execution process includes:
  • prepare, the preprocessing stage: check that the queried tables and fields exist, and expand the * in select * into the full column list of the table.
  • optimize, the optimization stage: the optimizer chooses an execution plan for the query based on its estimated cost.
  • execute, the execution stage: records are read from the storage engine according to the plan produced by the optimizer and returned to the client.

3. B/B+ tree?

Because the database reads data from disk, each access is expensive. To improve data access efficiency, the number of levels that must be traversed (the height of the index structure) should be kept as small as possible.

  1. B-tree (multi-way balanced search tree): take a B-tree with a maximum degree of 5 (order 5) as an example. Each node can store at most 4 keys and 5 pointers (branches), i.e. the number of keys in a node is always one less than the number of pointers.
  2. B+ tree: a variant of the B-tree. All elements appear in the leaf nodes, non-leaf nodes serve only as an index, and all leaf nodes are linked into a singly linked list.
  3. MySQL's B+ tree further optimizes the classic B+ tree: on top of it, each leaf node also holds a pointer to its adjacent leaf node, forming a B+ tree with sequential pointers and improving interval (range) access performance.

4. How is a row of MySQL records stored?

Take MySQL's default storage engine InnoDB as an example:
each time we create a database, a directory named after that database is created under /var/lib/mysql/, and the files that store the table structure and table data are placed in it. The directory contains:

  • db.opt: stores the database's default character set and collation rules.
  • t_order.frm: stores the table structure definition.
  • t_order.ibd: stores the table data. Starting from MySQL 5.6.6, the data of each table is stored in its own .ibd file, called a file-per-table (exclusive) tablespace file.
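Whether each table gets its own .ibd file is controlled by a server variable and can be checked directly (a small sketch; the names are just the examples used above):

show variables like 'innodb_file_per_table';   -- ON means every table uses its own .ibd tablespace file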

A tablespace consists of segments, extents, pages, and rows.


  1. Row: each record is stored as a row.
  2. Page: records are stored in rows, but the database reads data in units of pages. The default page size is 16KB, so a page is at most 16KB of contiguous storage.
  3. Extent: InnoDB organizes data with a B+ tree, in which pages at the same level are linked by a doubly linked list. If space were allocated page by page, two adjacent pages in the list might not be physically contiguous, causing a lot of random I/O during queries and poor performance. To get sequential I/O instead, when the amount of data in the table is large, space for an index is allocated in units of extents rather than pages. Each extent is 1MB, i.e. 64 consecutive 16KB pages, so adjacent pages in the list are also physically adjacent and sequential I/O can be used.
  4. Segment: a tablespace is composed of segments, generally divided into:
    • Index segment: the collection of extents that store the non-leaf nodes of the B+ tree
    • Data segment: the collection of extents that store the leaf nodes of the B+ tree
    • Rollback segment: the collection of extents that store rollback (undo) data.

The data of each row in MySQL is usually stored in the Compact row format.
A complete record is divided into:

  • extra information
    • 1. Variable-length field length list: only appears when the data table has variable-length fields, and is used to store the actual length of the variable-length field data.
    • 2. NULL value list: it only appears in tables that have fields allowing NULL. The NULL value list uses one binary bit per such field to indicate whether the value is NULL: 1 means NULL, 0 means not NULL. The list is stored in whole bytes (8 bits), padded with 0 in the high bits if needed.
    • 3. Record header information, including:
      • delete_mask: Indicates whether this piece of data is deleted.
      • next_record: The next record position.
      • record_type: record type, 0 means normal record, 1 means B+ tree non-leaf node, 2 means minimum record, 3 means maximum record.
  • real data
    • 1. Hidden fields
      • row_id: If no primary key is set, row_id is generated as the primary key.
      • trx_id: transaction id
      • roll_pointer: rollback pointer, record the pointer of the previous version.
    • 2. Data field: common column field.
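Which row format a given table actually uses can be checked from SQL (a hedged sketch; t_order is the example table from above):

show table status like 't_order'\G
-- the Row_format field shows COMPACT, DYNAMIC, etc.
select TABLE_NAME, ROW_FORMAT from information_schema.TABLES where TABLE_NAME = 't_order';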

Index

5. Index classification

  • By data structure: B+tree index, Hash index, Full-text index.

  • By physical storage: clustered index (primary key index) and secondary index (auxiliary index).

    • The leaf nodes of the primary key (clustered) index's B+Tree store the actual data: all complete user records live in the leaf nodes of the primary key index's B+Tree;
    • The leaf nodes of a secondary (auxiliary) index's B+Tree store the primary key value, not the actual data.
  • By field characteristics: primary key index, unique index, ordinary index, prefix index.

    • Primary key index: built on the primary key field; a table can have at most one, NULL values are not allowed, and it is usually created together with the table.
    • Unique index: built on a UNIQUE field; a table can have multiple unique indexes, the values of the index column must be unique, but NULL values are allowed.
    • Ordinary index: built on an ordinary field.
    • Prefix index: built on the first few characters of a character-type field rather than the whole field. A prefix index can be built on columns of type char, varchar, binary, or varbinary.
  • By number of fields: single-column index, joint index.

    • An index that combines multiple fields is called a joint (composite) index.
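These categories map directly to DDL. A sketch assuming a product table like the one used later in this article (the id column is made up for illustration):

CREATE TABLE product (
  id         INT NOT NULL AUTO_INCREMENT,
  product_no VARCHAR(32) NOT NULL,
  name       VARCHAR(64),
  PRIMARY KEY (id),                        -- primary key (clustered) index
  UNIQUE KEY uk_product_no (product_no),   -- unique index (a secondary index)
  KEY idx_name_prefix (name(10)),          -- prefix index on the first 10 characters of name
  KEY idx_no_name (product_no, name)       -- joint (composite) index
) ENGINE=InnoDB;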

6. The leftmost matching principle of the joint index

For example, to combine the product_no and name fields in the product table into a joint index (product_no, name), the joint index is created as follows:
CREATE INDEX index_product_no_name ON product(product_no, name);

The non-leaf nodes of the joint index's B+Tree use the values of both fields (product_no, name) as the key. When querying, entries are compared by product_no first, and by name only when product_no is equal. Therefore a joint index follows the leftmost matching principle: index matching proceeds in a leftmost-first manner. If a query does not follow the **"leftmost matching principle"**, the joint index cannot be used and its fast-lookup advantage is lost.

For example, with a joint index on (a, b, c), the following conditions can all use the joint index (thanks to the query optimizer, the order of the fields in the where clause does not matter):

where a=1;
where a=1 and b=2 and c=3;
where a=1 and b=2;

However, if the query conditions are as follows, because the leftmost matching principle is not met, the joint index cannot be matched, and the joint index will become invalid:

where b=2;
where c=3;
where b=2 and c=3;
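EXPLAIN makes it easy to verify this behaviour (a sketch, assuming a table t with a joint index on (a, b, c)):

EXPLAIN SELECT * FROM t WHERE a = 1 AND b = 2;   -- key shows the joint index, so it is used
EXPLAIN SELECT * FROM t WHERE b = 2 AND c = 3;   -- key is NULL and type is ALL: full table scan, index not used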

6.1. Index pushdown of joint index

When using a joint index, if many rows would otherwise have to be fetched back from the table, index condition pushdown can help optimize the query: filter conditions are evaluated at the storage engine layer.

For example: take a user table tuser with a joint index (name, age).
Suppose the requirement is: retrieve all users whose name starts with 'Zhang' (张) and whose age is 10. The SQL statement is written like this:

select * from tuser where name like '张%' and age=10;

With index pushdown, the index entries whose name starts with 'Zhang' are first filtered by age = 10 inside the engine to get the matching ids, and only then does the query go back to the table for the full rows.

Without index pushdown, the engine returns every row matching the first condition to the server layer, and the server layer then filters by the second condition.

The pushdown of index pushdown actually refers to handing over some of the things that the upper layer (service layer) is responsible for to the lower layer (engine layer) for processing.
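Whether index pushdown actually kicks in can be confirmed with EXPLAIN (a sketch using the tuser example above):

EXPLAIN SELECT * FROM tuser WHERE name LIKE '张%' AND age = 10;
-- Extra: Using index condition  -> index pushdown is in effect, age is filtered inside the engine
-- Extra: Using where            -> the age filter is applied at the server layer instead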

7. Range query of joint index

Q1: select * from t_table where a > 1 and b = 2, which field of the joint index (a, b) uses the B+Tree of the joint index?

In Q1, only the a field uses the joint index; the b field does not, because within the range of secondary index records satisfying a > 1 the values of the b field are unordered.

Q2: select * from t_table where a >= 1 and b = 2, which field of the joint index (a, b) uses the B+Tree of the joint index?

In Q2, within the range of secondary index records where a = 1, the values of the b field are ordered, so the b field can also use the joint index.

Q3: SELECT * FROM t_table WHERE a BETWEEN 2 AND 8 AND b = 2, which field of the joint index (a, b) uses the B+Tree of the joint index?

Q4: SELECT * FROM t_user WHERE name like 'j%' and age = 22, which field of the joint index (name, age) uses the B+Tree of the joint index?

Q3 and Q4 are the same as Q2.

The leftmost matching principle of a joint index stops matching when it meets a range query such as > or <: the range-queried field itself can use the joint index, but the fields after it cannot. Note that for >=, <=, BETWEEN, and like prefix matching, matching does not stop.

8. When do you need/do not need to create an index?

The biggest advantage of indexing is to improve query speed, but indexing also has disadvantages, such as: increasing storage overhead and increasing index maintenance costs.

When should indexes be used?

  • Fields with unique restrictions , such as commodity codes;
  • Fields often used in WHERE query conditions, which speeds up queries on the table. If the condition involves more than one field, a joint index can be created.
  • Fields often used in GROUP BY and ORDER BY, so the query does not need an extra sort: the records in the B+Tree are already sorted once the index is built.

When do you not need to create an index?

  • There is a lot of duplicate data in the field
  • Table data is too small
  • Frequently updated fields do not need to create indexes

9. Is there any way to optimize the index?

  1. Prefix index optimization: it reduces the size of index entries, but has limitations, e.g. order by cannot use a prefix index and a prefix index cannot serve as a covering index.
  2. Build joint indexes that act as covering indexes, reducing back-to-table queries (see the example after this list).
  3. The primary key is best auto-incremented, which reduces page splits.
  4. Index columns are best declared NOT NULL; NULL values make the optimizer's index selection more complicated.
  5. Prevent index invalidation.
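For example, a covering index lets the query be answered from the secondary index alone, with no back-to-table lookup (a hedged sketch reusing the product table; the value '0001' is made up):

CREATE INDEX idx_no_name ON product(product_no, name);
EXPLAIN SELECT product_no, name FROM product WHERE product_no = '0001';
-- Extra: Using index  -> all requested columns come from the secondary index (covering index)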

10. Under what circumstances will the index become invalid?

  1. Using != or <> invalidates the index.
  2. Left (or both-sided) fuzzy matching on an indexed column, i.e. like '%xx' or like '%xx%', causes the index to fail, because the B+ tree stores entries ordered by index value and can only be compared by prefix.
  3. Using a function or an expression calculation on an indexed column.
  4. Implicit type conversion on an indexed column.
  5. A joint index query that does not follow the leftmost matching principle.
  6. For the or in the where clause, if the condition before the or is an index column, but the condition column after the or is not an index column, the index will be invalid.
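A few of these failure cases written out (a sketch; t_user, its indexed name column, and the varchar phone column are assumptions for illustration):

SELECT * FROM t_user WHERE name LIKE '%lin';     -- left fuzzy match: index not usable
SELECT * FROM t_user WHERE name LIKE 'lin%';     -- prefix match: index usable
SELECT * FROM t_user WHERE LENGTH(name) = 6;     -- function on the indexed column: index not usable
SELECT * FROM t_user WHERE phone = 1300000001;   -- implicit cast of a varchar column: index not usable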

11. Which count has the best performance?

count() is an aggregate function. Its argument can be a field name or any other expression. count() counts, among the records matching the query condition, how many have a non-NULL value for the specified argument.

The expression 1 is a plain constant and is never NULL, so count(1) counts all records in the table; hence:
count(1) = count(*)
When executing count(1), count(*), or count(primary key field), if the table has a secondary index, the optimizer will choose the secondary index to scan.

Therefore, when using count(1), count(*), or count(primary key field), it helps to have a secondary index on the table: the optimizer automatically scans the secondary index with the smallest key_len, which is more efficient than scanning the primary key index.

Transactions

12. What are the four properties (ACID) of a transaction?

  • Atomicity: A transaction is an indivisible minimum unit of operation, either all succeed or all fail.
  • Consistency: When a transaction completes, all data must be made consistent.
  • Isolation: The isolation mechanism provided by the database system ensures that transactions run in an independent environment that is not affected by external concurrent operations.
  • Durability: once a transaction is committed or rolled back, its changes to the data in the database are permanent.

13. What technology does the InnoDB engine use to ensure the characteristics of transactions?

The redo log is a re-execution log: when data changes, the change is written to the redo log first, so data durability is guaranteed; even if a crash happens, the changes can be replayed from the redo log.

The undo log is a rollback log, used to record what the data looked like before it was modified.

  • Atomicity: guaranteed by the undo log (rollback log); if a transaction fails midway, it is rolled back according to the undo log.
  • Durability: guaranteed by the redo log (re-execution log); if committed changes have not reached the data files when a crash occurs, they are redone according to the redo log.
  • Isolation: guaranteed by MVCC (multi-version concurrency control) and the lock mechanism, so that transactions are not affected by concurrent execution.
  • Consistency: ensured by atomicity, durability, and isolation together guaranteeing consistent transaction results.

14. What problems will be caused by transaction concurrency?

  • Dirty read: A transaction reads data that has not been committed by another transaction.
  • Non-repeatable read: A transaction reads the same record successively, but the data read twice is different, which is called non-repeatable read.
  • Phantom read: a transaction queries data by some condition and finds no matching row, but when it then inserts such a row it discovers the row already exists, as if a "phantom" had appeared; more generally, repeating the same conditional query returns rows that were not there before.

15. What are the isolation levels of transactions?

When multiple transactions execute concurrently, the phenomena of dirty read, non-repeatable read, and phantom read may appear. Ranked by severity, from most to least severe: dirty read > non-repeatable read > phantom read.

  • Read uncommitted (read uncommitted), which means that when a transaction has not been committed, the changes it makes can be seen by other transactions;
  • Read committed (read committed), which means that after a transaction is committed, the changes it makes can be seen by other transactions;
  • Repeatable read: the data seen during a transaction is always consistent with the data seen when the transaction started. This is the default isolation level of the MySQL InnoDB engine;
  • Serializable: read-write locks are added to records. When multiple transactions read and write the same record and a read-write conflict occurs, the later transaction must wait for the earlier one to finish before continuing;

For different isolation levels, the phenomena that may occur during concurrent transactions are also different:

  • Read uncommitted: dirty reads, non-repeatable reads, and phantom reads may all occur.
  • Read committed: non-repeatable reads and phantom reads may occur; dirty reads are avoided.
  • Repeatable read: phantom reads may still occur (largely avoided in InnoDB, see below); dirty reads and non-repeatable reads are avoided.
  • Serializable: none of the three phenomena occur.
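The current isolation level can be inspected and changed per session (a sketch; the variable is called transaction_isolation on MySQL 8.0):

SELECT @@transaction_isolation;                          -- REPEATABLE-READ by default for InnoDB
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;  -- affects subsequent transactions in this session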

16. Introduce the transaction log of MySQL.


  1. redo log: the redo log is not written only at transaction commit; it starts being written while the transaction executes. This protects against a crash at a moment when dirty pages have not yet been written to disk: when the MySQL service restarts, it redoes the changes according to the redo log, so committed transactions whose data never reached disk remain durable.

  2. undo log: the undo log is used to roll row records back to an earlier version. Before a transaction commits, the undo log keeps the uncommitted version of the data, and this data can serve as an old-version snapshot for other concurrent transactions to read. It exists to implement transaction atomicity, and it is also used to implement multi-version concurrency control (MVCC) in the InnoDB storage engine.

  3. binlog: MySQL's binlog is a binary log that records all table structure changes (such as CREATE, ALTER TABLE) and table data modifications (INSERT, UPDATE, DELETE). The binlog does not record operations such as SELECT and SHOW, because they do not modify the data. The binlog records events, including the time each statement took to execute, and is transaction-safe. Its main purposes are replication and recovery.

17. What is MVCC?

The implementation of MVCC (multi-version concurrency control) is realized by saving a snapshot of the data at a certain point in time. Depending on the start time of the transaction, each transaction may see different data for the same table at the same time.

The principle of MVCC implementation is based on the hidden fields in the MySQL table: row_id (hidden primary key id), trx_id (transaction id) and roll_ptr (rollback pointer). The rollback pointer points to the previous version of this record and is recorded in the undo log. According to the transaction id and the rollback pointer, you can locate the version chain of the undo log and read the view (ReadView) of the corresponding version.

Version access rules:
MVCC maintains a ReadView structure, which mainly contains the list TRX_IDs {TRX_ID_1, TRX_ID_2, …} of transactions not yet committed in the current system, plus the minimum and maximum values of that list, TRX_ID_MIN and TRX_ID_MAX. During a SELECT, whether a data row snapshot can be used is judged from the relationship between the snapshot's TRX_ID and TRX_ID_MIN / TRX_ID_MAX:

  • TRX_ID < TRX_ID_MIN, indicating that the data row snapshot was changed before all current uncommitted transactions, so it can be used.
  • TRX_ID > TRX_ID_MAX, indicating that the data row snapshot was changed after the transaction was started, so it cannot be used.
  • TRX_ID_MIN <= TRX_ID <= TRX_ID_MAX, need to judge according to the isolation level:
    • Read committed: if TRX_ID is in the TRX_IDs list, the transaction that produced this snapshot has not committed yet and the snapshot cannot be used; otherwise it has committed and the snapshot can be used.
    • Repeatable read: the snapshot cannot be used in either case.

Snapshot read and current read
1. Snapshot read: an ordinary select under MVCC operates on snapshot data and needs no locks.
2. Current read: data-modifying operations (insert, update, delete) and locking reads must read the latest data and take locks.
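The difference is visible directly in the SQL (a sketch; t_order and its status column are assumptions for illustration):

SELECT * FROM t_order WHERE id = 1;                      -- snapshot read: served via MVCC, no lock
SELECT * FROM t_order WHERE id = 1 LOCK IN SHARE MODE;   -- current read: shared (S) lock on the record
SELECT * FROM t_order WHERE id = 1 FOR UPDATE;           -- current read: exclusive (X) lock on the record
UPDATE t_order SET status = 2 WHERE id = 1;              -- insert/update/delete are also current reads (X lock)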

18. How are transaction isolation levels implemented?

  • For transactions at the "read uncommitted" isolation level, since the data modified by uncommitted transactions can be read, it is good to read the latest data directly;
  • For transactions at the "serialization" isolation level, parallel access is avoided by adding read-write locks;
  • For transactions at the "read committed" and "repeatable read" isolation levels, they are implemented through Read View, and their difference lies in the timing of creating Read View.
    • The repeatable read isolation level is to generate a Read View when a transaction is started, and then use this Read View during the entire transaction.
    • The read commit isolation level is to generate a new Read View every time data is read.

19. How to solve phantom reading?

The repeatable read isolation level (default isolation level) of the MySQL InnoDB engine proposes solutions to avoid phantom reads according to different query methods:

  • For snapshot reads (ordinary select statements), phantom reads are avoided through MVCC. Under the repeatable read isolation level, the data seen during the transaction is always consistent with the data seen when the transaction started; even if another transaction inserts a row in the meantime, that row cannot be seen, so the phantom read problem is avoided.

  • For current reads (select ... for update and similar statements), phantom reads are avoided with next-key locks (record lock + gap lock): when select ... for update executes, a next-key lock is taken, and if another transaction tries to insert a record inside the locked range, its insert statement is blocked and cannot succeed, so the phantom read problem is avoided.


Two examples of phantom reading scenarios are given.

The first example:
time1: Transaction A queries for the record with id 5 and finds nothing.
time2: Transaction B inserts a record with id 5.
time3: Transaction B commits.
time4: Transaction A tries to insert the record with id 5 and finds it already exists; a phantom read has occurred.


The second example:
time1: Transaction A executes the "snapshot read" statement select * from t_test where id > 100 and gets 3 records.
time2: Transaction B inserts a record with id = 200 and commits.
time3: Transaction A executes the "current read" statement select * from t_test where id > 100 for update and gets 4 records; a phantom read has occurred.

The MySQL repeatable read isolation level does not completely solve phantom reading, but largely avoids the occurrence of phantom reading.
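The first scenario, written as SQL across two sessions (a sketch; a t_test table with primary key id and a name column is assumed):

-- session A
BEGIN;
SELECT * FROM t_test WHERE id = 5;               -- returns no rows
-- session B
BEGIN;
INSERT INTO t_test (id, name) VALUES (5, 'x');
COMMIT;
-- session A again
INSERT INTO t_test (id, name) VALUES (5, 'y');   -- fails with a duplicate key error: the "phantom" row exists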

Locks

20. What kind of locks does MySQL have?

  1. Global lock: Global lock is mainly used for logical backup of the whole database , so that during the backup of the database, the data in the backup file will not be different from the expected one due to the update of the data or table structure. While locked, the database is read-only.

  2. Table-level lock: lock the entire table.

    • Table lock:
      • Table shared read lock: client 1 executes lock tables score read;
        the table score becomes read-only. If client 1 tries to insert, delete, or update score, an error is reported; if client 2 tries to insert, delete, or update score, it waits until client 1 unlocks, after which client 2's statement is executed.
      • Table exclusive write lock: client 1 executes lock tables score write;
        client 1 then has exclusive read and write access to the table score, and reads and writes from other clients are blocked.
    • Metadata lock (MDL): The locking process of metadata lock is automatically controlled by the system, and it does not need to be used explicitly. It is automatically added when accessing a table.
      • When performing CRUD operations on a table, MDL read locks (shared) are added;
      • When changing the table structure, the write lock (exclusive) in the metadata lock is added.
      • Read locks are not mutually exclusive with each other, so multiple threads can read the same table concurrently. The write lock is mutually exclusive with both read and write locks, which guarantees the safety of DDL operations. Therefore, if two threads both want to add a field to the same table, one of them only starts executing after the other finishes.
    • Intention locks: the purpose of intention locks is to quickly determine whether any record in the table is locked. Intention shared locks and intention exclusive locks are table-level locks; they do not conflict with row-level shared or exclusive locks, nor with each other. They conflict only with shared table locks (lock tables ... read) and exclusive table locks (lock tables ... write).
  3. Row-level locks: row-level locks operate on individual rows, giving the lowest probability of conflict and the highest concurrency. InnoDB organizes data by index, and row locks are implemented by locking index entries rather than the records themselves.

    • Row lock (record lock, record lock): The lock added to each record can be divided into:
      • Shared lock (S, read): Allows a transaction to read a row, prevents other transactions from acquiring exclusive locks, read-read sharing, and read-write mutual exclusion.
      • Exclusive lock (X, write): Allow transactions that acquire exclusive locks to update data, and prevent other transactions from acquiring shared and exclusive locks, that is, read-write mutual exclusion and write-write mutual exclusion.
      • When performing additions, deletions, and modifications, an exclusive lock is automatically added. When executing select, no locks are added by default, but locks can be manually added through statements.
    • Gap lock (Gap lock): only exists in the repeatable read isolation level, the purpose is to solve the phenomenon of phantom reading under the repeatable read isolation level.

    • Next-Key lock: a combination of record lock + gap lock; it locks a range and also locks the record itself.
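On MySQL 8.0 the locks currently held can be observed while a transaction is open (a sketch):

SELECT ENGINE_TRANSACTION_ID, OBJECT_NAME, INDEX_NAME, LOCK_TYPE, LOCK_MODE, LOCK_DATA
FROM performance_schema.data_locks;
-- LOCK_MODE shows e.g. X,REC_NOT_GAP (record lock), X,GAP (gap lock) or X (next-key lock)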

21. How does MySQL add row-level locks?

The row-level lock locking rules are more complicated, and the locking forms are different in different scenarios.

The object being locked is the index. The basic unit of locking is the next-key lock, which is composed of a record lock and a gap lock: the gap lock covers an open interval, and the next-key lock adds the right boundary of that interval (the record itself) to the gap lock's range, i.e. it locks a left-open, right-closed interval.

However, next-key locks degenerate into record locks or gap locks in some scenarios.

What are those scenarios? In short: whenever a record lock alone or a gap lock alone is already enough to prevent phantom reads, the next-key lock degenerates into a record lock or a gap lock.

  1. Unique index equivalent query.
    • When the queried record "exists", after locating this record on the index tree, the next-key lock in the index of the record will degenerate into a "record lock".
    • When the query record is "non-existent", after the index tree finds the first record larger than the query record, the next-key lock in the index of the record will degenerate into a "gap lock".
  2. Unique index range query
    • For "greater than" or "greater than or equal to" range queries: because a "greater than or equal to" condition contains an equality part, if the record equal to the condition value exists in the table, the next-key lock on that record's index entry degenerates into a record lock.
    • For "less than" or "less than or equal to" range queries, it depends on whether the record equal to the condition value exists in the table:
      • If it does not exist: whether the condition is "less than" or "less than or equal to", when the scan reaches the record that terminates the range, the next-key lock on that record's index entry degenerates into a gap lock; all other scanned records get next-key locks on their index entries.
      • If it does exist: for a "less than" condition, when the scan reaches the record that terminates the range, the next-key lock on that record's index entry degenerates into a gap lock, and all other scanned records get next-key locks; for a "less than or equal to" condition, the next-key lock on the terminating record does not degenerate into a gap lock, and all other scanned records likewise get next-key locks.
  3. Non-unique index equivalence query: with a non-unique (secondary) index there are two indexes involved, the primary key index and the secondary index, so both are locked; on the primary key index, only the records that actually satisfy the query condition get their primary key entries locked.
    • When the queried record "exists": because the index is not unique, there may be several records with the same index value, so the query is a scan that stops at the first secondary index record that no longer matches. Every scanned secondary index record gets a next-key lock, except that the first non-matching record's next-key lock degenerates into a gap lock. At the same time, record locks are added on the primary key index entries of the records that satisfy the query condition.
    • When the queried record "does not exist": the scan stops at the first non-matching secondary index record, whose next-key lock degenerates into a gap lock. Since no record satisfies the query condition, no lock is added on the primary key index.
  4. Non-unique index range query: For non-unique index range query, the next-key lock of the index will not degenerate into gap locks and record locks , that is, non-unique index range queries are all added with adjacent key locks.
  5. Query without an index: if a locking read statement does not use an index column as its condition, or otherwise cannot use an index, it performs a full table scan. A next-key lock is then added on every record's index entry, which effectively locks the whole table; any other transaction that tries to insert, delete, or update the table is blocked.

Therefore, before running locking statements such as update, delete, or select ... for update in production, always check whether the statement uses an index. If it does a full table scan, a next-key lock is put on every record, which effectively locks the entire table and is a serious problem.
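A quick way to check, before running such a statement, is to EXPLAIN the locking statement itself (a sketch; the order_no and status columns are made up):

EXPLAIN UPDATE t_order SET status = 2 WHERE order_no = '1001';
-- type = const/ref/range with a non-NULL key  -> locks only the matched index entries
-- type = ALL (full table scan)                -> next-key locks on every record, effectively a table lock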

22. Under what circumstances will MySQL cause a deadlock? How to deal with it?

time1: transaction A takes a gap lock on the range (20, 30)
time2: transaction B takes a gap lock on the range (20, 30)

The only purpose of a gap lock is to prevent inserts into the gap, so gap locks can coexist: a gap lock acquired by one transaction does not stop another transaction from acquiring a gap lock on the same gap. There is no difference between shared and exclusive gap locks; they do not conflict and serve the same function, so two transactions can hold a gap lock on the same gap at the same time.

time3: Transaction A tries to insert a row with id 25, which requires an insert intention lock, but finds that transaction B holds a gap lock on (20, 30); the lock cannot be acquired, so A blocks, waiting for B to release its gap lock.
time4: Transaction B tries to insert a row with id 26, which requires an insert intention lock, but finds that transaction A holds a gap lock on (20, 30); the lock cannot be acquired, so B blocks, waiting for A to release its gap lock.

Transaction A and transaction B wait for each other to release their locks, satisfying the four conditions for deadlock: mutual exclusion, hold and wait, no preemption, and circular wait, so a deadlock occurs.
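When a deadlock happens, InnoDB detects it and rolls back one of the transactions; the latest deadlock and the related settings can be inspected (a sketch):

SHOW ENGINE INNODB STATUS\G                       -- the LATEST DETECTED DEADLOCK section shows both transactions
SHOW VARIABLES LIKE 'innodb_deadlock_detect';     -- ON: active deadlock detection, a victim is rolled back
SHOW VARIABLES LIKE 'innodb_lock_wait_timeout';   -- fallback: how long a lock wait may last before timing out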

Logs

23. What are the functions of undo log, redo log, and bin log? what is the difference?

  1. The undo log has two main functions:
  • Record the state before the transaction executes, so that the transaction can be rolled back, guaranteeing atomicity. During transaction processing, if an error occurs or the user issues a ROLLBACK statement, MySQL uses the historical data in the undo log to restore the data to its state before the transaction started.
  • It is one of the key pieces used to implement MVCC (multi-version concurrency control). MVCC is implemented with ReadView + undo log: the undo log keeps multiple historical versions of each record, and when MySQL performs a snapshot read (an ordinary select), it follows the undo log version chain to find the version that is visible according to the transaction's Read View.
  2. The redo log has two main functions:
  • Record the state after the transaction executes, guaranteeing durability. If a crash happens after the transaction commits, its effects can be recovered from the redo log after restart.
  • The redo log is written in an append-only fashion. It is not written straight to disk: it first goes into the redo log buffer, whose contents are then written to disk sequentially. Flush timings include: when MySQL shuts down normally; when the redo log buffer is more than half full; the InnoDB background thread persists the buffer to disk once per second; and, depending on configuration, at every transaction commit. This turns MySQL's writes from random writes into sequential writes on disk, improving performance.
  3. The binlog is mainly used for backup recovery and master-slave replication. After MySQL completes an update operation, the server layer also generates a binlog entry. The binlog file records all table structure changes and table data modifications, and does not record query operations.

undo log and redo log

The undo log records the state of the data before the transaction commits; the redo log records the state of the data after the transaction commits.

redo log vs. binlog

1. They belong to different layers: the binlog is implemented by MySQL's server layer and can be used by all storage engines; the redo log is implemented by the InnoDB storage engine;

2. The writing method is different:
The binlog is append-only: when one file is full, a new file is created and writing continues; old logs are not overwritten, so the full history is kept.
The redo log is written cyclically in a fixed amount of log space: when it is full, writing wraps around to the beginning; what it holds are the changes of dirty pages that have not yet been flushed to disk.

3. Different purposes:
binlog is used for backup and recovery, master-slave replication;
redo log is used for recovery from failures such as power failure.

24. Why do you need a buffer pool?

The Innodb storage engine designs a buffer pool (Buffer Pool) to improve the read and write performance of the database. When MySQL starts, InnoDB will apply for a continuous memory space for Buffer Pool, and then divide pages one by one according to the default size of 16KB . The pages in Buffer Pool are called cache pages.

  • When reading data, if the data exists in the Buffer Pool, the client will directly read the data in the Buffer Pool, otherwise it will read it from the disk.
  • When modifying data, if the data is in the Buffer Pool, the page holding it is modified directly and marked as a dirty page (its in-memory content differs from the data on disk). To reduce disk I/O, dirty pages are not written to disk immediately; a background thread picks an appropriate time to flush them.

In addition to caching "index pages" and "data pages", Buffer Pool also includes Undo pages , insert caches, adaptive hash indexes, lock information, and more.
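The buffer pool size and a rough hit ratio can be checked at runtime (a sketch):

SHOW VARIABLES LIKE 'innodb_buffer_pool_size';    -- 128MB by default, usually raised to a large share of RAM
SHOW STATUS LIKE 'Innodb_buffer_pool_read%';
-- Innodb_buffer_pool_read_requests vs Innodb_buffer_pool_reads gives a rough cache hit ratio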

25. Why is two-phase commit required, what is the commit process like, and what are the problems?

After the transaction is committed, both the redo log and the binlog are persisted to the disk, but these two are independent logics, and there may be a semi-successful state, that is, only one of the redo log and the bin log succeeds. The redo log will determine the data status of the master database, and the bin log will determine the data status of the slave database. Only one of the redo log and the bin log is successfully flushed, which will lead to the inconsistency between the master and the slave.

To solve this problem, MySQL internally splits the commit of a transaction into two phases, prepare and commit (an internal XA transaction).

  • Prepare phase: write XID (transaction ID) to redo log, set the transaction status corresponding to redo log to prepare, and then persist redo log to disk
  • Commit phase: write the XID to the binlog, then persist the binlog to the disk, then call the engine’s commit transaction interface, and set the redo log status to commit.
    Under the two-phase commit mechanism, whether to commit or roll back a transaction during recovery is decided by checking whether the XID recorded in the redo log also appears in the binlog: if it does, the transaction is committed; if not, it is rolled back. This keeps the redo log and binlog consistent.

Although two-phase commit solves the log consistency problem, it also has problems:

  • High number of disk I/Os: For the "double 1" configuration, each transaction submission will perform two fsync (disk flushing), one is redo log flushing, and the other is binlog flushing.
  • Intense lock contention: two-phase commit guarantees that the two logs of a single transaction are consistent, but with multiple transactions it cannot by itself guarantee that the two logs are committed in the same order. Therefore, on top of the two-phase commit process, a lock has to be added to make the commit atomic, so that the commit order of the two logs stays consistent across concurrent transactions.
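The "double 1" configuration mentioned above refers to two flush parameters; relaxing them trades durability for fewer fsyncs (a sketch):

SET GLOBAL sync_binlog = 1;                       -- fsync the binlog on every transaction commit
SET GLOBAL innodb_flush_log_at_trx_commit = 1;    -- fsync the redo log on every transaction commit
-- relaxed alternatives (risking up to about a second of transactions on a crash):
--   sync_binlog = N (N > 1): fsync the binlog only every N commits
--   innodb_flush_log_at_trx_commit = 2: write to the OS page cache at commit, fsync about once per second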

26. MySQL read and write separation, how to synchronize the master and slave, how to solve the problem of synchronization delay?

By setting up a MySQL master database (responsible for data writing and updating) and multiple slave databases (responsible for data query), through the master-slave synchronization mechanism, the business requirements of the query are allocated to the slave databases, reducing the burden on the master database, thereby improving performance of the database.

The process of master-slave synchronization:
1. Before each data-changing transaction completes, the master appends its write (insert, delete, update) operations to the binlog file.
2. The slave starts an I/O thread and establishes a connection with the master. Its job is to read the master's binlog; if it has caught up with the master, it sleeps and waits for the master to produce new events. The events read from the master are written to the slave's relay log.
3. The slave's SQL thread reads the relay log and executes the SQL events in order, keeping the slave's data consistent with the master's.


In the above master-slave synchronization mechanism, there is inevitably a delay between the data in the slave library and the latest data in the master library. A short delay is acceptable, but in some cases, this delay will be very serious:

  • 1. Query traffic puts too much pressure on the slave, slowing down SQL execution there.
  • 2. The master executes a long, time-consuming transaction, so the delay before it is applied on the slave is larger.

For the above reasons, you can use:

  • One master and multiple slaves, share the pressure of the slave library.
  • For business classification, important businesses with high real-time requirements are directly queried through the main database.
  • Using the semi-synchronous mechanism, the main library only needs to wait for at least one slave library to receive and write to the Relay Log file, and the main library does not need to wait for all slave libraries to return ACK to the main library. After the main library receives this ACK, it can return the confirmation of "transaction complete" to the client.
  • Using parallel replication, start multiple threads from the library to read the logs of different libraries in the relay log in parallel, and then replay them in parallel to the corresponding library.
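Replication lag and the semi-synchronous option can both be checked from SQL (a sketch; SHOW SLAVE STATUS is called SHOW REPLICA STATUS from MySQL 8.0.22 on, and the semisync variable only exists once the plugin is installed):

SHOW SLAVE STATUS\G                                   -- Seconds_Behind_Master estimates the slave's delay
SHOW VARIABLES LIKE 'rpl_semi_sync_master_enabled';   -- whether semi-synchronous replication is enabled on the master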

SQL Optimization

27. How to locate slow SQL and optimize the performance of SQL statements?

Use the SQL performance analysis tools provided by MySQL:

  • Slow query log: records every SQL statement whose execution time exceeds the configured threshold. Whether it is enabled can be checked with show variables like 'slow_query_log';.
  • Profile details: profiling shows where the time of an SQL statement is spent.
  • EXPLAIN execution plan: prefix the SQL statement with explain (or desc) to analyze how it executes: whether an index is used, the index key length, the join type, and so on.
  1. Check the MySQL slow query log and locate the offending SQL.
  2. Use profile details to see exactly where the query time is spent.
  3. Use explain or desc + the SQL statement to analyze its execution plan; two fields deserve particular attention:
    • type: Whether to use an index and the type of index used, the efficiency of the field value from high to low is:
      • const: equivalent query for primary key or unique index
      • eq_ref: Usually occurs in associated queries, and the associated condition is the primary key of the table or the only non-empty index.
      • ref: ref is generally used for equivalent queries of non-clustered indexes.
      • ref_or_null: like ref, but rows with NULL must also be found; since the NULL lookup requires scanning additional index rows, it is slower than ref.
      • index_merge: multiple results will be merged into one, and unified back to the table query.
      • range: range query for any index, including like, between, >, <, etc.
      • index: full index scan; the required result can be obtained from the index tree without reading table rows.
      • all: full table scan; the required result cannot be obtained from the index tree alone.
    • extra: extra information,
      • using index (covering index)
      • using index condition (index pushdown)
      • using where (no index pushdown)
      • using temporary (using a temporary table)
      • using filesort (additional sorting is required after the query).
  4. Check whether the data rows in the table are too large, and consider whether to divide the database into tables
  5. Check the server occupancy where the database is located, and consider upgrading server performance.
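The tools from steps 1-3 can be switched on and used roughly like this (a sketch; SHOW PROFILE is deprecated in newer versions but still available):

SET GLOBAL slow_query_log = 1;       -- enable the slow query log
SET GLOBAL long_query_time = 2;      -- log statements slower than 2 seconds
SET profiling = 1;                   -- enable per-query profiling for this session
-- run the suspect query, then:
SHOW PROFILES;                       -- lists recent queries with their Query_ID and duration
SHOW PROFILE FOR QUERY 1;            -- time spent in each stage of query 1 (e.g. sending data, sorting)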

SQL optimization operations:
1. Batch inserts: insert in primary key order, avoid connecting to the database repeatedly, and use the load command for bulk loading.
2. Primary key optimization: To meet business needs, minimize the length of the primary key and ensure that the primary key is auto-incremented.
3. Business scenarios that need to be sorted, that is, using the order by statement, by establishing a suitable index, the joint index follows the leftmost prefix rule, and pay attention to the sorting order, which is ascending by default.
4. For statements that need to be grouped and queried, even statements that use group by can also be built with a joint index to speed up the query.
5. Limit optimization (deep paging): MySQL does not skip the first offset rows; it fetches offset+N rows, discards the first offset rows, and returns N rows. When offset is very large this is very inefficient; either cap the total number of pages returned, or rewrite the SQL for pages beyond a certain threshold.
Use the subquery method to locate the position of the limit first, and then return.
For example: SELECT a.* FROM table 1 a, (select id from table 1 where condition LIMIT 100000,20 ) b where a.id=b.id

6. UPDATE statements must locate rows through an index to avoid full table scans; otherwise the row-level (next-key) locks are effectively upgraded to a table lock.

28. How to optimize queries on large tables?

  • Build indexes
  • Cache hot data in Redis
  • Master-slave replication with read-write separation
  • Split databases and tables: the core idea is to spread the data out so that the data volume of a single database/table shrinks, alleviating single-database performance problems and improving overall performance.

Redis

Basic Concepts

1. Why use Redis as MySQL cache?

Mainly because Redis has two characteristics of high performance and high concurrency.

  • High performance: Accessing data from MySQL is read from the hard disk, while using Redis is read from the memory, which is more efficient.
  • High concurrency: the QPS of a single Redis instance is roughly 10 times that of MySQL, so far more requests can go directly to Redis than could go directly to MySQL.

Data Structures

2. What data types does Redis contain? What is the usage scenario?

  • String type, application scenarios include: cache objects, distributed locks, and shared session information.
  • List type, application scenarios: message queues. (There are two problems: 1. the producer has to generate a globally unique ID itself; 2. messages cannot be consumed in consumer-group form.)
  • Hash type: cache object, shopping cart, etc.
  • Set type: Unique and requires aggregation calculation (union, intersection, difference) scenarios, such as likes, common attention, lottery activities, etc.
  • ZSet type: scenarios that need uniqueness plus ordering, such as leaderboards, or sorting by phone number or name.
  • BitMap: Scenarios of binary state statistics, such as sign-in, judging user login status, total number of consecutive sign-in users, etc.;
  • HyperLogLog: Scenarios for massive data cardinality statistics, such as UV statistics for millions of web pages.
  • GEO: A scenario where geographic location information is stored, such as Didi calling a car.
  • Stream: message queues. Compared with a List-based message queue, Stream can automatically generate a globally unique message ID and supports consumption in consumer-group form, and every consumer group subscribed to the stream receives the messages.

3. How are the underlying data structures of these Redis types implemented?

  • The bottom layer of String is the SDS (Simple Dynamic String), which contains the string length len, the allocated space alloc, the header type flags, and the character array buf that actually stores the data. Compared with a traditional C string it has the following advantages:

    • The string length can be obtained in O(1) time.
    • It supports dynamic expansion and reduces the number of memory allocations through pre-allocation.
    • The content is read according to the recorded length rather than a terminating '\0', so it is binary-safe.
  • The bottom layer of List is the quicklist. Each quicklist node is a ziplist; ziplists have high space utilization, and linking many small ziplists makes memory allocation easier, since finding many small contiguous blocks of memory is much easier than finding one large contiguous block.

  • The bottom layer of Hash is the ziplist by default (newer versions use listpack); two adjacent entries in the ziplist store a field and its value respectively. When the amount of data grows large, a dict (hash table) is used instead.

  • The bottom layer of Set is based on a dict, where the keys store the set members and the values are null. When all stored elements are integers, the integer set (intset) is used instead.

  • ZSet is an ordered collection, implemented with a dict plus a skiplist.


Redis thread network model

4. Is Redis single-threaded?

The core functions of Redis, i.e. parsing, executing, and returning the results of the read/write commands on key-value data, are handled by a single thread, because Redis operates on memory and executes very fast, while multi-threading would bring thread-safety concerns and context-switching overhead. For some non-core work, however, such as closing files, AOF flushing, memory reclamation, and network I/O handling, Redis does use multiple threads to improve efficiency.

The core path of Redis, i.e. receiving client requests, parsing them, performing data reads and writes, and sending results back to the client, is completed by one thread (the main thread), so the core of Redis can be said to be single-threaded.

But the overall implementation of Redis is not single-threaded.

  • Since version 2.6, Redis starts 2 background threads to handle closing files and AOF flushing.
  • Since version 4.0, Redis adds another background thread, the lazyfree thread, to release memory asynchronously.
  • Since version 6.0, Redis uses multiple I/O threads to handle network requests.

Separate threads are created for tasks such as closing files, AOF flushing, and releasing memory because these operations are time-consuming; doing them on the main thread would easily block it.

5. What is the Redis network IO processing mode?

The first step: Redis initialization:

  • First call epoll_create() to create an epoll object, and call socket() to create a server socket.
  • Then call bind() to bind the server port number, and call listen() to listen to the socket.
  • Call epoll_ctl() to add the listen socket to epoll, and register the connection event processing function at the same time.

Step 2: Event loop function:
After the initialization is completed, the main thread enters an event loop function , which mainly does the following:

  • First, it calls the send-queue handler to check whether there are pending sends. If so, it sends the data in the client's send buffer via the write function; if not everything can be sent in this round, a write handler is registered so the remainder is sent once epoll_wait reports the socket writable.
  • Next, call the epoll_wait function to wait for the event to arrive:
    • If it is a connection event, call accept() to get the connected socket, call epoll_ctl to add the connected socket to epoll, and register the read event processing function.
    • If a read event arrives, the read event handler is called: it calls read() to get the data sent by the client, parses the command, executes it, adds the client object to the send queue, and writes the result into the send buffer, waiting to be sent.
    • If a write event arrives, the write event handler is called: it calls write() to send the data in the client's send buffer; if the data is still not fully sent in this round, the write event handler stays registered and waits for epoll_wait to report the socket writable again.

Redis persistence

6. How does Redis prevent data loss?

The read and write operations of Redis all happen in memory, which is why its performance is high. However, when Redis crashes or restarts, the data in memory is lost. To keep that data from being lost, Redis implements data persistence mechanisms that store data on disk, so the original data can be restored from disk when Redis restarts.
Redis has the following three data persistence methods:

  • RDB snapshot: Write the memory data at a certain moment to the disk in binary form.
  • AOF log: every time a write operation command is executed, write the command to a file in the form of appending;
  • The hybrid persistence method, after Redis4.0, integrates the respective advantages of AOF and RDB.

7. How to realize RDB snapshot?

Redis provides two commands to generate RDB files, namely save and bgsave.

  • The save command is executed on the main thread to generate an RDB file, which will block the main thread. During this period, no Redis instructions can be executed.
  • The bgsave command (background save): executing bgsave makes Redis fork() a child process that copies the parent's page table, while the physical memory pointed to by the parent's and child's page tables is still the same. Reads do not affect each other, but when a write happens, copy-on-write duplicates the affected page, and the bgsave child continues to write the original (snapshot-time) data to the RDB file.
    In the extreme case where all shared memory is modified, memory usage doubles, so spare memory should be reserved when running Redis.

8. How to realize AOF log?

AOF concept:
AOF (append only file): After Redis executes a write operation command, it will append the command to a file, and then when Redis restarts, it will read the commands recorded in the file and execute them one by one command to restore data.

AOF triggers rewriting:
As more and more write commands are appended, the AOF file keeps growing; when its size exceeds the configured threshold, Redis triggers the AOF rewrite mechanism. For example, without rewriting, executing "set name xiaolin" and then "set name xiaolincoding" records both commands in the AOF file; with rewriting, only the latest value of name is read and a single "set name xiaolincoding" command is written to the new AOF file. The earlier command does not need to be kept: it is history and has no effect on the final state.

AOF rewriting process: the AOF rewrite is carried out by a background child process created by bgrewriteaof. While the child process rewrites the AOF, the main process can keep serving command requests, so it is not blocked. The rewrite child process only reads memory: it converts the key-value pairs in memory into commands and records those commands in the rewrite log (the new AOF file).

If the main process modifies existing data while the AOF is being rewritten, copy-on-write occurs. Redis therefore sets up an AOF rewrite buffer, which is used from the moment the bgrewriteaof child process starts. During the rewrite, whenever Redis executes a write command it writes that command both to the normal AOF buffer and to the AOF rewrite buffer. When the child process finishes rewriting, it sends a signal to the main process; the main process then appends everything in the AOF rewrite buffer to the new AOF file, so that the old and new AOF files describe the same database state, and finally renames the new AOF file over the existing one.
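A minimal sketch of the AOF-related configuration and of triggering a rewrite manually with redis-py, assuming a local instance; the thresholds shown are example values.

```python
import redis

r = redis.Redis(host="localhost", port=6379)  # assumed local instance

# Turn on AOF and choose how often writes are flushed to disk.
r.config_set("appendonly", "yes")
r.config_set("appendfsync", "everysec")       # always / everysec / no

# Automatic rewrite threshold: rewrite once the AOF has grown 100% beyond its
# size after the last rewrite and is at least 64 MB.
r.config_set("auto-aof-rewrite-percentage", "100")
r.config_set("auto-aof-rewrite-min-size", "64mb")

r.bgrewriteaof()  # trigger an AOF rewrite manually (done by a background child process)
```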

9. How to achieve hybrid persistence?

  • The advantage of RDB is that it only stores the actual data in memory, so recovery is fast and the file is small; the downside is the risk of data loss and weaker safety. The snapshot frequency is also hard to tune: too low and more data can be lost, too high and performance suffers.
  • The advantage of AOF is that less data is lost and it is safer, but the AOF file is usually large and data recovery is slow.
  • Redis 4.0 introduced the hybrid use of AOF and RDB to combine their advantages. With hybrid persistence enabled, during an AOF rewrite the forked child process first writes the memory data it shares with the main thread to the AOF file in RDB format; the commands processed by the main thread in the meantime are recorded in the rewrite buffer and then appended to the AOF file in AOF format. When writing is finished, the main process is notified to replace the old AOF file with the new one, which contains both the RDB-format and the AOF-format parts.

That is to say, with hybrid persistence the first half of the AOF file is full data in RDB format, and the second half is incremental data in AOF format.

The benefit is that when Redis restarts and loads the data, the first half is RDB content and loads very quickly; after that, the second half (the AOF content) is loaded, which contains the commands the main thread processed while the background child process was rewriting the AOF, so data loss is reduced.
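Hybrid persistence is switched on via the aof-use-rdb-preamble option (AOF must be enabled as well). A minimal sketch, assuming a local instance:

```python
import redis

r = redis.Redis(host="localhost", port=6379)  # assumed local instance

# The preamble option makes the rewritten AOF start with an RDB-format dump,
# followed by incremental commands in AOF format.
r.config_set("appendonly", "yes")
r.config_set("aof-use-rdb-preamble", "yes")
```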

Redis cache design

10. How to avoid cache penetration, cache avalanche, and cache breakdown?

cache penetration

Cache penetration means the client requests data that exists neither in the cache nor in the database, so the cache can never take effect and every such request falls through to the database.

There are generally two situations in which cache penetration occurs:

  • Business misoperation, the data in the cache and the data in the database are deleted by mistake, so there is no data in the cache and the database;
  • Hackers maliciously attack and intentionally access a large number of services that read non-existent data;

There are three common options:

  • 1. Restrict illegal requests. At the API entrance, judge whether the request parameters are reasonable and whether it is a malicious request.
  • 2. Cache null values and set a short TTL (a sketch follows this list). Subsequent requests for the same data then hit the cache; the short TTL ensures these placeholder entries are cleared automatically so they do not accumulate.
  • 3. Use the Bloom filter to determine whether the data exists. When writing data to the database, use the Bloom filter to make a mark. When the user requests it, after the business judges that the cache is invalid, it can quickly determine whether the data exists by querying the Bloom filter. If it doesn't exist, don't query the database.
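A minimal sketch of option 2 (caching null values with a short TTL) with redis-py; the key layout, the TTL values, and the db_lookup helper are assumptions for illustration.

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379)  # assumed local instance
NULL_TTL = 60     # short TTL for cached "not found" markers (assumption)
DATA_TTL = 1800

def get_user(user_id, db_lookup):
    """db_lookup is a caller-supplied function that queries MySQL (assumption)."""
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:
        # b"" marks "this id does not exist"; return None without touching the DB
        return None if cached == b"" else json.loads(cached)

    row = db_lookup(user_id)
    if row is None:
        r.set(key, "", ex=NULL_TTL)          # cache the null value with a short TTL
        return None
    r.set(key, json.dumps(row), ex=DATA_TTL)
    return row
```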

cache avalanche

A cache avalanche happens when a large amount of cached data expires (becomes invalid) at the same time. If there are many user requests at that moment, Redis cannot serve them, so all requests go straight to the database. The database load spikes, and in severe cases the database goes down, triggering a chain reaction that can crash the whole system.

For the cache avalanche problem: we can solve it with the following solutions:

  • 1. Randomize the cache expiration time (TTL) to reduce the probability of a large batch of keys expiring at the same moment (a sketch follows this list).
  • 2. Use a Redis cluster to improve service availability.
  • 3. Add degradation and rate-limiting strategies to the caching business logic.
  • 4. Add a multi-level cache to the business.
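A minimal sketch of option 1, adding a random jitter to the TTL so that keys written together do not expire together; the key names and TTL values are made up.

```python
import random
import redis

r = redis.Redis(host="localhost", port=6379)  # assumed local instance

def set_with_jitter(key, value, base_ttl=1800, jitter=300):
    """Spread expirations out by adding a random offset to the base TTL,
    so keys written in the same batch do not all expire at the same moment."""
    r.set(key, value, ex=base_ttl + random.randint(0, jitter))

# e.g. warm up a batch of product pages without a synchronized expiry
for pid in range(100):
    set_with_jitter(f"product:{pid}", "...")
```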

cache breakdown

The cache breakdown problem is also known as the hotspot key problem: a key that is accessed with high concurrency, and whose cache rebuild logic is relatively complex, suddenly expires (hotspot key invalidation). Countless requests hit the database in an instant, and many threads try to rebuild the cache at the same time, straining the server, even though a single thread rebuilding the cache would be enough.

Cache breakdown is very similar to cache avalanche, you can think of cache breakdown as a subset of cache avalanche.

Common solutions are as follows:

  • 1. Do not set a TTL on the hotspot key; have a background task refresh the cache asynchronously, or notify a background thread to refresh the cache and reset the expiration time shortly before the hotspot data is due to expire.
  • 2. Lock the cache-rebuild step so that only one business thread rebuilds the cache at a time, while threads that fail to acquire the lock wait (a sketch follows this list). The mutex makes other threads block, so there is still some performance impact.
  • 3. Use a logical-expiration policy: store a logical expire field alongside the cached value. The first thread that notices the logical expiry acquires the lock, starts a new thread to rebuild the cache, and immediately returns the stale value; threads that fail to acquire the lock also return the stale value without blocking, until the cache is successfully rebuilt. This is weakly consistent, but performance is better.
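A minimal sketch of option 2, using SET NX EX as a mutex around the rebuild; the rebuild helper and the timing values are assumptions.

```python
import time
import redis

r = redis.Redis(host="localhost", port=6379)  # assumed local instance

def get_hot_key(key, rebuild, ttl=600, lock_ttl=10):
    """rebuild() is a caller-supplied function that recomputes the value (assumption)."""
    while True:
        value = r.get(key)
        if value is not None:
            return value
        # SET NX EX acts as a mutex: only one thread gets to rebuild the cache
        if r.set(f"lock:{key}", "1", nx=True, ex=lock_ttl):
            try:
                value = rebuild()
                r.set(key, value, ex=ttl)
                return value
            finally:
                r.delete(f"lock:{key}")
        time.sleep(0.05)  # other threads back off briefly and retry instead of hitting the DB
```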

11. How to maintain consistency between database and cache?

Common cache update strategies include:

  • Cache Aside Policy
  • Read/Write Through (read through/write through) strategy
  • Write Back Policy

In actual development, the Cache Aside (bypass cache) strategy is what is usually used to keep Redis and MySQL in sync.

The Cache Aside (bypass cache) strategy is the most commonly used. The application directly interacts with the database and the cache, and is responsible for maintaining the cache. This strategy can be subdivided into a read strategy and a write strategy.

  • Write strategy: update the data in the database first, and then delete the data in the cache.

  • Read strategy: If the read data directly hits the cache, the data is returned directly; if there is no hit, the data is read from the database and then written to the cache.

insert image description here

When writing data, if the database is updated first and then the cache is updated, problems can arise under concurrency:
insert image description here
To avoid the cache/database inconsistency above, when writing data, delete the cache after updating the database instead of updating it.

So why update the database first, then delete the cache?
Because cache operations are much faster than database operations. If the cache is operated on first and the database updated afterwards, the window during which the cache and the database disagree is longer, so the probability of other requests reading inconsistent data is higher. Updating the database first and then deleting the cache keeps that window small, because deleting the cache is fast.
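A minimal sketch of the read and write strategies described above with redis-py, assuming a local Redis instance and caller-supplied MySQL helpers (db_query and db_update are placeholders).

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379)  # assumed local instance
TTL = 1800

def read_user(user_id, db_query):
    """Read strategy: try the cache first, fall back to the DB and backfill the cache.
    db_query is a caller-supplied MySQL lookup (assumption)."""
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)
    row = db_query(user_id)
    if row is not None:
        r.set(key, json.dumps(row), ex=TTL)
    return row

def write_user(user_id, new_row, db_update):
    """Write strategy: update the database first, then delete (not update) the cache."""
    db_update(user_id, new_row)
    r.delete(f"user:{user_id}")
```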

12. What is the expiration deletion strategy of Redis?

Whenever we set a TTL on a key, Redis stores that key together with its expiration time in an extra expiration dictionary (expires dict). When a key is queried, Redis first checks whether it appears in the expiration dictionary. If it does not, the key is read normally; if it does, Redis compares the stored expiration time with the current time to decide whether the key has expired.

The expired deletion strategy adopted by Redis is lazy deletion + regular deletion

The lazy deletion strategy does not proactively delete expired keys; instead, every time a key is accessed, Redis checks whether it has expired and deletes it if it has.

  • Advantages: Because the key is checked for expiration every time it is accessed, this strategy uses very little system resources. Therefore, the lazy deletion strategy is the most friendly to CPU time.

  • Disadvantages: If a key has expired, but the key is still kept in the database and has not been accessed, memory space is wasted.

For this reason, Redis will use a regular deletion strategy to cooperate with the above lazy deletion strategy.

The regular deletion strategy is to take a certain number of keys from the database "randomly" for inspection every once in a while, and delete the expired keys.

Periodic deletion has two working modes:

  • SLOW mode: the cleanup of expired keys is scheduled in Redis's initServer() function and runs at the frequency of server.hz, which defaults to 10 times per second.
  • FAST mode: before each iteration of the Redis event loop, the beforeSleep() callback is invoked to clean up expired keys.
  • SLOW mode runs at a lower frequency but cleans more thoroughly; FAST mode runs at a higher frequency but each run is kept short. (A toy sketch of lazy plus periodic deletion follows this list.)
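The following toy model (plain Python dictionaries, not Redis internals) only illustrates how the two strategies cooperate: expiry is checked lazily on access, and a periodic task samples keys with a TTL and removes the expired ones.

```python
import random
import time

store = {}    # key -> value
expires = {}  # key -> absolute expiration timestamp (the "expires dict")

def get(key):
    # Lazy deletion: check expiry only when the key is actually accessed
    if key in expires and expires[key] <= time.time():
        store.pop(key, None)
        expires.pop(key, None)
        return None
    return store.get(key)

def periodic_delete(sample_size=20):
    # Periodic deletion: sample a few keys that have a TTL and delete the expired ones
    keys = random.sample(list(expires), min(sample_size, len(expires)))
    now = time.time()
    for k in keys:
        if expires[k] <= now:
            store.pop(k, None)
            expires.pop(k, None)
```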

13. What are the memory elimination strategies of Redis?

Timing of implementing the memory elimination strategy:
While handling a client command in processCommand(), Redis checks whether memory is exhausted (OOM); if so, the memory eviction mechanism is triggered.

Redis supports 8 different key eviction strategies. The default strategy is:
1. noeviction: no keys are evicted, but new writes are rejected once memory is full.

The remaining strategies are divided into those that consider all keys and those that only consider keys with a TTL set:

2. volatile-ttl: among keys with a TTL, evict the ones closest to expiring first.
3-4. allkeys-random / volatile-random: randomly evict among all keys, or among keys with a TTL set.

5-6. allkeys-lru / volatile-lru: evict based on the LRU algorithm, among all keys or among keys with a TTL set.
7-8. allkeys-lfu / volatile-lfu: evict based on the LFU algorithm, among all keys or among keys with a TTL set.

LRU (least recently used): evict the keys used least recently. Take the current time minus the key's last access time; the larger that value, the higher the eviction priority.
LRU is time-based and time-sensitive: keys that have recently become hot are more likely to stay resident in memory, while keys that have not been used for a long time are evicted first.
LFU (least frequently used): evict the keys used least often. Redis keeps an access-frequency counter for each key; the smaller the counter, the higher the eviction priority.
LFU is frequency-sensitive and better suited to workloads with recurring hotspots. For example, if a key was last accessed at noon yesterday but received 200 high-frequency accesses within 5 minutes at that time, and the same demand appears again at noon today, then under LFU yesterday's hot key will still be resident in memory.
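Eviction behaviour is controlled by two configuration items, which can be set in redis.conf or at runtime; the values below are only examples, assuming a local instance.

```python
import redis

r = redis.Redis(host="localhost", port=6379)  # assumed local instance

# Cap memory and pick an eviction policy; allkeys-lru is a common choice for a
# pure cache, while the volatile-* policies only consider keys that have a TTL.
r.config_set("maxmemory", "256mb")
r.config_set("maxmemory-policy", "allkeys-lru")
print(r.config_get("maxmemory-policy"))
```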

Redis master-slave, slice cluster

14. What are the defects of Redis master-slave replication principle?

The Redis master-slave replication scheme uses several Redis servers, one master and multiple slaves, with read-write separation between them. The master server handles both reads and writes, and every write operation is automatically synchronized to the slave servers; the slave servers are usually read-only and simply apply the write commands received from the master (a minimal setup sketch follows the figures below).

insert image description here

Since the command replication between the master and slave servers is asynchronous, it is impossible to achieve strong consistency of master and slave data.

The principle of master-slave server data synchronization is as follows:

insert image description here
insert image description here
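A minimal sketch of a one-master-one-slave setup with redis-py, assuming two local instances on ports 6379 and 6380; it also shows that replication is asynchronous from the client's point of view.

```python
import redis

# Assumed addresses: one master on 6379 and one replica on 6380.
master = redis.Redis(host="127.0.0.1", port=6379)
replica = redis.Redis(host="127.0.0.1", port=6380)

# Make the second instance replicate the first (equivalent to "slaveof 127.0.0.1 6379").
# Writes go to the master and are propagated to the replica asynchronously.
replica.slaveof("127.0.0.1", 6379)

master.set("k", "v")
print(replica.get("k"))                      # may briefly return None: replication is async
print(replica.info("replication")["role"])   # reports the replica role
```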

15. How is the sentinel mechanism implemented in Redis master-slave mode?

Redis provides a Sentinel mechanism to realize the automatic failure recovery of the master-slave cluster. The main functions of the Sentinel include server status monitoring, automatic failure recovery, and notification .

  • Server Status Monitoring

    • The sentinels monitor server status via a heartbeat mechanism, sending a ping command to every instance in the cluster once per second. If a sentinel receives no response from an instance within the configured time, that sentinel considers the instance subjectively offline; when the configured quorum of sentinels (usually more than half of them) agree, the instance is marked objectively offline.
    • Once the master node is detected as objectively offline, the sentinel that first reported its subjective offline (and wins the leader election among the sentinels) selects one of the slaves as the new master. The selection criteria are:
      • how long the slave has been disconnected from the master: slaves disconnected for longer than the configured limit are excluded outright;
      • the slave's priority value: the smaller the value, the higher the priority;
      • the slave's run id: the smaller the id, the higher the priority.
  • Automatic failure recovery

    • The selected slave executes slaveof no one and becomes the new master.
    • All other slave nodes are notified to execute slaveof <new master> so that they replicate from it.
    • The configuration of the failed master node is updated so that, when it comes back online, it rejoins as a slave of the new master. (A small client-side usage sketch follows.)
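A small sketch of how a client connects through the sentinels rather than to a fixed address, based on redis-py's Sentinel support; the sentinel addresses and the master name "mymaster" are assumptions.

```python
from redis.sentinel import Sentinel

# Assumed sentinel addresses and master name; adjust to the actual deployment.
sentinel = Sentinel(
    [("127.0.0.1", 26379), ("127.0.0.1", 26380), ("127.0.0.1", 26381)],
    socket_timeout=0.5,
)

master = sentinel.master_for("mymaster", socket_timeout=0.5)  # always points at the current master
replica = sentinel.slave_for("mymaster", socket_timeout=0.5)  # read-only connection to a replica

master.set("k", "v")      # writes go to whichever node the sentinels currently consider master
print(replica.get("k"))
```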

16. Redis slice cluster (Redis Cluster)

When the amount of data Redis needs to cache is too large for a single server, the Redis slice cluster (Redis Cluster) scheme is used: the data is distributed across different servers, which reduces the system's dependence on a single master node and improves the read and write performance of the Redis service.

The Redis Cluster solution uses hash slots (Hash Slot) to handle the mapping relationship between data and nodes. In the Redis Cluster scheme, a slicing cluster has a total of 16384 hash slots. These hash slots are similar to data partitions, and each key-value pair is mapped to a hash slot according to its key.

How are these hash slots mapped to specific Redis nodes? There are two options:

  • Even distribution: When using the cluster create command to create a Redis cluster, Redis will automatically distribute all hash slots to the cluster nodes evenly. For example, if there are 9 nodes in the cluster, the number of slots on each node is 16384/9.
  • Manual allocation: You can use the cluster meet command to manually establish connections between nodes to form a cluster, and then use the cluster addslots command to specify the number of hash slots on each node.
    insert image description here
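A small sketch of how a key is mapped to one of the 16384 slots. The CRC16 variant below (XMODEM, polynomial 0x1021, initial value 0) matches the one documented for Redis Cluster, and the hash-tag handling is a simplified illustration.

```python
def crc16(data: bytes) -> int:
    """CRC16-XMODEM, the variant used by Redis Cluster to hash keys."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    # Keys containing a {...} hash tag are hashed only on the tag, so related keys
    # such as {user:1000}.name and {user:1000}.age land in the same slot.
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return crc16(key.encode()) % 16384

print(key_slot("user:1000"))                                       # slot for a plain key
print(key_slot("{user:1000}.name"), key_slot("{user:1000}.age"))   # same slot via hash tag
```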

Redis combat

Origin blog.csdn.net/baiduwaimai/article/details/129841075