"Mysql actual combat 45 lectures" reading notes

 

MySQL logical architecture
[Figure: MySQL basic architecture diagram]

  • Connector. Responsible for establishing the connection with the client, authenticating and authorizing it, and maintaining and managing the connection. If the user name and password are correct, the connector looks up the user's permissions in the permission table. Q: After user A logs in successfully, an administrator revokes A's login permission. Is A disconnected at that point? A: No. Permissions are checked when the connection is established, so the change does not affect connections that already exist. If A logs out and logs in again, the server reports an "Access denied for user" error.
  • Query cache. Executed SQL statements are cached in memory as key-value pairs: the key is the SQL statement and the value is its result. If the same statement is executed again, the cached result is returned directly to the client. Advantage: for a static table that is rarely updated, the cache hit rate is high, which saves query time. Disadvantage: any update to a table clears the cached results for that table, so the cache is not suitable for frequently updated tables. The usage is as follows:
mysql> select SQL_CACHE * from T where ID=10;
  • Analyzer. Lexical analysis: identify what each string in the SQL statement represents, such as keywords, table names, and field names. Syntax analysis: check that the statement follows SQL grammar. If the statement queries a field that does not exist, the error is reported at this stage.
  • Optimizer. Select the appropriate index; when multiple tables are joined, determine the driving table and driven table; select the optimal SQL execution plan.
  • Executor. First checks whether the user has the required permissions on the table (a query also calls precheck to verify permissions before the optimizer runs), then executes the SQL statement.
  • Storage engine. Comparison of InnoDB, MyISAM, and Memory:

Transactions
  InnoDB: supported
  MyISAM: not supported

Lock granularity
  InnoDB: row-level locks; supports high concurrency based on MVCC; implements all four transaction isolation levels
  MyISAM: no row locks; locks the entire table, taking a shared lock when reading a table and an exclusive lock when writing it

Data storage form
  Memory: the data is stored in memory; the table structure is stored on disk in a file with the same name as the table and the extension .frm

Index
  Memory: uses a hash index by default

MySQL logs

In MySQL, if every update had to be written to disk immediately, the disk would first have to locate the corresponding record and then update it, so both the I/O cost and the search cost of the whole process would be very high. To solve this problem, the designers of MySQL use the redo log to make updates more efficient.

WAL: write-ahead logging. Write the log first, then write to disk.

redo log concept

The redo log is specific to the InnoDB engine. When data is updated, InnoDB first writes the record to the redo log and updates memory; at that point the update counts as complete. InnoDB then writes the change to disk at an appropriate time.
The redo log has a fixed size; for example, it can be configured as a group of 4 files of 1 GB each. The files are written in a ring.

[Figure: redo log circular writing]

check_point is the current position to be erased (the corresponding data is written to disk before it is erased), and write_pos is the position where the next record is written. When write_pos catches up with check_point, the redo log is full: update operations have to pause while some redo log records are applied to disk and check_point moves forward. The redo log guarantees that previously written data is not lost after MySQL restarts abnormally; this ability is called crash-safe.
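The circular write described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class name `RedoLogRing`, the slot-based capacity, and the `data_file` list are hypothetical stand-ins, not InnoDB's actual on-disk format or API.

```python
class RedoLogRing:
    """Toy model of a fixed-size circular redo log (not InnoDB's real format)."""

    def __init__(self, capacity):
        self.capacity = capacity          # total number of record slots
        self.write_pos = 0                # count of records written so far
        self.check_point = 0              # count of records already flushed
        self.slots = [None] * capacity
        self.data_file = []               # stands in for the on-disk data pages

    def is_full(self):
        # write_pos has caught up with check_point after a full lap
        return self.write_pos - self.check_point == self.capacity

    def advance_checkpoint(self, n=1):
        """Apply up to n of the oldest records to the data file, freeing slots."""
        for _ in range(n):
            if self.check_point == self.write_pos:
                break                     # nothing left to flush
            idx = self.check_point % self.capacity
            self.data_file.append(self.slots[idx])
            self.slots[idx] = None
            self.check_point += 1

    def append(self, record):
        if self.is_full():
            # Updates stall here until some records have been flushed.
            self.advance_checkpoint()
        self.slots[self.write_pos % self.capacity] = record
        self.write_pos += 1


log = RedoLogRing(capacity=4)
for i in range(6):
    log.append(f"update-{i}")
```

With a capacity of 4, the fifth and sixth appends each force the oldest record to be flushed first, so the two oldest updates end up in `data_file` while the ring holds the four newest.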

binlog concept

The binlog is the archive log of the MySQL server layer and is independent of the storage engine.

Comparison of binlog and redo log:

Implemented by
  binlog: the MySQL server layer
  redo log: InnoDB only

Log type
  binlog: logical log; records the original logic of the statement, for example "add 1 to field a in the row with id 1"
  redo log: physical log; records "what modification was made on a certain data page"

Write pattern
  binlog: appended; earlier entries are not overwritten
  redo log: fixed space, written cyclically; earlier entries are overwritten

Update operation execution flow

  1. The engine fetches the row whose id is 2. If id is the primary key, the engine finds the row directly with a tree search. If the data page containing the row is already in memory, the engine returns the row to the executor; if not, it first reads the page from disk into memory and then returns the row.
  2. The executor updates the row and calls the engine interface to pass the new data to the storage engine.
  3. The engine updates the data in memory (the InnoDB buffer pool) and at the same time records the update in the redo log. At this point the redo log is in the prepare state. The engine then informs the executor.
  4. The executor writes the operation to the binlog.
  5. The executor calls the engine's transaction commit interface, and the engine changes the redo log state to commit.

[Figure: execution flow of an update statement]

To keep the redo log and the binlog consistent, a two-phase commit is used.
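To see why the prepare and commit states help, here is a minimal Python simulation of steps 3 to 5 with a crash injected at different points. The function names `run_update` and `recover` are hypothetical, and the recovery rule (a prepared transaction is kept only if its binlog entry exists) is the commonly described behaviour, stated here as an assumption because these notes do not spell it out.

```python
def run_update(txn_id, crash_after=None):
    """Run the commit steps, optionally 'crashing' after a named step."""
    redo_log = {}   # txn_id -> "prepare" or "commit"
    binlog = []     # txn_ids whose binlog entry has been written

    redo_log[txn_id] = "prepare"      # step 3: redo log written, prepare state
    if crash_after == "prepare":
        return redo_log, binlog
    binlog.append(txn_id)             # step 4: executor writes the binlog
    if crash_after == "binlog":
        return redo_log, binlog
    redo_log[txn_id] = "commit"       # step 5: engine marks the redo log committed
    return redo_log, binlog


def recover(redo_log, binlog):
    """Decide each transaction's fate after a restart (assumed rule)."""
    outcome = {}
    for txn_id, state in redo_log.items():
        if state == "commit" or txn_id in binlog:
            outcome[txn_id] = "committed"      # both logs agree the change exists
        else:
            outcome[txn_id] = "rolled back"    # binlog never saw it, so undo it
    return outcome


print(recover(*run_update(1, crash_after="prepare")))
print(recover(*run_update(1, crash_after="binlog")))
print(recover(*run_update(1)))
```

In every case the engine's data and the binlog end up describing the same set of changes, which is the consistency the two-phase commit is meant to provide.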

 

 

Transaction isolation

Transaction: a group of SQL statements that either all succeed or all fail.

Four properties of transactions (ACID): Atomicity, Consistency, Isolation, Durability

 

 

 

Why is MySQL's default isolation level repeatable read?

 

  • Dirty read: a transaction reads data that another transaction has written but not yet committed. For example, if one transaction modifies a value several times without committing and a concurrent transaction reads that value in between, the two transactions see inconsistent data.

  • Non-repeatable read: within one transaction, multiple queries for the same data return different values (the content of one or more rows differs, but the number of rows is the same), because another transaction modified and committed that data between the queries. The difference from a dirty read is that a dirty read sees data another transaction has not committed, while a non-repeatable read sees data another transaction has committed. Note that in some cases a non-repeatable read is not a problem.

  • Phantom read: a phenomenon that occurs when transactions are not executed in isolation. For example, transaction T1 changes a data item from "1" to "2" in every row of a table. Meanwhile, transaction T2 inserts a row into the table whose data item is still "1" and commits. When the user running T1 looks at the data just modified, one row appears not to have been modified; in fact it was added by T2, as if an illusion had occurred. Both phantom reads and non-repeatable reads see data committed by another transaction (unlike dirty reads); the difference is that non-repeatable reads are caused by update and delete operations, while phantom reads are caused by insert operations.
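The difference between a dirty read and a non-repeatable read can be illustrated with a toy single-row model. Everything here is an assumption made for illustration: `ToyRow`, `ToyReader`, and `observe` are hypothetical names, the level names are the standard SQL ones, and the snapshot logic is a simplification that does not reflect how InnoDB implements MVCC.

```python
class ToyRow:
    """One row with a committed value and an optional uncommitted change."""

    def __init__(self, value):
        self.committed = value
        self.uncommitted = None

    def write(self, value):
        self.uncommitted = value          # written by a transaction, not yet committed

    def commit(self):
        if self.uncommitted is not None:
            self.committed = self.uncommitted
            self.uncommitted = None


class ToyReader:
    """A reading transaction running at a given isolation level."""

    def __init__(self, row, level):
        self.row = row
        self.level = level
        self.snapshot = row.committed     # value visible when the transaction starts

    def read(self):
        if self.level == "READ UNCOMMITTED":
            if self.row.uncommitted is not None:
                return self.row.uncommitted   # dirty read
            return self.row.committed
        if self.level == "READ COMMITTED":
            return self.row.committed         # sees whatever is committed right now
        return self.snapshot                  # REPEATABLE READ: same value every time


def observe(level):
    """Read before a concurrent write, after the write, and after its commit."""
    row = ToyRow(1)
    reader = ToyReader(row, level)
    first = reader.read()
    row.write(2)
    during = reader.read()
    row.commit()
    after = reader.read()
    return first, during, after


for level in ("READ UNCOMMITTED", "READ COMMITTED", "REPEATABLE READ"):
    print(level, observe(level))
```

The middle value changing before the commit is a dirty read; the last value differing from the first is a non-repeatable read. Phantom reads involve inserted rows and are not covered by this single-row model.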

 

Index

The index is a data structure used in the storage engine to quickly find records. Common data structures used for indexing include hash tables, ordered arrays, and search trees.

Hash table

Apply a hash function to the key being looked up to compute the position where its value is stored. If different keys hash to the same position, a linked list is created at that position to hold all of the colliding entries.

Advantages: inserts, deletes, and updates are fast, and it is well suited to equality lookups.

Disadvantages: the keys are unordered, so a range search requires a full scan.
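A minimal sketch of such a hash index, assuming a fixed number of buckets and Python lists standing in for the linked lists. The class name `ChainedHashIndex` and its methods are hypothetical.

```python
class ChainedHashIndex:
    """Hash index with separate chaining; each bucket is a list of (key, value)."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)   # overwrite an existing key
                return
        bucket.append((key, value))        # new key: append to the chain

    def get(self, key):
        # Equality lookup only touches one bucket.
        for k, v in self._bucket(key):
            if k == key:
                return v
        return None

    def range(self, low, high):
        # Keys are unordered, so every bucket has to be scanned.
        hits = [(k, v) for bucket in self.buckets for k, v in bucket
                if low <= k <= high]
        return sorted(hits)


idx = ChainedHashIndex()
idx.put(1, "a")
idx.put(9, "i")   # 1 and 9 land in the same bucket when there are 8 buckets
idx.put(5, "e")
print(idx.get(9), idx.range(1, 5))
```

The `get` call inspects a single chain, while `range` walks all buckets, which is the trade-off described above.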

Ordered array

Advantages: suitable for both equality lookups and range queries.

To find a specific value at a known position, the array subscript gives O(1) access; to find a value by key in an ordered array, binary search takes O(log(n)).

To find the values in a range, use binary search to locate the left end of the range, then scan to the right until a value exceeds the right end of the range.

This makes ordered arrays suitable for storage engines that hold static data.

Disadvantages: inserting and deleting data is expensive, because the records after the affected position have to be moved. The best case is O(1), the worst case is O(n), and the average complexity is O(n).
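The lookups described above map directly onto Python's standard `bisect` module. The sample list `ids` and the helper names `find` and `find_range` are made up for illustration.

```python
import bisect

ids = [10, 20, 30, 40, 50]   # an ordered array of keys


def find(sorted_keys, key):
    """Equality lookup by binary search; returns the position or -1."""
    i = bisect.bisect_left(sorted_keys, key)
    if i < len(sorted_keys) and sorted_keys[i] == key:
        return i
    return -1


def find_range(sorted_keys, low, high):
    """Binary-search the left end, then scan right until a key exceeds high."""
    start = bisect.bisect_left(sorted_keys, low)
    result = []
    for k in sorted_keys[start:]:
        if k > high:
            break
        result.append(k)
    return result


# Inserting keeps the order but shifts every later element: O(n) in general.
bisect.insort(ids, 25)
print(ids, find(ids, 30), find_range(ids, 20, 40))
```

Searching is cheap, but `insort` has to move the elements after the insertion point, which is why the structure suits data that rarely changes.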

Search tree

Let's first discuss the basic binary search tree

Refer to this blog for detailed information about the binary search tree.

The defining property of a binary search tree is that every node in a left subtree is smaller than its parent, and the parent is smaller than every node in its right subtree. The search time complexity is O(log(n)) when the tree is balanced.

In practice, database indexes hardly ever use a binary search tree, because when the data volume is large, keeping the whole index in memory is too costly, so most of it lives on disk. For example, a balanced binary tree holding 1 million rows is about 20 levels high, so accessing one row may require reading 20 data blocks from disk. On a mechanical hard disk, addressing a data block takes about 10 ms, so finding one row among 1 million takes about 20 * 10 ms = 200 ms, which is very slow.

To reduce the number of data blocks read, the height of the tree must be reduced, so the binary tree is replaced by an N-ary tree. The data blocks near the root of the tree are kept in memory.
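The effect of the fan-out N on tree height can be checked with a little arithmetic. In this sketch the helper names are hypothetical, the 10 ms seek time is the figure quoted above, and the fan-out of 1200 is an assumed example value, not something these notes state.

```python
def tree_height(num_keys, fanout):
    """Smallest number of levels such that fanout**levels >= num_keys."""
    levels, capacity = 0, 1
    while capacity < num_keys:
        capacity *= fanout
        levels += 1
    return levels


def lookup_cost_ms(num_keys, fanout, seek_ms=10):
    """Worst-case lookup cost if every level needs one disk seek."""
    return tree_height(num_keys, fanout) * seek_ms


rows = 1_000_000
print(tree_height(rows, 2), lookup_cost_ms(rows, 2))        # binary tree
print(tree_height(rows, 1200), lookup_cost_ms(rows, 1200))  # assumed fan-out of 1200
```

Integer multiplication is used instead of a floating-point logarithm so the level count is exact. Keeping the top levels in memory reduces the number of disk reads further.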

InnoDB's index model

A table is stored as an index ordered by its primary key. A table stored this way is called an index-organized table.

Primary key index (called a clustered index in InnoDB): the leaf nodes of the primary key index store the entire row. A lookup by primary key searches the B+ tree and gets the row directly.

Non-primary key index (called a secondary index in InnoDB): the leaf nodes store the value of the primary key. A lookup through a secondary index first gets the primary key value and then fetches the row by primary key (this second step is called going back to the table). Queries should therefore use the primary key index where possible.
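The extra step of going back to the table can be shown with two dictionaries standing in for the two B+ trees. The table contents, the `name` column, and the function names are all hypothetical; real indexes are ordered trees, not hash maps.

```python
# Clustered (primary key) index: primary key -> entire row.
clustered = {
    1: {"id": 1, "name": "ann", "age": 30},
    2: {"id": 2, "name": "bob", "age": 25},
}

# Secondary index on the hypothetical "name" column: name -> primary key.
secondary_name = {"ann": 1, "bob": 2}


def by_primary_key(pk):
    """One index search: the leaf already holds the whole row."""
    return clustered.get(pk), 1


def by_name(name):
    """Two index searches: secondary index first, then back to the table."""
    pk = secondary_name.get(name)
    if pk is None:
        return None, 1
    return clustered[pk], 2


print(by_primary_key(2))
print(by_name("bob"))
```

Both calls return the same row, but the second value in each result shows that the secondary-index path needed one more index search.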

 

 

Global locks and table locks

Global lock: adds a read lock to the entire database. It is suited to a full backup of the database, for example on a standby instance.

 

Table-level lock:

Read-write lock:

Metadata lock (MDL): when a table's data is queried, a read lock is taken on the table's metadata.

 

 

 

 

 

 

 

 

 

 

 

Storage engine comparison

https://www.cnblogs.com/kevingrace/p/5685355.html

The difference between binlog and redo log

https://juejin.im/post/5db191cf5188254b7b003f12


Origin blog.csdn.net/hebaojing/article/details/103403350