Hello MySQL! I want to meet you again!!!

As a programmer, I believe 99.9% of us have been exposed to MySQL, whether at university or in a training course. Yet even after all this time, many of us still only know it superficially: we can write SQL at work and do some tuning at a passable level, but without a thorough understanding of how MySQL actually works. So I decided to get to know MySQL again.

MySQL query statement execution process

When we execute a query statement in Navicat or another database management tool, the database returns the corresponding data to us. So what steps does the database go through between receiving the statement and returning the data?
(Figure: the overall MySQL query execution flow, walked through below.)
From the moment you click "execute", Navicat first makes a connection request to the database. Once the connection is established, MySQL checks the query cache (query_cache) to see whether the result of the current SQL is already cached; if it is, the data is returned directly (note that the query cache was removed in MySQL 8.0). If not, the statement goes to the parser for lexical analysis and syntax analysis. After parsing, the preprocessor performs permission verification and semantic analysis, and the statement is then handed to the optimizer, which produces an execution plan. Finally, the executor follows the steps of that plan, calls into the storage engine, and returns the result to the client.
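If you want to see some of these stages for yourself, MySQL's session profiler breaks a statement down into steps such as checking permissions, opening tables, optimizing and executing. A minimal sketch (SHOW PROFILE is deprecated but still available; the table name userTable is just the example used below):

    -- Enable profiling for the current session
    SET profiling = 1;

    -- Run any statement, for example the query from the optimizer example below
    SELECT name FROM userTable WHERE name = 'silence';

    -- List the profiled statements and show the stages of the first one
    SHOW PROFILES;
    SHOW PROFILE FOR QUERY 1;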

Glossary:

  • Lexical analysis: put simply, it breaks the SQL statement into tokens. For example, select name from userTable; is broken into select, name, from and userTable, where select and from are keywords and name and userTable are not.

  • Syntax analysis: it checks whether the SQL conforms to the SQL grammar. For example, if from is mistyped as form, syntax analysis will hit the error and throw an exception. Syntax analysis also turns the SQL into a parse tree, similar to the one in the figure below.
    (Figure: an example parse tree.)

  • Semantic analysis: the parse tree produced by syntax analysis is checked for problems, for example whether a keyword is wrong or whether keywords appear in the wrong order.

  • Optimizer: the optimizer rewrites the SQL we write so that it uses indexes as much as possible and improves query efficiency. For example, suppose we have a composite index on (name, phone) and our SQL is select name from userTable where phone = '1562498497x' and name = 'silence'. The optimizer will adjust it to select name from userTable where name = 'silence' and phone = '1562498497x', so that the composite index is used when the SQL executes (see the EXPLAIN sketch after this list).

  • Execution plan: the execution plan is a key point. For a detailed explanation of each field, with examples, this blog post is very well written: https://www.cnblogs.com/yinjw/p/11864477.html
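A quick way to see the last two points in practice is EXPLAIN, which prints the plan MySQL chose for a statement. A small sketch, assuming a userTable with the composite index (name, phone) from the optimizer example (the table definition here is invented for illustration):

    -- Hypothetical table with a composite index on (name, phone)
    CREATE TABLE userTable (
        id    INT PRIMARY KEY AUTO_INCREMENT,
        name  VARCHAR(50),
        phone VARCHAR(20),
        KEY idx_name_phone (name, phone)
    );

    -- Even with the conditions written "backwards", the optimizer can still use the index;
    -- the key column of the EXPLAIN output should show idx_name_phone
    EXPLAIN SELECT name
    FROM userTable
    WHERE phone = '1562498497x' AND name = 'silence';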

Index considerations

Creating an index is an important means of improving SQL execution efficiency, but we need to keep the following points in mind when creating one:

  • More indexes are not always better: indexing every column in a table does not automatically improve execution efficiency, and every extra index adds write and storage overhead.
  • Do not create an index on a column with low selectivity (distinct rows / total rows).
  • Do not create indexes on columns whose values are essentially random or that are updated very frequently.
  • Avoid redundant indexes when creating composite indexes (see the sketch after this list).
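On the last point, a composite index already serves queries that filter on its leftmost column, so a separate single-column index on that column is usually redundant. A small sketch (index names are made up):

    -- idx_name is redundant: idx_name_phone already covers queries that filter on name alone
    CREATE INDEX idx_name_phone ON userTable (name, phone);
    CREATE INDEX idx_name       ON userTable (name);   -- usually just extra write/storage cost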

The following situations can cause an index to be ignored (a sketch follows the list):

  • A function, expression or operator is applied to the indexed column
  • Implicit type conversion on the indexed column
  • A LIKE condition that starts with %
  • Negative conditions such as <>, != or NOT IN
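A sketch of these patterns, assuming the same userTable with an index on name and a phone column stored as a string; each of the following can force a full scan instead of using the index:

    -- A function or expression applied to the indexed column
    SELECT * FROM userTable WHERE UPPER(name) = 'SILENCE';

    -- Implicit type conversion: phone is a string, comparing it with a number forces a cast
    SELECT * FROM userTable WHERE phone = 1562498497;

    -- A leading % in a LIKE pattern
    SELECT * FROM userTable WHERE name LIKE '%lence';

    -- Negative conditions such as <>, != or NOT IN
    SELECT * FROM userTable WHERE name <> 'silence';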

The roles of binLog, redoLog and undoLog

binLog

binLog is mainly used for master-slave replication; data synchronization in a MySQL cluster relies on the binLog. It records the statements that change data, and does not record statements such as select and show, because they change nothing.
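A few commands to check whether binary logging is enabled and what it contains; a sketch, assuming you have the required privileges on the server:

    -- Is binary logging on, and in what format?
    SHOW VARIABLES LIKE 'log_bin';
    SHOW VARIABLES LIKE 'binlog_format';   -- STATEMENT, ROW or MIXED

    -- List the binlog files the server currently keeps
    SHOW BINARY LOGS;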

redoLog

redoLog is mainly used for crash recovery. It records the changes made by transaction operations, i.e. the values after data is modified, and it is written regardless of whether the transaction has been committed; if the transaction is rolled back, the corresponding records are rolled back as well.
If the power suddenly fails or the system crashes, the InnoDB engine uses the redoLog to restore the data to the moment just before the failure, ensuring data integrity.
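The redoLog's behaviour is controlled by a handful of InnoDB settings; a sketch of how you might inspect them (these are standard variable names, some of which change in newer versions):

    -- Size and number of redo log files
    SHOW VARIABLES LIKE 'innodb_log_file_size';
    SHOW VARIABLES LIKE 'innodb_log_files_in_group';

    -- When the redo log is flushed to disk: 1 = flush at every commit (the safest setting)
    SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';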

undoLog

undoLog guarantees the atomicity of transactions. What it stores are historical versions of the data. In InnoDB, a newly inserted row is visible only to the current transaction before it commits and is invisible to other transactions; the insert record in the undoLog is deleted once the transaction commits, because newly inserted data has no earlier version. For update and delete operations, multiple versions are kept.

For example, transactions S1 and S2 access data A at the same time: S1 wants to change A to B, and S2 wants to read A. Before MVCC (multi-version concurrency control), this could only be handled by locking: whoever got the lock first executed first and the other waited, which becomes inefficient under high concurrency. With MVCC, the InnoDB engine lets a read see the row as it existed at the time the read started. If the row being read is in the middle of an update or delete, the read does not wait for the change to finish; instead it reads a snapshot built from the undoLog. When InnoDB processes a row, it first checks whether the row's version is newer than the current transaction's version; if it is, the row is read from the historical snapshot to keep the data consistent. The historical data lives in the undoLog, and the locks taken by the modification do not affect the undoLog, so reads of historical data are not delayed. This consistent non-locking read is what improves concurrent performance.
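The snapshot-read behaviour is easy to reproduce with two connections under InnoDB's default REPEATABLE READ level. A sketch, with an invented table t standing in for data A:

    -- Session S1: change A to B but do not commit yet
    START TRANSACTION;
    UPDATE t SET val = 'B' WHERE id = 1;   -- val was 'A'

    -- Session S2 (a separate connection): a plain SELECT is a snapshot read and does not block
    START TRANSACTION;
    SELECT val FROM t WHERE id = 1;        -- still returns 'A', served from the undoLog
    -- SELECT ... FOR UPDATE would be a locking read instead and would wait for S1 to commit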

MVCC

MVCC (Multi-Version Concurrency Control) is a method of concurrency control: instead of making readers wait for locks, each transaction reads a consistent version of the data, as described in the undoLog section above.

Four characteristics of transactions

  • Atomicity: either everything succeeds or everything fails. The transaction log undoLog is used to roll data back, which guarantees atomicity (see the sketch after this list).
  • Durability: once a transaction is committed, its changes are persisted to disk and will not simply disappear. This relies on the redoLog plus the doublewrite buffer.
  • Isolation: additions, deletions and modifications made by other transactions that have not committed are invisible to the current transaction.
  • Consistency: the data remains in a consistent, valid state before and after the transaction.
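A tiny sketch of atomicity in action, with an invented account table: either both updates take effect or neither does.

    START TRANSACTION;
    UPDATE account SET balance = balance - 100 WHERE id = 1;
    UPDATE account SET balance = balance + 100 WHERE id = 2;
    -- Any failure here, or an explicit ROLLBACK, undoes both updates via the undoLog
    ROLLBACK;   -- COMMIT would make both changes durable instead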

What are the problems caused by transaction concurrency? What are the four transaction isolation levels?

Dirty reads, non-repeatable reads and phantom reads can occur when transactions run concurrently.

  • Dirty read: while transaction A is still running, transaction B updates a row but has not committed; transaction A then queries again and sees B's uncommitted change. This is a dirty read.
  • Non-repeatable read: while transaction A is still running, transaction B modifies or deletes a row and commits; transaction A queries again and sees B's change, so the same query returns different results within one transaction. This is a non-repeatable read.
  • Phantom read: while transaction A is still running, transaction B inserts new rows and commits; transaction A queries again and sees the rows B inserted. This is a phantom read.

The four transaction isolation levels are: Read Uncommitted, Read Committed, Repeatable Read and Serializable.
The standard matrix of which phenomena each level allows:

    Isolation level      Dirty read    Non-repeatable read    Phantom read
    Read Uncommitted     possible      possible               possible
    Read Committed       prevented     possible               possible
    Repeatable Read      prevented     prevented              possible
    Serializable         prevented     prevented              prevented
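The isolation level can be changed per session; a small sketch of checking and switching it (MySQL 8.0 exposes it as transaction_isolation, older versions as tx_isolation):

    -- Check the current level (REPEATABLE READ is InnoDB's default)
    SELECT @@transaction_isolation;

    -- Switch the current session to READ COMMITTED
    SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;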

Origin: blog.csdn.net/nxw_tsp/article/details/108307772