MySQL underlying principles: passing on what I've learned

The more you learn, the less you know.

1. Introduction

The database is where data is stored, and there are several storage formats. At the bottom layer, the default is 16 KB page storage; one internal node of the B+ tree can reference many child pages, and how many depends on how large the keys are.
Important: the bottom layer organizes data pages in a B+ tree (understanding: a data page is located by walking the B+ tree), and the rows within a page are chained together in a linked list (understanding: for a long variable-length column value, a 768-byte prefix stays in the row and serves the index, while the rest is placed elsewhere in overflow pages). Storage is
divided into data blocks and index blocks.
When a table is created, the database builds a B+ tree index on the primary key by default: all row data is placed in the leaf nodes, and the primary key values form the intermediate branches.
When a secondary index is created, the leaf nodes of its B+ tree are placed in the index block, holding the indexed column's value together with the primary key.

Database storage engines (the underlying engines):

  • InnoDB: supports transactions and is MySQL's default engine; read and write speed is decent, and the primary key is a clustered index by default.
  • MyISAM: does not support transactions; writes are slower but reads are fast, and its indexes are non-clustered by default. Although it also stores data in B+ trees, inserting values that are not sequential is slow
    (is that because the primary key is not auto-incremented, so leaf-node splits are more frequent and the cost of inserting is higher?).

MySQL architecture:
(Figure: MySQL architecture diagram, not reproduced here)

2. Caches

When a request reaches the database, MySQL first consults its caches. These are divided as follows:

  • Buffer pool: InnoDB's in-memory cache. When the client accesses data-layer and index-layer pages on the hard disk, InnoDB puts those pages in this buffer. Pages are evicted with an LRU algorithm whose young and old sublists are split at roughly a 7:3 ratio, with old pages eliminated first.
  • Change buffer: data blocks and index blocks live in the same .ibd file, and secondary indexes are not updated in order, so inserts and updates hit random positions, making immediate index-page updates expensive. Instead, the changes are cached in the buffer pool, and the accumulated index values are merged into the index blocks when the system is idle. This avoids generating a large number of random disk I/Os just to fetch ordinary index pages.
  • Log buffer: holds log records waiting to be written to disk. The flushing behavior has several levels, set with innodb_flush_log_at_trx_commit:
    0. After a transaction commits, the record is only written to the log buffer; it is flushed to the OS cache and then to disk once per second, so unflushed commits may be lost.
    1. The log is flushed to disk at every transaction commit.
    2. After a transaction commits, the record is written to the OS page cache and then flushed to disk once per second.
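As a sketch, the flush policy above can be inspected and changed at runtime (standard variable names, assuming sufficient privileges; verify the trade-offs against your version's documentation):

```sql
-- Show the current redo-log flush policy (0, 1, or 2).
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';

-- 1 (the default) flushes the log to disk on every commit: safest, slowest.
SET GLOBAL innodb_flush_log_at_trx_commit = 1;

-- 2 writes to the OS page cache on commit and flushes once per second:
-- a MySQL crash loses nothing, but an OS crash can lose ~1 second of commits.
SET GLOBAL innodb_flush_log_at_trx_commit = 2;
```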

3. Indexes (they make it faster to find data stored on the hard disk):

3.1 Index storage structure:

  • Hash table: the key characteristic of hashing is that lookups are very fast, trading space for time. A hash index on a column is convenient for equality lookups (field = value), but because entries are not stored in order, range queries cannot use the index.
  • B-tree: a multi-way search tree kept balanced by a rebalancing algorithm; every node, internal or leaf, stores keys together with data.
  • B+ tree: a balanced multi-way tree searched with binary-search-style comparisons at each node, which is very fast. Its properties:
1. Internal nodes only hold references to their child nodes and store no row data; all data lives in the leaf nodes.
2. Leaf nodes are linked together in a list, which makes range queries very fast.
3. Every internal key also appears in a child node, as the maximum or minimum key of that child.

3.2 Index type:

  • Unique index: a column with a unique index cannot contain duplicate values; every value is unique.
  • Primary key index: a table can have only one primary key index; its values cannot repeat and cannot be NULL.
  • Composite (joint) index: several columns combined into one index. An index on three columns (a, b, c) effectively also covers (a) and (a, b). It follows the leftmost-prefix principle: a query filtering only on b does not use the index, but a query filtering on a does.
  • Clustered index: in InnoDB this is in fact the primary key index above (the difference from a secondary index is that it is stored with the data blocks). Because there can be only one, the primary key index is the clustered index by default, and its leaf nodes hold all the columns of the row along with the primary key.
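The index types above can be illustrated with a small sketch (the table and column names are made up for the example):

```sql
-- Hypothetical table: the PRIMARY KEY becomes the clustered index,
-- so the full row is stored in that B+ tree's leaf nodes.
CREATE TABLE user (
  id     INT NOT NULL AUTO_INCREMENT,
  name   VARCHAR(50),
  age    INT,
  height INT,
  email  VARCHAR(100),
  PRIMARY KEY (id),              -- primary key index = clustered index in InnoDB
  UNIQUE KEY uk_email (email),   -- unique index: duplicate emails are rejected
  KEY idx_name_age (name, age)   -- composite (joint) index
);

-- Uses idx_name_age (filters on the leftmost column, name):
SELECT id FROM user WHERE name = 'a';
-- Cannot use idx_name_age (skips the leftmost column):
SELECT id FROM user WHERE age = 20;
```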

3.3 Search principle:

select * from user where name = '' and age > 15 and age < 30;
 Hash table: hash name and look it up directly.
 B-tree: find the entries where age equals 15, then traverse in order to the end of the range, which is time-consuming.
 B+ tree: find the leaf node where age equals 15, then follow the leaf linked list directly to the upper bound.
 Table user(id, name, age, height), with id as the primary key index and (name, age) as a composite index.
 select height from user where name = '张三';
 Here the condition column name is indexed, so the index block is searched for the matching value. The leaf node yields name and id, but not height, so a back-to-table lookup follows: using id, the clustered index in the data block is walked, the full row is found, and height is returned.
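The back-to-table behavior can be checked with EXPLAIN (a sketch against the hypothetical user table above; the exact Extra output depends on the MySQL version):

```sql
-- name and id are both in the (name, age) secondary index, so this is a
-- covering index query: EXPLAIN typically shows "Using index".
EXPLAIN SELECT id, age FROM user WHERE name = '张三';

-- height is not in the secondary index: after finding the entry, the engine
-- takes id back to the clustered index to fetch the row (no "Using index").
EXPLAIN SELECT height FROM user WHERE name = '张三';
```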

4. Storage

Description of the file extensions involved in storage:

  • .frm: after a table is created, the table's structure information is stored here.
  • .ibd: after a table is created, the table's data and indexes are all stored here.
  • ibdata: the file backing the system tablespace.

4.1 Table space:

  • System tablespace: in older versions, all data files were placed in the system tablespace.
  • Independent tablespace (file-per-table): the default in recent versions; every user-created table stores its data in its own .ibd file.
    • The disadvantage is that space allocated to one table's file cannot be shared by other tables, which may waste space.
  • General tablespace: a shared tablespace that can store the data of multiple tables.
  • Undo tablespace: as the name implies, where undo logs are placed. Undo logs can be stored in one or more undo tablespaces instead of the system tablespace.
  • Temporary tablespace: created when the service starts and destroyed when it shuts down. It is generally used for temporary tables created during UNION and similar queries.
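A sketch of checking which tablespace new tables go to, and of creating a general tablespace (standard syntax for MySQL 5.7+; the tablespace name and file are hypothetical):

```sql
-- ON (the default in recent versions) means each new table gets its own .ibd file.
SHOW VARIABLES LIKE 'innodb_file_per_table';

-- A general tablespace shared by multiple tables:
CREATE TABLESPACE shared_ts ADD DATAFILE 'shared_ts.ibd' ENGINE=InnoDB;
CREATE TABLE t1 (id INT PRIMARY KEY) TABLESPACE shared_ts;
```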

4.2 .ibd file structure

(Figure: mysql database storage.png; external image unavailable)

  • Page storage: a page is smaller than a segment or an extent and is the smallest unit of MySQL storage. The specific layout can be seen in the figure; the middle of the page holds many rows.
  • Row storage: row data is placed in the middle of the page. Rows have row formats such as COMPACT and DYNAMIC. A
    row can be at most 65,535 bytes; if the character set is UTF-8, one character takes up to three bytes, so the limit in characters is correspondingly lower.
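The row format of a table can be inspected or chosen explicitly (a sketch; the table names are made up, and DYNAMIC is the default in recent MySQL versions):

```sql
-- See which row format a table currently uses (Row_format column).
SHOW TABLE STATUS LIKE 'user';

-- Choose the row format explicitly at creation time.
CREATE TABLE t_dynamic (id INT PRIMARY KEY, doc TEXT) ROW_FORMAT=DYNAMIC;
```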

5. Logs

Logs are a very important part of MySQL. To understand them, first keep the server layer and the storage engine layer of the architecture above distinct. When a row is updated, the logging process goes like this:

1. The executor asks the engine for the row; if it is in memory it is returned directly, otherwise it is read from disk and then returned.
2. After the executor gets the row, it modifies the data and then calls the engine interface to write the new row.
3. The engine updates the row in memory and at the same time writes the change to the redo log; the record is now in the prepare stage, and the engine notifies the executor that it can commit at any time.
4. The executor generates the binlog for this operation and calls the engine's transaction commit interface.
5. The engine changes the redo log record it just wrote from the prepare stage to the commit stage, and the update is complete.

5.1 Redo log: the storage engine's log file

Plainly: to improve performance, modifications do not hit the disk one by one (that would be slow); they go into a buffer, and a background process persists the redo log to disk so that faults can be repaired.
It is mainly used for data recovery: every modification and deletion is recorded here. The files are written circularly, tracked by two pointers (the write position and the checkpoint). The redo log reaches disk before the binlog commits, so on failure recovery it can be consulted to decide what to replay; recovering transactions whose binlog was never written is handled by the two-phase commit described above.

  • The redo log buffer is an in-memory structure, not a disk structure.
  • The redo log files are on disk, e.g. ib_logfile0 and ib_logfile1.

5.2 binlog: the server-layer log file

Plainly: all logical changes are written to a binary file in append-only form. It records every change (the updating and deleting SQL, with timestamps) and is mainly used for:

  1. Master-slave synchronization: the master server sends the binlog to the slave servers.
  2. Replaying on the slaves; it can also be used for point-in-time restore.
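A sketch of checking binlog state on the master (standard commands; the output depends on the server's configuration):

```sql
-- Is binary logging enabled?
SHOW VARIABLES LIKE 'log_bin';

-- List the binlog files, and show the current file and write position.
SHOW BINARY LOGS;
SHOW MASTER STATUS;
```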

5.3 undo log: the storage engine's rollback log

Plainly: the pre-modification data and a version number are stored alongside the row so that if an error occurs, the change can be rolled back directly.
While a transaction is uncommitted, each SQL statement generates an undo log record, referenced from the row's DATA_ROLL_PTR. A delete is recorded directly in the row header by marking the row as deleted. When the
transaction commits, the undo log can be purged.

5.4 What is the difference between the redo log and the binlog?

1. The redo log is specific to the InnoDB engine, while the binlog is implemented at the MySQL server layer, so all engines can use it.
2. Redo log files are written circularly and their space is reused, while the binlog is appended to and never overwrites earlier logs.
3. The redo log is a physical log, recording what operations were done to a certain data page; the binlog is a logical log, recording the original logic of the statement.

6. Transactions

Transactions are implemented through logs and locks. The most important property is consistency; the other three exist to guarantee it.

6.1 Properties of a transaction

  • Atomicity: a group of operations either all succeed or all fail together. It is implemented with the undo log, which rolls back the changes on failure.
  • Isolation: four isolation levels are defined as a trade-off between reliability and performance; the default is repeatable read. It is implemented with locks (and MVCC) so that data is not changed by other transactions mid-flight.
  • Durability: once a transaction is committed it is saved permanently, even across power failure. It is implemented through the redo log, which is synchronized to disk on commit.
  • Consistency: a transaction always moves the database from one consistent state to another; for example, if A+B is 5000 at the start, it is still 5000 at the end no matter how the transfer is made. It is achieved by atomicity, isolation, and durability together.

6.2 Isolation level of transactions

As described for the undo log above, version records accumulate on rows while a transaction is uncommitted; when multiple transactions operate concurrently, several isolation levels are possible:

  • Read uncommitted: a transaction can read data written by another transaction that has not yet committed.
  • Read committed: a transaction reads only data that other transactions have already committed.
  • Repeatable read: while this transaction is open, the data it reads stays stable; it does not see other transactions' modifications (generally implemented with row locks and MVCC snapshots).
  • Serializable: no matter how many transactions there are, they execute one after another; each transaction's statements fully complete before the next transaction runs.
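The level can be set per session or globally (a sketch using standard syntax; the variable is named tx_isolation before MySQL 5.7.20 and transaction_isolation afterwards):

```sql
-- Check the current isolation level.
SELECT @@transaction_isolation;

-- Set the level for subsequent transactions in this session.
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
```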

6.3 Transaction lock

  • Shared lock (read lock): append lock in share mode to the SQL. After this lock is taken, other transactions can still take shared locks and read, but exclusive locks are blocked.
  • Exclusive lock (write lock): append for update to the SQL. After an exclusive lock is taken, other transactions can use neither read locks nor write locks.
    Exclusive locks are divided into row locks and table locks: a row lock locks just that row, while a table lock blocks reads and writes by other users on the entire table.
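A sketch of the two locking reads against the hypothetical user table (MySQL 8.0 also accepts FOR SHARE in place of LOCK IN SHARE MODE):

```sql
START TRANSACTION;
-- Shared lock: other transactions may also read and share-lock this row,
-- but cannot take an exclusive lock until we commit.
SELECT * FROM user WHERE id = 1 LOCK IN SHARE MODE;
COMMIT;

START TRANSACTION;
-- Exclusive lock: other transactions can take neither lock on this row.
SELECT * FROM user WHERE id = 1 FOR UPDATE;
UPDATE user SET age = 30 WHERE id = 1;
COMMIT;
```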

Q: How is repeatable read achieved?

Repeatable read means the data read multiple times within one transaction is the same. There are two ways to achieve it: through read locks, or through MVCC.

With a read lock, why is it repeatable? As long as the read lock is not released, the data read the first time is still unchanged at the second read.

  • Advantage: simple to implement

  • Disadvantage: reading and writing cannot proceed in parallel

With MVCC, why is it repeatable? Because multiple reads in the transaction see a single snapshot version, the same data is naturally read each time.

  • Advantage: reading and writing proceed in parallel

  • Disadvantage: higher implementation complexity

Q: What if two transactions operate on the same record?

​ It depends on whether they read or write. If both read, both transactions see the snapshot taken before any change. If one writes, that row is locked, and the other transaction can operate on it only after the first one commits.

Q: Walk through how a statement such as update t set a=10 where id=1; executes.

First the client connects to MySQL through the connector, which verifies the username and password. The query cache is checked for id = 1; on a miss nothing is returned from it (pages read by earlier queries sit in the buffer pool). The statement then passes through the analyzer, the optimizer, and the executor, which interacts with the underlying engine. InnoDB first looks up the row with id = 1; once found, a is set to 10, the write interface updates the page in memory and records the change in the redo log, a record is written to the binlog, and finally the transaction commit interface is called so the change is durably written to disk.

Q: Why not use select * queries?

First, select * fetches a lot of unnecessary data, congesting the network and wasting I/O. Second, MySQL reads in 16 KB pages while the operating system reads in 4 KB blocks, so one page may take several system reads, and fetching every column only makes this worse.

Q: What is the difference between varchar and char? Why varchar(255), and what is the maximum length of varchar()?

​ Maximum VARCHAR length = (maximum row size − bytes occupied by the NULL flags − bytes of the length prefix) / maximum bytes per character in the character set.
​ varchar is a variable-length string: values are not padded, and the length is stored in a one- or two-byte prefix. char is a fixed-length string: values shorter than the declared length are padded with spaces.
The maximum length of each row is 65,535 bytes.
If the COMPACT row format is used, a 768-byte prefix of a long value stays in the row to serve the index, and the rest must be placed in overflow pages.
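The 65,535-byte row limit can be seen directly (a sketch; the exact ceiling shifts slightly with NULL-ability, the length prefix, and other columns in the table):

```sql
-- utf8 uses up to 3 bytes per character, so 65,535 / 3 ≈ 21,845 is the ceiling.
-- This is commonly cited as the largest single utf8 varchar column:
CREATE TABLE t_ok  (v VARCHAR(21844)) CHARSET=utf8;
-- A slightly larger declaration fails with a "row size too large" /
-- "column length too big" error:
CREATE TABLE t_bad (v VARCHAR(21846)) CHARSET=utf8;
```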

Q: How does MySQL store large fields? Why does varchar() often use 255?

​ In the COMPACT row format, the first 768 bytes of a large field are placed on the index-page record, and a 20-byte pointer references the remaining data stored elsewhere.
With a 3-byte character set such as utf8, 768 / 3 = 256 characters fit in that prefix.
The second argument is that a value longer than 255 characters needs two bytes instead of one to store its length, so varchar(255) keeps the length prefix to a single byte and saves a little memory.

Q: Do you know what a covering index and a back-to-table lookup are?

​ When all the columns a query needs are present in the index it uses, that index is called a covering index, and no back-to-table lookup is required. If some column is missing, the engine takes the primary key from the index entry and goes through the clustered index to fetch the full row; that second lookup is the back-to-table query.

Q: What do you know about MySQL locks?

​ Read locks (shared) and write locks (exclusive), at row and table granularity, as described in section 6.3 above.

Q: How big is your data? How do you split databases and tables?

​ When the amount of data in a table grows large, splitting is considered. Tables are generally split horizontally by time period or vertically by fields (moving some large columns into another table). Queries that cross tables are combined through an intermediate table.

Q: How do you guarantee unique ids after splitting tables?

Ids are generally guaranteed unique by a distributed id service, for example the snowflake algorithm; alternatively, each table can use a naturally unique field, such as the order number, as the primary key.

Q: Why does MyISAM read faster than InnoDB?

​ For a non-primary-key lookup, InnoDB finds the index entry and then needs a back-to-table lookup, while a MyISAM leaf node stores the row's offset directly, so following the physical address is faster. InnoDB also supports transactions and maintains MVCC, which is another source of overhead.

Q: Which is faster: count(1), count(*), count(id), or count(field)?

count(1) and count(*) are equivalent: both simply count rows. count(id) reads the id column and counts the non-NULL values; count(field) reads that field and counts its non-NULL values.
Rough efficiency comparison: count(field) < count(primary key) < count(1) ≈ count(*).
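A sketch of the four forms against a hypothetical orders table:

```sql
SELECT COUNT(*)        FROM orders;  -- counts every row
SELECT COUNT(1)        FROM orders;  -- same result and cost as COUNT(*)
SELECT COUNT(id)       FROM orders;  -- counts rows where id IS NOT NULL
SELECT COUNT(discount) FROM orders;  -- counts rows where discount IS NOT NULL
```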

Q: Are more indexes always better? Are there guidelines for indexing?

No: with too many indexes, both queries and updates slow down, because every write must maintain every index. If primary key values are not sequential, pages may split and entries may be rearranged on each insert. As a rule of thumb, a table should generally not exceed about eight indexes.


Origin blog.csdn.net/hgdzw/article/details/113111945