MySQL InnoDB Storage Engine: Architecture and Engine Introduction

 

 

MySQL architecture diagram

1 Connectors: the client libraries through which programs written in different languages send SQL to MySQL

 

2 Management Services & Utilities: System management and control tools

 

3 Connection Pool: the connection pool.

Manages user connections, thread handling and other resources that need to be cached

 

4 SQL Interface: the SQL interface.

Accepts the user's SQL commands and returns the results of the query. For example, a SELECT ... FROM statement goes through the SQL Interface

 

5 Parser: the parser.

SQL commands passed to the parser are validated and parsed. The parser is implemented with Lex/YACC-style tooling, and its grammar file is long.

Main functions:

a. Decompose the SQL statement into a data structure and pass that structure on to the following steps. All later handling of the SQL statement is based on this structure

b. If an error is encountered during decomposition, the SQL statement is invalid

 

6 Optimizer: the query optimizer.

Before a query runs, the query optimizer optimizes the SQL statement. It uses a "select-project-join" strategy.

An example makes this clearer: select uid,name from user where gender = 1;

The query first applies the selection in the where clause, rather than reading every row of the table and filtering on gender afterwards

The query also projects onto the uid and name attributes first, rather than fetching all attributes and discarding the unneeded ones afterwards

The selection and the projection are then joined to produce the final query result
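
The select-then-project idea can be sketched in a few lines of Python. This is only an illustration over an in-memory list: the rows and the helper name select_project are made up here, and it says nothing about how the MySQL optimizer is actually implemented.

```python
# Toy in-memory "user" table; the rows are made-up illustration data.
user = [
    {"uid": 1, "name": "alice", "gender": 1, "age": 30},
    {"uid": 2, "name": "bob", "gender": 0, "age": 25},
    {"uid": 3, "name": "carol", "gender": 1, "age": 41},
]


def select_project(rows, predicate, columns):
    """Filter rows first (selection), then keep only the wanted columns (projection)."""
    selected = (row for row in rows if predicate(row))
    return [{col: row[col] for col in columns} for row in selected]


# Rough equivalent of: select uid,name from user where gender = 1;
result = select_project(user, lambda row: row["gender"] == 1, ["uid", "name"])
print(result)
```

Rows that fail the where condition are dropped before any columns are copied, which is the point of filtering first.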

 

7 Cache and Buffer: the query cache and buffers.

If the query cache holds a result for the query, the statement can fetch its data directly from the query cache. (The query cache itself was removed in MySQL 8.0.)

The caching mechanism is made up of a series of smaller caches, such as the table cache, record cache, key cache and privilege cache.

 

8 Engine: the storage engine.

The storage engine is the subsystem in MySQL that actually deals with files, and it is one of MySQL's most distinctive features.

MySQL's storage engines are pluggable. Each one implements its own file access mechanism on top of the abstract file-access-layer interface provided by MySQL AB (this access mechanism is what is called a storage engine)

There are now many storage engines, each with different strengths. The best known are MyISAM, InnoDB and, historically, BDB

Before MySQL 5.5 the default engine was MyISAM, which offers fast queries, good index optimization and data compression but does not support transactions. Since MySQL 5.5 the default engine is InnoDB.

InnoDB supports transactions and row-level locking and is widely used.

MySQL also lets you plug in custom storage engines, and different tables in the same database may use different storage engines.
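
The plug-in idea can be pictured as one abstract interface with several implementations. The sketch below is purely illustrative: StorageEngine, ToyMyISAM and ToyInnoDB are hypothetical names invented for this example and do not correspond to the real MySQL handler API.

```python
from abc import ABC, abstractmethod


class StorageEngine(ABC):
    """Hypothetical stand-in for the abstract file-access interface."""

    supports_transactions = False

    @abstractmethod
    def write_row(self, row):
        """Store one row."""

    @abstractmethod
    def read_rows(self):
        """Return all stored rows."""


class ListBackedEngine(StorageEngine):
    """Shared toy storage: rows are simply kept in a Python list."""

    def __init__(self):
        self.rows = []

    def write_row(self, row):
        self.rows.append(row)

    def read_rows(self):
        return list(self.rows)


class ToyMyISAM(ListBackedEngine):
    supports_transactions = False


class ToyInnoDB(ListBackedEngine):
    supports_transactions = True


# Different tables in the same database can be backed by different engines.
tables = {"access_log": ToyMyISAM(), "orders": ToyInnoDB()}
tables["orders"].write_row({"id": 1, "amount": 10})
print({name: engine.supports_transactions for name, engine in tables.items()})
```

The code above the engines only ever talks to the StorageEngine interface, which is what allows engines to be swapped per table.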

 

 

 

InnoDB Architecture Diagram


 

Introduction to background threads:

1. Master Thread: the core background thread. It is mainly responsible for asynchronously flushing data in the buffer pool to disk to keep data consistent, including flushing dirty pages, merging the insert buffer (INSERT BUFFER) and reclaiming undo pages (UNDO PAGE).

2. IO Thread: InnoDB makes extensive use of AIO (async IO) to handle IO requests, which greatly improves database performance. The IO threads (insert buffer thread, log thread, read thread, write thread) mainly handle the callbacks for these IO requests

3. There can be multiple buffer pool instances (innodb_buffer_pool_instances, 8 in the configuration listed later in this article). Multiple instances increase the database's concurrent processing capacity

4. The buffer pool uses a modified LRU algorithm. A newly read page is not placed at the head of the LRU list but at the midpoint, about 3/8 of the way from the tail. The position is set by innodb_old_blocks_pct, which defaults to 37, meaning that a newly read page is inserted 37% from the tail of the LRU list

Some operations (such as full scans) touch a page only once. If such pages were placed at the LRU head every time, they would push out the genuinely hot pages

A page is moved to the LRU head only if it is accessed again after it has stayed in the list for a certain time, and that time is controlled by innodb_old_blocks_time (in milliseconds)
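
The midpoint insertion described in item 4 can be sketched as follows. This is a simplified model, not InnoDB's implementation: the class name MidpointLRU is made up, the list is a plain Python list, and any repeat access after the waiting time moves the page straight to the head.

```python
class MidpointLRU:
    """Toy LRU list with midpoint insertion; index 0 is the head (hottest page)."""

    def __init__(self, capacity, old_blocks_pct=37, old_blocks_time_ms=1000):
        self.capacity = capacity
        self.old_blocks_pct = old_blocks_pct
        self.old_blocks_time_ms = old_blocks_time_ms
        self.pages = []
        self.first_read_ms = {}  # page id -> time the page was first read in

    def _midpoint(self):
        # The old sublist is the last old_blocks_pct percent of the list.
        return len(self.pages) * (100 - self.old_blocks_pct) // 100

    def access(self, page, now_ms):
        if page in self.first_read_ms:
            # Promote only if the page has been in the list long enough.
            if now_ms - self.first_read_ms[page] >= self.old_blocks_time_ms:
                self.pages.remove(page)
                self.pages.insert(0, page)
            return
        if len(self.pages) >= self.capacity:
            evicted = self.pages.pop()  # evict from the tail
            del self.first_read_ms[evicted]
        self.pages.insert(self._midpoint(), page)
        self.first_read_ms[page] = now_ms


lru = MidpointLRU(capacity=4)
for page in "abcd":
    lru.access(page, now_ms=0)
lru.access("d", now_ms=10)    # too soon: "d" stays where it is
lru.access("d", now_ms=1000)  # waited long enough: "d" moves to the head
print(lru.pages)
```

A scan that reads each page once therefore only churns the old part of the list, while pages that are read again later earn a place at the head.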

 

 

redo log

By default MySQL has two files, ib_logfile0 and ib_logfile1. These are the redo log files, also called the transaction log.

Purpose of the redo log: after an instance or media failure, the redo log files are used to recover the data.

Each InnoDB storage engine has at least one redo log file group, and each file group has at least two redo log files, such as the default ib_logfile0 and ib_logfile1. The InnoDB storage engine first writes to redo log file 1. When it reaches the end of that file it switches to redo log file 2, and when redo log file 2 is also full it switches back to redo log file 1.
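
The round-robin switching between files can be sketched like this. It is a toy model under stated assumptions: the class name RedoLogGroup is made up, file sizes are counted in records rather than bytes, and the old contents of a file are simply overwritten, whereas a real server must first make sure a checkpoint has made them unnecessary.

```python
class RedoLogGroup:
    """Toy redo log group: fixed-size files that are written in a circle."""

    def __init__(self, files_in_group=2, file_size=4):
        self.files = [[] for _ in range(files_in_group)]
        self.file_size = file_size  # capacity in records (real files use bytes)
        self.current = 0

    def append(self, record):
        """Append a record and return the index of the file it went to."""
        if len(self.files[self.current]) >= self.file_size:
            # Current file is full: move to the next file, wrapping around,
            # and start overwriting it from the beginning.
            self.current = (self.current + 1) % len(self.files)
            self.files[self.current] = []
        self.files[self.current].append(record)
        return self.current


group = RedoLogGroup(files_in_group=2, file_size=2)
written_to = [group.append(n) for n in range(5)]
print(written_to)
print(group.files)
```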

Parameters affecting redo logs:

innodb_log_file_size, innodb_log_files_in_group and innodb_log_group_home_dir determine the size, number and location of the redo log files.

The undo log records the reverse of each operation performed on the database.

If the content of the database is regarded as a state machine, then a data write is a command that modifies the state machine, and the undo record is the reverse command.

In theory, therefore, every command that modifies the state machine generates a corresponding undo log record, so that when the transaction is rolled back the state machine can be returned to its state at the start of the transaction.

In contrast to the undo log, the redo log records a copy of the new data. When a transaction commits, only the redo log has to be persisted; the data itself does not. If the system crashes before the data is persisted, the redo log has still been persisted, and the system can restore all data to its latest state from the content of the redo log.
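
The difference between the two logs can be shown with a dictionary standing in for the database state. This is a minimal sketch with made-up helper names (apply_with_logs, rollback, replay); real undo and redo records are page-level and far more detailed.

```python
_MISSING = object()  # marks "key did not exist before the write"


def apply_with_logs(state, writes):
    """Apply writes to state and return (undo_log, redo_log)."""
    undo_log, redo_log = [], []
    for key, value in writes:
        undo_log.append((key, state.get(key, _MISSING)))  # old value
        redo_log.append((key, value))                     # new value
        state[key] = value
    return undo_log, redo_log


def rollback(state, undo_log):
    """Undo the writes by applying the reverse operations in reverse order."""
    for key, old in reversed(undo_log):
        if old is _MISSING:
            state.pop(key, None)
        else:
            state[key] = old


def replay(state, redo_log):
    """Redo the writes on top of an older copy of the data."""
    for key, value in redo_log:
        state[key] = value


on_disk = {"a": 1}       # persisted data, not yet updated
memory = dict(on_disk)   # in-memory copy that the transaction modifies
undo, redo = apply_with_logs(memory, [("a", 2), ("b", 3)])

recovered = dict(on_disk)
replay(recovered, redo)  # crash recovery: bring old data up to date
rollback(memory, undo)   # transaction rollback: return to the old state
print(recovered, memory)
```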

 

 

Checkpoint mechanism

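A checkpoint flushes dirty pages to disk. It is triggered in the following situations:

1. The master thread periodically flushes some pages to disk

2. There are not enough free pages in the LRU list. The parameter innodb_lru_scan_depth controls the number of free pages to keep available in the LRU list

3. There is not enough reusable redo log space

4. There are too many dirty pages. The parameter innodb_max_dirty_pages_pct controls the maximum dirty page ratio; its value here is 75 (percent)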
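
As a rough sketch, conditions 2 to 4 can be written as a single check. The function name checkpoint_reasons and the redo_threshold of 0.75 are assumptions made for this illustration only; InnoDB's actual thresholds and logic are more involved.

```python
def checkpoint_reasons(free_pages, lru_scan_depth, redo_used, redo_capacity,
                       dirty_pages, total_pages,
                       max_dirty_pages_pct=75.0, redo_threshold=0.75):
    """Return the reasons (if any) why dirty pages should be flushed now."""
    reasons = []
    if free_pages < lru_scan_depth:
        reasons.append("too few free pages in the LRU list")
    if redo_used / redo_capacity > redo_threshold:  # hypothetical threshold
        reasons.append("redo log nearly full")
    if 100.0 * dirty_pages / total_pages > max_dirty_pages_pct:
        reasons.append("too many dirty pages")
    return reasons


print(checkpoint_reasons(free_pages=100, lru_scan_depth=1024,
                         redo_used=90, redo_capacity=100,
                         dirty_pages=80, total_pages=100))
```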

 

 

insert buffer

Like data pages, the insert buffer is not only an in-memory structure; it also exists as physical pages. An insert or update on a non-clustered index is not applied directly to the index page every time. InnoDB first checks whether the target non-clustered index page is in the buffer pool. If it is, the entry is inserted directly; if not, the change is first placed in the insert buffer. The insert buffer is then merged with the leaf pages of the non-clustered index at a certain frequency. Conditions of use: the index is non-clustered and non-unique

Primary keys are normally inserted in order, and an insert into a unique index must first check for duplicates (which always costs a random IO). For a non-unique index the entry only has to be inserted, but that insert may itself cause random IO. The principle of the insert buffer is therefore to merge many random IOs, replacing random IO with sequential IO

-- See how many IO requests merging saved: (1034310+3)/113909 = 9.08
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 134, seg size 136, 113909 merges
merged operations:
 insert 3, delete mark 2319764, delete 1034310
discarded operations:
 insert 0, delete mark 0, delete 0
Hash table size 288996893, node heap has 304687 buffer(s)
1923.58 hash searches/s, 1806.60 non-hash searches/s
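
The ratio can be recomputed from the output with a short script. This is just a convenience sketch: the regular expressions assume the line layout shown above, and the second ratio (which also counts delete mark operations) is an extra figure that the calculation above does not use.

```python
import re

status = """\
Ibuf: size 1, free list len 134, seg size 136, 113909 merges
merged operations:
 insert 3, delete mark 2319764, delete 1034310
"""

merges = int(re.search(r"(\d+) merges", status).group(1))
ops = re.search(r"insert (\d+), delete mark (\d+), delete (\d+)", status)
inserts, delete_marks, deletes = map(int, ops.groups())

# Same figure as above: (delete + insert) operations per merge.
ratio_insert_delete = (deletes + inserts) / merges
# Counting every merged operation, including delete marks.
ratio_all = (inserts + delete_marks + deletes) / merges

print(round(ratio_insert_delete, 2))
print(round(ratio_all, 2))
```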

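On SSDs this kind of random IO optimization may no longer be needed

Later MySQL versions added an enhanced version, the change buffer, which also supports delete, update and similar operations. The principle is the same as for the insert buffer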

 

 

double write

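InnoDB's page size is usually 16KB, its checksum is computed over those 16KB, and data is written to disk in units of pages. In extreme cases (such as a power failure), however, the hardware and operating system often cannot guarantee that this write is atomic. If the power fails or the OS crashes when only 4KB of a 16KB page has been written, only part of the write succeeds. This is the partial page write problem.

Many DBAs assume that once the system is back MySQL can simply recover from the redo log. During recovery, however, MySQL verifies each page's checksum. After a partial page write the page is already corrupted and fails that check, and because redo records describe changes to be applied to an intact page, the redo log alone cannot repair it. Doublewrite solves this by first writing each page to a separate doublewrite area before writing it to its final location, so an intact copy of the page is always available.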

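Doublewrite architecture

Innodb_dblwr_pages_written    number of pages written to the doublewrite buffer

Innodb_dblwr_writes                   number of doublewrite write operations actually performed

Both status variables can be viewed with SHOW STATUS LIKE 'innodb%';

If you need higher performance, or the file system itself protects against partial writes, doublewrite can be turned off

with the skip_innodb_doublewrite option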
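
The partial page write problem and the doublewrite remedy can be simulated in a few lines. This is a simplified sketch: the page layout (a CRC32 in the last 4 bytes) and the function names are invented for the example and do not match InnoDB's on-disk format.

```python
import zlib

PAGE_SIZE = 16 * 1024


def make_page(payload: bytes) -> bytes:
    """Build a 16KB page whose last 4 bytes are a CRC32 of the rest."""
    body = payload[:PAGE_SIZE - 4].ljust(PAGE_SIZE - 4, b"\0")
    return body + zlib.crc32(body).to_bytes(4, "big")


def is_intact(page: bytes) -> bool:
    body, stored = page[:-4], page[-4:]
    return zlib.crc32(body).to_bytes(4, "big") == stored


def torn_write(old_page: bytes, new_page: bytes, written: int) -> bytes:
    """Simulate a crash after only `written` bytes of new_page reached disk."""
    return new_page[:written] + old_page[written:]


def recover(data_page: bytes, doublewrite_copy: bytes) -> bytes:
    """Prefer the data file page; fall back to the doublewrite copy."""
    if is_intact(data_page):
        return data_page
    if is_intact(doublewrite_copy):
        return doublewrite_copy
    raise ValueError("page and doublewrite copy are both corrupted")


old = make_page(b"old row data" * 1000)
new = make_page(b"new row data" * 1000)
doublewrite_copy = new                       # step 1: full copy written first
on_disk = torn_write(old, new, 4 * 1024)     # step 2: crash after 4KB of 16KB
print(is_intact(on_disk), recover(on_disk, doublewrite_copy) == new)
```

The torn page fails its checksum, so the intact copy from the doublewrite area is used instead, after which redo records can be applied as usual.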

 

 

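Adaptive hash

The adaptive hash index is implemented as a hash table. The difference from an ordinary index is that it is created and used solely by the database itself, and the DBA cannot intervene. Keys are mapped into a hash table through a hash function, so dictionary-style lookups such as SELECT * FROM TABLE WHERE index_col='xxx' are very fast, but it is of no help for range lookups. SHOW ENGINE INNODB STATUS shows the current usage of the adaptive hash index

-- Shows the number of lookups per second that used the adaptive hash index, and the number that did not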
Hash table size 4425293, node heap has 1337 buffer(s)
174.24 hash searches/s, 169.49 non-hash searches/s

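The innodb_adaptive_hash_index parameter disables or enables this feature; it is on by default

InnoDB also provides flushing of neighbor pages. This is an optimization for traditional spinning disks and is not needed on SSDs

The innodb_flush_neighbors parameter turns this feature on or off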
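
Why a hash index helps equality lookups but not range lookups can be seen in a small sketch. The data and function names are made up, and a sorted list with bisect stands in for an ordered index such as a B+ tree.

```python
import bisect

rows = [("apple", 1), ("banana", 2), ("cherry", 3), ("date", 4)]

# Hash index: a single dictionary probe answers index_col = 'xxx'.
hash_index = {key: row_id for key, row_id in rows}

# Ordered index (stand-in for a B+ tree): sorted keys support range scans.
sorted_keys = sorted(key for key, _ in rows)


def equality_lookup(key):
    return hash_index.get(key)


def range_lookup(low, high):
    """Return all keys with low <= key <= high using the ordered index."""
    lo = bisect.bisect_left(sorted_keys, low)
    hi = bisect.bisect_right(sorted_keys, high)
    return sorted_keys[lo:hi]


print(equality_lookup("cherry"))
print(range_lookup("b", "d"))
```

The dictionary has no notion of key order, so a range query against it would have to examine every entry; the ordered structure answers it with two binary searches.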

 

 

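Each buffer pool's section in the output starts with an identifier such as ---BUFFER POOL 0:

---BUFFER POOL 0
Buffer pool size   65528     -- total pages in this buffer pool (16K per page)
Free buffers       48335     -- free pages in this buffer pool
Database pages     16892     -- pages in this buffer pool's LRU list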
Old database pages 6255
Modified db pages  654
Pending reads      0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 189, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 10029, created 6863, written 172659
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s

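Pages made young is the number of pages moved from the old sublist to the head of the LRU list

not young is the number of pages that were not moved to the head because of the innodb_old_blocks_time limit

Buffer pool hit rate is the hit rate

youngs/s and non-youngs/s are the number of pages per second that were moved to the head and that were not

LRU len: 1539, unzip_LRU len: 156. LRU len is the total number of pages in the LRU list; unzip_LRU refers to uncompressed (decompressed) pages

In the information_schema database:

INNODB_BUFFER_POOL_STATS records the status of each buffer pool

INNODB_BUFFER_PAGE_LRU shows the pages in the LRU list, including the unzip LRU state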
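
The page counts in the BUFFER POOL 0 section can be turned into sizes and percentages with a short script. This is a sketch that assumes the line layout shown above and a 16K page size; the helper name field is made up.

```python
import re

section = """\
Buffer pool size   65528
Free buffers       48335
Database pages     16892
Modified db pages  654
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
"""

PAGE_BYTES = 16 * 1024


def field(name):
    """Return the integer that follows the given label in the section."""
    return int(re.search(rf"{name}\s+(\d+)", section).group(1))


pool_mib = field("Buffer pool size") * PAGE_BYTES / 2**20
free_pct = 100 * field("Free buffers") / field("Buffer pool size")
modified_share = 100 * field("Modified db pages") / field("Database pages")
hits, out_of = map(int, re.search(r"hit rate (\d+) / (\d+)", section).groups())

print(f"pool size: {pool_mib} MiB")
print(f"free: {free_pct:.1f}%, modified LRU pages: {modified_share:.1f}%")
print(f"hit rate: {hits / out_of:.3f}")
```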

 

 

 

 

 

 

MySQL engine status information

mysql> show engine innodb status\G
*************************** 1. row ***************************
  Type: InnoDB
  Name:
Status:
=====================================
2017-03-23 10:51:31 0x19f0 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 4 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 1 srv_active, 0 srv_shutdown, 5996 srv_idle
srv_master_thread log flush and writes: 5997
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 34
OS WAIT ARRAY INFO: signal count 20
RW-shared spins 0, rounds 20, OS waits 3
RW-excl spins 0, rounds 174, OS waits 1
RW-sx spins 0, rounds 0, OS waits 0
Spin rounds per wait: 20.00 RW-shared, 174.00 RW-excl, 0.00 RW-sx
------------
TRANSACTIONS
------------
Trx id counter 65283
Purge done for trx's n:o < 58732 undo n:o < 0 state: running but idle
History list length 277
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 281475142321968, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
--------
FILE I/O
--------
I/O thread 0 state: wait Windows aio (insert buffer thread)
I/O thread 1 state: wait Windows aio (log thread)
I/O thread 2 state: wait Windows aio (read thread)
I/O thread 3 state: wait Windows aio (read thread)
I/O thread 4 state: wait Windows aio (read thread)
I/O thread 5 state: wait Windows aio (read thread)
I/O thread 6 state: wait Windows aio (write thread)
I/O thread 7 state: wait Windows aio (write thread)
I/O thread 8 state: wait Windows aio (write thread)
I/O thread 9 state: wait Windows aio (write thread)
Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] ,
 ibuf aio reads:, log i/o's:, sync i/o's:
Pending flushes (fsync) log: 0; buffer pool: 0
419 OS file reads, 53 OS file writes, 7 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 0, seg size 2, 0 merges
merged operations:
 insert 0, delete mark 0, delete 0
discarded operations:
 insert 0, delete mark 0, delete 0
Hash table size 2267, node heap has 0 buffer(s)
Hash table size 2267, node heap has 0 buffer(s)
Hash table size 2267, node heap has 0 buffer(s)
Hash table size 2267, node heap has 0 buffer(s)
Hash table size 2267, node heap has 0 buffer(s)
Hash table size 2267, node heap has 0 buffer(s)
Hash table size 2267, node heap has 0 buffer(s)
Hash table size 2267, node heap has 0 buffer(s)
0.00 hash searches/s, 0.00 non-hash searches/s
---
LOG
---
Log sequence number 12560144
Log flushed up to   12560144
Pages flushed up to 12560144
Last checkpoint at  12560135
0 pending log flushes, 0 pending chkp writes
10 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 8585216
Dictionary memory allocated 1239320
Buffer pool size   512
Free buffers       254
Database pages     258
Old database pages 0
Modified db pages  0
Pending reads      0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 382, created 34, written 36
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s

LRU len: 258, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
0 read views open inside InnoDB
Process ID=2056, Main thread ID=2376, state: sleeping
Number of rows inserted 0, updated 0, deleted 0, read 8
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================

1 row in set (0.00 sec)

 

 

InnoDB-related parameters

innodb_adaptive_flushing	ON
innodb_adaptive_flushing_lwm	10
innodb_adaptive_hash_index	ON
innodb_adaptive_hash_index_parts	8
innodb_adaptive_max_sleep_delay	150000
innodb_api_bk_commit_interval	5
innodb_api_disable_rowlock	OFF
innodb_api_enable_binlog	OFF
innodb_api_enable_mdl	OFF
innodb_api_trx_level	0
innodb_autoextend_increment	64
innodb_autoinc_lock_mode	1
innodb_buffer_pool_chunk_size	134217728
innodb_buffer_pool_dump_at_shutdown	ON
innodb_buffer_pool_dump_now	OFF
innodb_buffer_pool_dump_pct	25
innodb_buffer_pool_filename	ib_buffer_pool
innodb_buffer_pool_instances	8
innodb_buffer_pool_load_abort	OFF
innodb_buffer_pool_load_at_startup	ON
innodb_buffer_pool_load_now	OFF
innodb_buffer_pool_size	8589934592
innodb_change_buffer_max_size	25
innodb_change_buffering	all
innodb_checksum_algorithm	crc32
innodb_checksums	ON
innodb_cmp_per_index_enabled	OFF
innodb_commit_concurrency	0
innodb_compression_failure_threshold_pct	5
innodb_compression_level	6
innodb_compression_pad_pct_max	50
innodb_concurrency_tickets	5000
innodb_data_file_path	ibdata1:12M:autoextend
innodb_data_home_dir	
innodb_deadlock_detect	ON
innodb_default_row_format	dynamic
innodb_disable_sort_file_cache	OFF
innodb_doublewrite	ON
innodb_fast_shutdown	1
innodb_file_format	Barracuda
innodb_file_format_check	ON
innodb_file_format_max	Barracuda
innodb_file_per_table	ON
innodb_fill_factor	100
innodb_flush_log_at_timeout	1
innodb_flush_log_at_trx_commit	2
innodb_flush_method	
innodb_flush_neighbors	1
innodb_flush_sync	ON
innodb_flushing_avg_loops	30
innodb_force_load_corrupted	OFF
innodb_force_recovery	0
innodb_ft_aux_table	
innodb_ft_cache_size	8000000
innodb_ft_enable_diag_print	OFF
innodb_ft_enable_stopword	ON
innodb_ft_max_token_size	84
innodb_ft_min_token_size	3
innodb_ft_num_word_optimize	2000
innodb_ft_result_cache_limit	2000000000
innodb_ft_server_stopword_table	
innodb_ft_sort_pll_degree	2
innodb_ft_total_cache_size	640000000
innodb_ft_user_stopword_table	
innodb_io_capacity	200
innodb_io_capacity_max	2000
innodb_large_prefix	ON
innodb_lock_wait_timeout	50
innodb_locks_unsafe_for_binlog	OFF
innodb_log_buffer_size	8388608
innodb_log_checksums	ON
innodb_log_compressed_pages	ON
innodb_log_file_size	50331648
innodb_log_files_in_group	2
innodb_log_group_home_dir	./
innodb_log_write_ahead_size	8192
innodb_lru_scan_depth	1024
innodb_max_dirty_pages_pct	75.000000
innodb_max_dirty_pages_pct_lwm	0.000000
innodb_max_purge_lag	0
innodb_max_purge_lag_delay	0
innodb_max_undo_log_size	1073741824
innodb_monitor_disable	
innodb_monitor_enable	
innodb_monitor_reset	
innodb_monitor_reset_all	
innodb_old_blocks_pct	37
innodb_old_blocks_time	1000
innodb_online_alter_log_max_size	134217728
innodb_open_files	2000
innodb_optimize_fulltext_only	OFF
innodb_page_cleaners	4
innodb_page_size	16384
innodb_print_all_deadlocks	OFF
innodb_purge_batch_size	300
innodb_purge_rseg_truncate_frequency	128
innodb_purge_threads	4
innodb_random_read_ahead	OFF
innodb_read_ahead_threshold	56
innodb_read_io_threads	4
innodb_read_only	OFF
innodb_replication_delay	0
innodb_rollback_on_timeout	OFF
innodb_rollback_segments	128
innodb_sort_buffer_size	1048576
innodb_spin_wait_delay	6
innodb_stats_auto_recalc	ON
innodb_stats_method	nulls_equal
innodb_stats_on_metadata	OFF
innodb_stats_persistent	ON
innodb_stats_persistent_sample_pages	20
innodb_stats_sample_pages	8
innodb_stats_transient_sample_pages	8
innodb_status_output	OFF
innodb_status_output_locks	OFF
innodb_strict_mode	ON
innodb_support_xa	ON
innodb_sync_array_size	1
innodb_sync_spin_loops	30
innodb_table_locks	ON
innodb_temp_data_file_path	ibtmp1:12M:autoextend
innodb_thread_concurrency	8
innodb_thread_sleep_delay	0
innodb_tmpdir	
innodb_undo_directory	./
innodb_undo_log_truncate	OFF
innodb_undo_logs	128
innodb_undo_tablespaces	0
innodb_use_native_aio	OFF
innodb_version	5.7.15
innodb_write_io_threads	4

 

Some parameters

-- Number of buffer pool instances
SHOW VARIABLES LIKE 'innodb_buffer_pool_instances';

-- Buffer pool size
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

  

 

 

 

 

 


 

 
