InnoDB basic features

Ⅰ、double write

Purpose: To ensure the reliability of data writing

tips:

What is partial write?

A 16 KB page gets only partially written (the first 4 KB, 6 KB, 8 KB or 12 KB, say) before the write is cut off by a crash or power failure.

The resulting corrupt page typically has its header updated but not its trailer.

Such a page cannot be recovered through the redo log alone, because redo recovery presumes the on-disk page is internally consistent (redo records changes to a page, not the whole page).

1.1 What is double write?

When a dirty page is flushed, it is not written directly to its .ibd file. It is first written to the doublewrite segment object (in the shared tablespace); only after that write succeeds is the page written to its real location (space, page_no) in the data file.

This is not done one page at a time: pages go through doublewrite in batches of up to 128 pages (two extents), i.e. 2 MB (not adjustable).

If a partial write occurs on the data file at this point,
a clean copy of the page exists in the doublewrite area, and the page can be restored from it.
If the partial write hits the doublewrite area instead, the page in the data file is undamaged and still clean, and can be recovered through redo.

Either way, a clean copy always exists, in doublewrite or in the tablespace.

The flow, step by step:

1. Dirty pages are copied into the Doublewrite Buffer object, 2 MB by default.
2. The Doublewrite Buffer is written to the doublewrite area inside the shared tablespace (ibdata1):
    the 2 MB area is overwritten cyclically;
    the write is sequential (a single IO).
3. Each page is then written to its original .ibd file according to (space, page_no).
4. If the crash happens while writing the doublewrite area in ibdata1, the original .ibd file is still complete and clean, and can be recovered with the redo log on the next startup.
5. If the crash happens while writing the .ibd file, a copy still exists in ibdata1; it can be copied directly over the corresponding page of the .ibd file, after which redo recovery proceeds.
Redo is physical-logical: physical in that each log record targets the modification of a specific page, logical in that the content of the record describes a logical change.
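The batching described in step 2 can be observed from the server's status counters: Innodb_dblwr_pages_written counts pages that went through doublewrite, and Innodb_dblwr_writes counts the actual writes to the doublewrite area, so their ratio shows how many pages each sequential IO carried. A sketch:

```sql
-- How many pages went through doublewrite, and in how many IOs?
SHOW GLOBAL STATUS LIKE 'Innodb_dblwr%';
-- Innodb_dblwr_pages_written / Innodb_dblwr_writes ≈ pages per doublewrite IO;
-- under heavy flushing, the closer to 128 the better the batching.
```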

1.2 Performance overhead

Enabling dw has some performance overhead, but nowhere near 2x: the doublewrite area takes only one write per batch of up to 128 pages, and that write is sequential.
The overhead is usually 5%~25%.

Can it be turned off on a slave server?

In principle it might seem unnecessary there, but the slave should guarantee reliable page writes too, the overhead is not large, and master and slave configurations are best kept identical.

([email protected]) [test]> show variables like '%double%';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| innodb_doublewrite | ON    |
+--------------------+-------+
1 row in set (0.01 sec)

If partial writes cannot happen on your system, i.e. a page write is atomic,
you can turn this parameter off. For example:

Disk: Fusion-IO

File system: ZFS, btrfs

--skip-doublewrite

Those two file systems are basically unavailable on Linux, so turning doublewrite off is not recommended in practice.

Ⅱ、INSERT/CHANGE BUFFER

  • Called the insert buffer before version 5.5; now called the change buffer
  • Used to improve DML (insert, delete, update) performance on secondary indexes
  • Applies only to non-unique secondary indexes
  • The insert buffer is itself a B+ tree; one of its pages caches up to about 2k records
  • Roughly a 30% performance improvement when enabled (enabled by default)

tips:

Inserts into a secondary index are random (time-like columns aside). This is not unique to secondary indexes: the same holds for a PK. An auto-increment PK has no such problem, but a PK generated by a random function does. That is why people build globally unique IDs and want them sequential: sequential inserts perform better, and the B+ tree then splits in only one direction.

2.1 The working principle of Insert Buffer (space for time)

First check whether the target secondary index page is in the buffer pool; if it is, insert directly. If it is not, put the record into an Insert Buffer object instead (to the caller it looks as if the insert already reached the leaf page, and since the insert buffer is itself persisted in the system tablespace, a crash is not a problem).

When the secondary index page is eventually read into the buffer pool, the records cached for that page in the Insert Buffer are merged into it. Background threads also periodically perform this batched read-and-merge from the Insert Buffer, so it is effectively a write cache.

2.2 Potential problems

① Before 5.5 the insert buffer could use up to 1/2 of the buffer pool; now it may use only 1/4

([email protected]) [zst]> show variables like '%change_buffer%';
+-------------------------------+-------+
| Variable_name                 | Value |
+-------------------------------+-------+
| innodb_change_buffer_max_size | 25    |
| innodb_change_buffering       | all   |
+-------------------------------+-------+
2 rows in set (0.01 sec)
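innodb_change_buffer_max_size is the percentage of the buffer pool the change buffer may use: 25 is the 1/4 limit above, 50 is the maximum (matching the old pre-5.5 half). It is dynamic; a sketch:

```sql
-- Lower the change buffer cap to 10% of the buffer pool (online, no restart)
SET GLOBAL innodb_change_buffer_max_size = 10;
SHOW VARIABLES LIKE 'innodb_change_buffer_max_size';
```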

② Insert performance starts to drop once the insert buffer begins merging

A shutdown does not merge the remaining insert buffer records

tips:

Why does a pure-insert benchmark against MySQL look great at first and then gradually get worse?

Say the BP is 10 GB. At the start it is empty and inserts go straight into memory.
Later it fills up with dirty pages, so inserts begin to involve reading pages from disk and flushing dirty pages back.
Once the BP is full, flushing to disk and page splits are needed constantly.

2.3 Insert/Change Buffer example

CREATE TABLE t (
    a INT AUTO_INCREMENT,  -- column a is auto-increment
    b VARCHAR(30),         -- column b is a varchar
    PRIMARY KEY(a),        -- a is the primary key
    KEY(b)                 -- b is a secondary index (think of a name-like column: non-unique)
);
  • For column a, every primary-key insert goes straight into the corresponding clustered index page (inserted directly in memory; if the page is not in memory, it is read in first, evicting other pages if necessary)
  • For column b: without an Insert Buffer, every insert must fetch the page (from memory, or from disk into memory) and then insert the record into it. With an Insert Buffer, first check whether the secondary index page the record belongs to is in the BP; if so, insert directly, otherwise cache the record in the Insert Buffer and wait for the secondary index page to be read in, at which point the records cached for that page are merged (Merge) into it. This reduces IO operations
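As a contrast to the table above (table names here are hypothetical): a UNIQUE secondary index cannot use the insert buffer, because the uniqueness check itself forces the page to be read at insert time, defeating the purpose:

```sql
-- KEY(b) is non-unique: inserts into it can be cached in the change buffer
CREATE TABLE t_buffered (
    a INT AUTO_INCREMENT,
    b VARCHAR(30),
    PRIMARY KEY (a),
    KEY (b)
);

-- UNIQUE KEY(b): every insert must read the index page to check uniqueness,
-- so the change buffer cannot be used for this index
CREATE TABLE t_not_buffered (
    a INT AUTO_INCREMENT,
    b VARCHAR(30),
    PRIMARY KEY (a),
    UNIQUE KEY (b)
);
```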

summary:

To sum up, this trades space for time (batched insertion) to improve secondary index insert performance: the secondary index entry can wait, as long as the primary key has been inserted.

2.4 View Insert/Change Buffer

show engine innodb status\G
...
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 0, seg size 2, 0 merges
merged operations:
 insert 0, delete mark 0, delete 0
discarded operations:
 insert 0, delete mark 0, delete 0
Hash table size 232499, node heap has 0 buffer(s)
Hash table size 232499, node heap has 0 buffer(s)
Hash table size 232499, node heap has 0 buffer(s)
Hash table size 232499, node heap has 0 buffer(s)
Hash table size 232499, node heap has 0 buffer(s)
Hash table size 232499, node heap has 0 buffer(s)
Hash table size 232499, node heap has 0 buffer(s)
Hash table size 232499, node heap has 0 buffer(s)
0.00 hash searches/s, 0.00 non-hash searches/s
...

seg size = size + free list len + 1 is the number of pages in the insert buffer; size is the number of used pages, free list len the number of free pages

merges: how many pages have been merged

merged insert: how many records have been merged in

insert/merges reflects insert efficiency (each merge costs one page read)

discarded operations: should be very small, or 0; these are records that were written to the Insert/Change Buffer but whose table was dropped before the merge, so the corresponding buffered records had to be discarded

The premise for using the Insert/Change Buffer is that the insert would otherwise require random IO; if the page is already in the Buffer Pool (in memory), the Insert/Change Buffer is simply not used.

2.5 change buffer

Since 5.5 it has been renamed the change buffer, because it now buffers more than inserts (Insert, Delete-marking, Purge)

--innodb_change_buffering
    all
    none
    inserts
    deletes
    changes (= inserts & delete-marking)
    purges
The default is all; leave it alone.
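The setting is dynamic, so narrowing it (or disabling buffering, e.g. for troubleshooting) needs no restart. A sketch:

```sql
-- Default: buffer inserts, delete-marks and purges
SET GLOBAL innodb_change_buffering = 'all';
-- Disable change buffering entirely:
-- SET GLOBAL innodb_change_buffering = 'none';
SHOW VARIABLES LIKE 'innodb_change_buffering';
```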

Ⅲ、ADAPTIVE HASH INDEX (adaptive hash index)

3.1 Search Time Complexity

① B+ tree: O(tree height), and it can only locate the page a record is in
② Hash table: O(1), and it can locate the record directly

InnoDB decides on its own whether a page is hot; if so, it builds an AHI over the records in that page. The memory for this comes out of the buffer pool, so it cannot cause an OOM.

The "node heap has 0 buffer(s)" lines report memory taken from the BP, not from the operating system; at worst the BP runs short, in which case InnoDB simply will not allocate and build the AHI for you

3.2 What exactly is the adaptive hash index?

  • Built from the observed access pattern on the B+ tree; this guessing has essentially no overhead
  • Hash entries are created only for records in hot pages
  • Not persistent
  • Supports only point queries (point selects), i.e. equality lookups

Enabled by default; from 5.6 on, the common recommendation is to turn it off

The hash is over the records inside a page. If the records in a hot page change frequently, maintaining the AHI is very expensive; and even without DML, plain SELECTs are enough to trigger index creation

innodb_adaptive_hash_index            just set it to OFF
innodb_adaptive_hash_index_parts      introduced after 5.6; shards the AHI latch, set it to the number of CPUs (moot once the AHI is off, of course)
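innodb_adaptive_hash_index is dynamic, while innodb_adaptive_hash_index_parts (default 8 in 5.7) can only be set at startup. A sketch:

```sql
-- Turn the AHI off online
SET GLOBAL innodb_adaptive_hash_index = OFF;

-- Or in my.cnf (takes effect on restart):
-- [mysqld]
-- innodb_adaptive_hash_index = OFF
-- innodb_adaptive_hash_index_parts = 8   -- only matters while the AHI is ON
```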

3.3 How to judge hot page?

The process of creating ahi

Three requirements, and they are demanding:

The index has been accessed 17 times

A page in the index has been accessed at least 100 times

The pattern of access to the pages of the index is stable; for example, with the index below, the two queries are two different access patterns, and alternating between them disqualifies the page

idx_a_b(a,b)

where a = xxx

where a = xxx and b = xxx

In practice it is of little use: performance does not improve after enabling it, yet CPU consumption goes up; run a sysbench test and see for yourself

tips:

① 5.6 added a view to inspect access counts (previously only in memory, invisible): information_schema > INNODB_BUFFER_PAGE_LRU > FIX_COUNT

② When the number of free + LRU pages is less than the total, a big part of the gap is memory consumed by the AHI
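The FIX_COUNT check in ① is an ordinary query against the buffer pool LRU view (the view is expensive to scan on large buffer pools, so use it sparingly; the table-name pattern below is illustrative):

```sql
-- Pages of a given table currently in the LRU, with their fix counts
SELECT SPACE, PAGE_NUMBER, PAGE_TYPE, FIX_COUNT
FROM information_schema.INNODB_BUFFER_PAGE_LRU
WHERE TABLE_NAME LIKE '%t%'
ORDER BY FIX_COUNT DESC
LIMIT 10;
```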

Ⅳ、FLUSH NEIGHBOR PAGE(FNP)

When flushing a dirty page, InnoDB tries to also flush the other dirty pages in the same extent, merging IOs and turning random writes into sequential ones

Flushing one page can thus trigger the flushing of up to 64 pages. The pages of an extent are contiguous, so flushing a whole extent is fairly sequential. But it is heavy-handed: the intent is good, yet what if a neighbor page is dirtied again right after being flushed?

This feature pays off on traditional spinning disks; on SSDs it should be turned off (their random write performance is good enough that converting random writes into sequential ones gains nothing)

MySQL 5.6: --innodb_flush_neighbors={1|0}, default 1. Turn it off on SSD; it can be changed online.
On non-SSD storage, 5.7 additionally allows the value 2, meaning flush only contiguous dirty pages.
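Since the variable is dynamic, switching it for SSD storage is a one-liner; a sketch:

```sql
-- SSD: stop flushing neighbor pages (dynamic in 5.6+)
SET GLOBAL innodb_flush_neighbors = 0;
-- 5.7, spinning disk: flush only contiguous dirty pages in the extent
-- SET GLOBAL innodb_flush_neighbors = 2;
```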
