Analysis of the working principle of the buffer pool

Ⅰ. Introduction to the buffer pool

The InnoDB storage engine's buffer pool, similar to Oracle's SGA, contains data pages, index pages, the change buffer, the adaptive hash index, lock information (before 5.5), and so on.

In summary:

  • Every read and write goes through the Buffer Pool
  • When the data a user needs is not in the Buffer Pool, it is fetched from the hard disk
  • The total capacity is set with innodb_buffer_pool_size; within the memory available, larger is generally better
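The read path above can be sketched as a toy cache. This is not InnoDB source code; the dict-backed "disk" and the page keys are illustrative assumptions:

```python
# Toy sketch: every read goes through the buffer pool; a miss fetches
# the page from "disk" and caches it, a hit is served from memory.
class BufferPool:
    def __init__(self):
        self.pages = {}          # (space, page_no) -> page contents
        self.disk_reads = 0      # count physical reads for illustration

    def read_page(self, space, page_no, disk):
        key = (space, page_no)
        if key not in self.pages:        # miss: go to the hard disk
            self.pages[key] = disk[key]
            self.disk_reads += 1
        return self.pages[key]           # hit: no physical I/O

disk = {(0, 7): b"row data"}             # pretend on-disk page
bp = BufferPool()
bp.read_page(0, 7, disk)                 # first read: physical I/O
bp.read_page(0, 7, disk)                 # second read: from the pool
```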

Ⅱ. Buffer pool performance problems

2.1 Linear scaling of performance

Suppose a server has 72 physical cores, or 144 logical cores with HT hyperthreading. A reasonable stress test should be able to drive all 144 logical cores to full utilization; if it cannot, there is a concurrency bottleneck.

Before 5.1 this problem drew frequent complaints, but it no longer exists today.

A 1G buffer pool holds 65536 16K pages. To manage these pages, the buffer pool must be latched on every access, so if the bp is too large a bottleneck appears. The lock here is the buffer pool latch, which is not the same thing as a database lock.

When QPS reaches 10,000, at least 10,000 latch acquisitions per second are needed (and that is only acquiring the bp latch, not counting releasing it and waking up waiters), which is a considerable overhead.

With many cores, if the latch or the concurrency design is poor, performance cannot scale linearly, and the bp is central to scalability: all hot pages live in it, and every access to those pages must acquire the bp latch.

2.2 How to improve the above buffer pool performance issues

Adjust the innodb_buffer_pool_instances parameter and set it to the number of cpus

The default is 1 in 5.5, and 8 in 5.6 and 5.7.

Suppose the value starts at 1 and is now raised to 4. Originally one bp managed all 65536 pages; now there are 4 bps, each managing 16384 pages. Splitting into 4 shards scatters the hot spots and reduces latch contention, so concurrency improves. This is a very common kernel-level concurrency tuning technique. In testing, the difference between the untuned and tuned configurations was about 30%.
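The sharding idea can be sketched in a few lines. Python's built-in hash() is only a stand-in for InnoDB's internal page-id hash:

```python
# Sketch of why splitting one pool into N instances spreads latch
# contention: each page maps to exactly one instance, so threads
# touching different pages usually contend on different latches.
N_INSTANCES = 4
TOTAL_PAGES = 65536

def instance_for(space, page_no, n=N_INSTANCES):
    # stand-in for InnoDB's internal hash of the page id
    return hash((space, page_no)) % n

# Each instance now manages roughly TOTAL_PAGES / N_INSTANCES pages.
pages_per_instance = TOTAL_PAGES // N_INSTANCES   # 16384
```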

tips:

When multiple buffer pools are configured, each pool must be at least 1G for the setting to take effect. Otherwise, even if innodb_buffer_pool_instances is set in my.cnf, a restart will show it had no effect.

Ⅲ. Management of hot data in buffer pool

3.1 Composition of buffer pool

Hot data in the buffer pool is managed in units of pages. The total bp size is Free List + LRU List (the Flush List is contained within the LRU List); the three lists do not simply sum to the total bp size.

  • The Free List holds blank pages

When the buffer pool has just started, it consists of blank 16K pages, which are strung together (as a linked list) on the Free List.

  • LRU List includes LRU and unzip_LRU

When a data page is read, a page is taken from the Free List, the data is stored into it, and the page is linked into the LRU List.

When the Free List hands a page to the LRU List, the transfer needs concurrency control: the latch mentioned earlier. If two threads read the same on-disk page at the same moment, both must ask the Free List for a free page, and whoever arrives first wins. The latch exists to control concurrent access to these three lists.
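The latch's role can be sketched with a mutex; the frame counts and thread structure here are illustrative only:

```python
# Sketch: a lock serializes Free List -> LRU List moves, so two
# threads faulting in pages at once cannot grab the same free frame.
import threading

free_list = list(range(8))    # 8 blank page frames
lru_list = []
latch = threading.Lock()      # stands in for the buffer pool latch

def fault_in_page(page_id):
    with latch:                            # acquire the bp latch
        frame = free_list.pop()            # take a frame from the Free List
        lru_list.append((frame, page_id))  # link it into the LRU List

threads = [threading.Thread(target=fault_in_page, args=(i,)) for i in range(8)]
for t in threads: t.start()
for t in threads: t.join()
# every frame was handed out exactly once
```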

  • The Flush List holds dirty pages (pages whose data has been modified but not yet flushed to disk), sorted by oldest_lsn

Suppose a page is updated right after it is read. It is now a dirty page and will be put on the Flush List, but only a pointer to it is stored, not the page itself (once it has been modified it goes on the list, no matter how many more times it is modified).

How to view dirty pages in buffer pool?

SELECT
    pool_id,
    lru_position,
    space,
    page_number,
    table_name,
    oldest_modification,
    newest_modification
FROM
    information_schema.INNODB_BUFFER_PAGE_LRU
WHERE
    oldest_modification <> 0
        AND oldest_modification <> newest_modification;

If the result set is empty, there are no dirty pages. Be careful in production and do not run this casually: this SQL is fairly expensive.

tips:

What is stored in the Flush List is not the page itself but a pointer to the page (its page number)

summary:

The LRU List stores all the used pages, including both clean pages and dirty pages. There are only pointers to dirty pages in the Flush List.
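The summary above can be modeled in a few lines; the page contents and LSN values are made up for illustration:

```python
# Sketch: the LRU list owns the pages; the Flush List stores only
# references to dirty pages, kept ordered by oldest_modification LSN.
lru = {7: {"data": "v1", "oldest_lsn": 0}}   # page_no -> page

flush_list = []                              # (oldest_lsn, page_no) refs

def modify(page_no, new_data, lsn):
    page = lru[page_no]
    page["data"] = new_data
    if page["oldest_lsn"] == 0:       # first modification: enter Flush List
        page["oldest_lsn"] = lsn
        flush_list.append((lsn, page_no))
        flush_list.sort()             # kept ordered by oldest_lsn
    # later modifications do NOT add another entry

modify(7, "v2", 100)
modify(7, "v3", 200)    # still only one Flush List entry for page 7
```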

3.2 View the status of the buffer pool

Method 1: show engine innodb status\G

...
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 137428992
Dictionary memory allocated 303387
Buffer pool size   8192     # 8192 pages in the buffer pool in total
Free buffers       7772         # blank pages (Free List); in production this is very likely 0
Database pages     420      # pages in use (LRU List)
Old database pages 0        # the colder pages in the LRU
Modified db pages  0        # dirty pages
Pending reads      0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s    # youngs: pages moved from old to new
Pages read 368, created 52, written 322
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
No buffer pool page gets since the last printout
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 420, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
...

If multiple buffer pools are configured, find the INDIVIDUAL BUFFER POOL INFO section to see the state of each bp.

Method 2: Look at two metadata tables

A note up front: these tables are fairly large, not very convenient to browse, and not recommended for routine use.

(root@localhost) [(none)]> SELECT
    ->     pool_id,
    ->     pool_size,
    ->     free_buffers,
    ->     database_pages,
    ->     old_database_pages,
    ->     modified_database_pages
    -> FROM
    ->     information_schema.innodb_buffer_pool_stats\G
*************************** 1. row ***************************
                pool_id: 0
              pool_size: 8192
           free_buffers: 7772
         database_pages: 420
     old_database_pages: 0
modified_database_pages: 0
1 row in set (0.00 sec)

(root@localhost) [(none)]> SELECT
    ->     space, page_number, newest_modification, oldest_modification
    -> FROM
    ->     information_schema.innodb_buffer_page_lru
    -> LIMIT 1\G
*************************** 1. row ***************************
              space: 0
        page_number: 7
newest_modification: 5330181742     # LSN of the most recent modification to this page
oldest_modification: 0          # LSN of the first modification of this page while in the Buffer Pool; the Flush List is sorted by this value, and the smaller it is, the sooner the page should be flushed
1 row in set (0.01 sec)

3.3 Analysis of the LRU algorithm

MySQL uses a midpoint LRU algorithm to manage the LRU List

  • When a page is read for the first time, it is placed at the midpoint (since there is no guarantee it is actually hot)
  • When it is read a second time, the page is moved to the head of the new sublist
  • The innodb_old_blocks_pct parameter controls the position of the midpoint; the default is 37, roughly the 3/8 position
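The midpoint rule can be sketched with two Python lists standing in for the new and old sublists (list positions and promotion logic are simplified for illustration):

```python
# Sketch of midpoint insertion: a first read lands at the head of the
# "old" sublist (the midpoint); a second read promotes to the "new" head.
OLD_PCT = 37           # innodb_old_blocks_pct default: old sublist ~ 3/8

new_sublist, old_sublist = [], []

def touch(page_no):
    if page_no in new_sublist:
        return                              # already young
    if page_no in old_sublist:              # second read: make it young
        old_sublist.remove(page_no)
        new_sublist.insert(0, page_no)
    else:                                   # first read: insert at midpoint
        old_sublist.insert(0, page_no)

touch(1); touch(2)      # both land at the midpoint (old sublist)
touch(1)                # promoted to the head of the new sublist
```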

3.4 Anti-pollution of the buffer pool

There is a scenario where a page is swept n times in quick succession yet is not a hot page. Under the rule described above, such a page would be moved to the new sublist, polluting the buffer pool.

When would a page be read n times within a second?

During a scan such as select * from tb_name: if the page holds 10 records, the page is read 10 times.

We can solve this problem by pinning a page at the midpoint position for a certain amount of time:

set global innodb_old_blocks_time=1000;

Note that innodb_old_blocks_time is measured in milliseconds; the default is 1000, i.e. 1 second. A select * scan usually finishes a page well within that window.

No matter how many times the page is read within the innodb_old_blocks_time window, it counts as being read only once. Only if the page is read again after the window has passed is it moved to the new sublist. Setting the value to 0 means the second read promotes it to new immediately.

If development has a scan operation, set this parameter first and change it back after the operation. The best solution is to run the scan on a slave, so that scan statements cannot pollute the LRU on the master.
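The promotion rule can be simulated with a fake clock; WINDOW_MS assumes the default innodb_old_blocks_time of 1000 ms, and the data structures are illustrative only:

```python
# Sketch of innodb_old_blocks_time: repeated reads within the window do
# not promote a page; only a read AFTER the window makes it "young".
WINDOW_MS = 1000                  # assume the default, 1000 ms

first_access = {}                 # page_no -> timestamp of first read
new_head = []                     # pages promoted to the new sublist

def touch(page_no, now_ms):
    if page_no not in first_access:
        first_access[page_no] = now_ms        # lands at the midpoint
    elif now_ms - first_access[page_no] >= WINDOW_MS:
        if page_no not in new_head:
            new_head.append(page_no)          # promoted to new sublist

touch(5, 0); touch(5, 10); touch(5, 500)  # a fast scan: never promoted
touch(5, 1500)                            # re-read after the window: promoted
```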

tips:

① If all 10 records in a page were read while holding a read latch on the page the whole time, no other thread could operate on that page during the scan. A database is a concurrent system, so holding the latch that long is unreasonable. Instead, the page latch is acquired and released for every record read, and the read position is remembered in a cursor (not the same thing as a database cursor). The next read resumes from that cursor; since the position may have changed, the page is read again. This guarantees fair scheduling among threads.

② MyISAM hands data caching over to the operating system cache, the same as PostgreSQL

3.5 Warm-up of the buffer pool

background:

After MySQL starts (before MySQL 5.6), the pages in the Buffer Pool are empty, and reading pages from disk into memory takes a long time, so performance is poor for a while after startup.

Example: loading at startup

A 64GB bp read at 10M/s takes about 100 minutes (64 × 1024 MB ÷ 10 MB/s ≈ 6554 s ≈ 109 min)

Preheating strategy: dump the LRU list and warm it back up with sequential reads, which run at roughly 50M~200M/s

Preheating method:

select count(1) from table force index(primary)

select count(1) from index

Explanation:

Both methods above are crude. They do not preheat the truly hot data; they just read data in, and the granularity is very coarse. For example, if your data is 100G and the bp is 10G, most of what gets read in is not actually hot data.

Netease tried using shared memory: the bp survives a database restart, but an operating system restart still loses it.

A better approach:

MySQL 5.6 provides a mechanism:

([email protected]) [(none)]> show variables like 'innodb_buffer_pool%';
+-------------------------------------+----------------+
| Variable_name                       | Value          |
+-------------------------------------+----------------+
| innodb_buffer_pool_chunk_size       | 134217728      |
| innodb_buffer_pool_dump_at_shutdown | ON             |    # dump the (space, page) pairs in the buffer pool at shutdown
| innodb_buffer_pool_dump_now         | OFF            |    # set this to dump from the buffer pool right now
| innodb_buffer_pool_dump_pct         | 25             |    # percentage of each buffer pool's most recently used pages to dump (per instance, not of the whole); can go under [mysqld-5.7]
| innodb_buffer_pool_filename         | ib_buffer_pool |    # name of the dump file
| innodb_buffer_pool_instances        | 1              |
| innodb_buffer_pool_load_abort       | OFF            |
| innodb_buffer_pool_load_at_startup  | ON             |    # load the dump file at startup and restore it into the buffer pool
| innodb_buffer_pool_load_now         | OFF            |    # set this to load the dump file right now
| innodb_buffer_pool_size             | 1879048192     |
+-------------------------------------+----------------+
10 rows in set (0.00 sec)
  • Before the database shuts down, the space and page_no values in the bp are dumped (not the entire bp; before 5.6 was officially released, the whole thing was dumped)
  • On restart, the dumped content is loaded back into the bp. The dump output is unordered; before the load it is sorted by space and page_no. The load is asynchronous, so startup returns quickly and the load has essentially no impact on the bp.
  • The more that is dumped, the slower the startup
  • Repeated dump/load cycles can leave less and less data in the Buffer Pool, because innodb_buffer_pool_dump_pct (default 25) records only a fraction of what is currently loaded each time
  • For high availability, you can dump periodically, copy the dumped file to the slave, and load it there directly (the (space, page) set on the slave is roughly the same as on the master)
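The sort-before-load step can be illustrated directly, using (space, page_no) pairs like those in the ib_buffer_pool file (the values here are made up):

```python
# Sketch: the dump file is an unordered list of (space, page_no); before
# loading, the list is sorted so pages are read back roughly sequentially.
dump = [(0, 568), (164, 3), (0, 278), (164, 2), (0, 567)]

load_order = sorted(dump)   # sort by (space, page_no) for sequential I/O
```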

A simple demonstration:

(root@localhost) [(none)]> set global innodb_buffer_pool_dump_now = 1;
Query OK, 0 rows affected (0.00 sec)

(root@localhost) [(none)]>  show status like 'Innodb_buffer_pool_dump_status';
+--------------------------------+--------------------------------------------------+
| Variable_name                  | Value                                            |
+--------------------------------+--------------------------------------------------+
| Innodb_buffer_pool_dump_status | Buffer pool(s) dump completed at 180302 16:57:45 |
+--------------------------------+--------------------------------------------------+
1 row in set (0.00 sec)

Enter the data directory
[root@VM_0_5_centos data3306]# ll *pool
-rw-r----- 1 mysql mysql 604 Mar  2 16:59 ib_buffer_pool
[root@VM_0_5_centos data3306]# head ib_buffer_pool
0,568
0,567
0,566
0,565
0,278
0,564
0,563
0,562
164,3
164,2

Stop the service
[root@VM_0_5_centos data3306]# mysqld_multi stop 3306
Excerpt from the error log
2018-03-02T09:01:10.292549Z 0 [Note] InnoDB: Starting shutdown...
2018-03-02T09:01:10.392851Z 0 [Note] InnoDB: Dumping buffer pool(s) to /mdata/data3306/ib_buffer_pool
2018-03-02T09:01:10.393059Z 0 [Note] InnoDB: Buffer pool(s) dump completed at 180302 17:01:10

Start the service and load the hot data
[root@VM_0_5_centos data3306]# mysqld_multi start 3306
(root@localhost) [(none)]> set global innodb_buffer_pool_load_now = 1;
Query OK, 0 rows affected (0.00 sec)

Error log excerpt again
2018-03-02T09:06:40.526294Z 0 [Note] InnoDB: Loading buffer pool(s) from /mdata/data3306/ib_buffer_pool
2018-03-02T09:06:40.526487Z 0 [Note] InnoDB: Buffer pool(s) load completed at 180302 17:06:40

tips:

Pay attention to the innodb_buffer_pool_dump_pct parameter; first look at the following procedure

(root@localhost) [(none)]> set global innodb_buffer_pool_dump_pct=100;
Query OK, 0 rows affected (0.00 sec)

(root@localhost) [(none)]> set global innodb_buffer_pool_dump_now = 1;
Query OK, 0 rows affected (0.00 sec)

[root@VM_0_5_centos data3306]# cat ib_buffer_pool |wc -l
576

(root@localhost) [(none)]> set global innodb_buffer_pool_dump_pct=20;
Query OK, 0 rows affected (0.00 sec)

(root@localhost) [(none)]> set global innodb_buffer_pool_dump_now = 1;
Query OK, 0 rows affected (0.00 sec)

[root@VM_0_5_centos data3306]# cat ib_buffer_pool |wc -l
115

This looks fine, but note: when you have multiple buffer pools, say 4 of 100 pages each, the dump is not the first 25% of the pool overall; each dump takes the first 25% of pages from every buffer pool instance (25 pages from each, in this example).
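The per-instance arithmetic can be written out as a sketch (the function name and integer rounding are my own simplifications):

```python
# Sketch: innodb_buffer_pool_dump_pct applies per instance, not globally.
def pages_dumped(pages_per_instance, n_instances, pct):
    # each instance dumps the most recently used pct% of ITS OWN pages
    per_instance = pages_per_instance * pct // 100
    return per_instance * n_instances

# 4 instances x 100 pages, pct=25 -> 25 pages from each, 100 in total
total = pages_dumped(100, 4, 25)
```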

Ⅳ. Asynchronous reading

During a full table scan, once part of the content has been scanned, InnoDB asynchronously reads in the later content before you have asked for it. There are two kinds of read-ahead:

  • Random read-ahead: controlled by innodb_random_read_ahead (default OFF)
  • Linear read-ahead: controlled by innodb_read_ahead_threshold (default 56)
  • Linear read-ahead works at the granularity of extents, while random read-ahead works on pages within an extent
  • Linear read-ahead reads the next extent into the buffer pool in advance; random read-ahead reads the remaining pages of the current extent into the buffer pool in advance
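The two shapes of read-ahead can be sketched as follows, assuming the usual layout of 64 16K pages per 1MB extent (the function names and trigger conditions are simplified; real InnoDB also checks access patterns before prefetching):

```python
# Sketch: linear read-ahead prefetches the NEXT extent; random
# read-ahead prefetches the REST of the CURRENT extent.
PAGES_PER_EXTENT = 64    # 64 x 16K pages = 1MB extent

def linear_read_ahead(page_no):
    ext = page_no // PAGES_PER_EXTENT
    start = (ext + 1) * PAGES_PER_EXTENT
    return list(range(start, start + PAGES_PER_EXTENT))   # next extent

def random_read_ahead(page_no, already_cached):
    ext = page_no // PAGES_PER_EXTENT
    start = ext * PAGES_PER_EXTENT
    return [p for p in range(start, start + PAGES_PER_EXTENT)
            if p not in already_cached]          # rest of this extent

pages = linear_read_ahead(70)   # page 70 sits in extent 1 -> prefetch extent 2
```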
