[MySQL Study Notes (14)] Introduction to the Buffer Pool

This article is published by the official account [Developing Pigeon]. Welcome to follow!



I. Buffer Pool

(1) Overview

       To cache disk pages in memory, InnoDB requests a contiguous block of memory from the operating system. This block is called the buffer pool. Its default size is 128MB, controlled by the system variable innodb_buffer_pool_size.

(2) Internal composition

       The buffer pool is divided into pages of the same size as the pages in an InnoDB tablespace, 16KB by default. These are called buffer pages. To manage them, each buffer page has a corresponding control block that stores control information such as the tablespace number the page belongs to and its page number. Control blocks and buffer pages both live in the buffer pool: the control blocks are placed at the front and the buffer pages at the back.
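
The layout described above can be sketched with a little arithmetic. This is only an illustration: the control block size used here is a made-up value, and the function name `layout` is hypothetical. It shows how a pool split into "control blocks first, buffer pages after" can leave a small unusable remainder.

```python
PAGE_SIZE = 16 * 1024        # default page size mentioned in the text
CONTROL_BLOCK_SIZE = 800     # hypothetical size, for illustration only


def layout(pool_bytes, page_size=PAGE_SIZE, ctl_size=CONTROL_BLOCK_SIZE):
    """Return (page_count, pages_offset, leftover) for a pool of pool_bytes.

    Each buffer page needs one control block, so one "slot" costs
    page_size + ctl_size bytes.
    """
    slot = page_size + ctl_size
    page_count = pool_bytes // slot
    pages_offset = page_count * ctl_size       # control blocks sit at the front
    leftover = pool_bytes - page_count * slot  # bytes too few for another slot
    return page_count, pages_offset, leftover


count, offset, leftover = layout(128 * 1024 * 1024)
print(count, offset, leftover)
```

Under these assumed sizes, the buffer pages start at byte `offset` of the pool, and `leftover` bytes remain unused because they cannot hold one more control block plus page.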


(3) Free linked list

       Buffer pages cache disk pages, so at any moment some of them are in use and some are not. To find a free buffer page quickly, the control blocks of all free buffer pages are linked into a list called the free list. The base node of the list holds the address of the head node, the address of the tail node, and the number of nodes currently in the list; each control block in the list holds pointers to the previous and next nodes.
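
A minimal sketch of such a list, assuming a doubly linked structure. The class names `ControlBlock` and `BaseNode` and the field `frame_no` are illustrative, not InnoDB's own identifiers.

```python
class ControlBlock:
    """Control block of one buffer page; carries the list pointers."""

    def __init__(self, frame_no):
        self.frame_no = frame_no  # which buffer page this block describes
        self.prev = None
        self.next = None


class BaseNode:
    """Base node of the list: head address, tail address, node count."""

    def __init__(self):
        self.head = None
        self.tail = None
        self.count = 0

    def push_back(self, block):
        """Append a control block at the tail."""
        block.prev, block.next = self.tail, None
        if self.tail is None:
            self.head = block
        else:
            self.tail.next = block
        self.tail = block
        self.count += 1

    def pop_front(self):
        """Detach and return the head block, or None if the list is empty."""
        block = self.head
        if block is None:
            return None
        self.head = block.next
        if self.head is None:
            self.tail = None
        else:
            self.head.prev = None
        block.prev = block.next = None
        self.count -= 1
        return block


# At startup every buffer page is free, so every control block joins the list.
free_list = BaseNode()
for n in range(4):
    free_list.push_back(ControlBlock(n))

first = free_list.pop_front()  # take one free buffer page
```

Because the base node tracks the head and the count, taking a free page or checking whether any are left costs constant time.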


(4) Determining whether a page is in the buffer pool

How do we determine whether a page is already cached in the buffer pool?
       A page is always located by tablespace number + page number, so tablespace number + page number is used as the key of a hash table, and the address of the corresponding buffer page's control block is used as the value. This lets us check quickly whether a page is in the buffer pool. If it is not, we take a free buffer page from the free list and load the page from disk into it.
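
The lookup can be sketched as follows. This is a simplified model, not InnoDB code: a Python dict plays the hash table, a plain list of frame indexes stands in for the free list, and `read_page` is a hypothetical callback that represents reading from disk.

```python
class BufferPool:
    def __init__(self, frame_count, read_page):
        self.frames = [None] * frame_count    # the buffer pages
        self.free = list(range(frame_count))  # stand-in for the free list
        self.page_hash = {}                   # (space_id, page_no) -> frame index
        self.read_page = read_page            # hypothetical disk-read callback

    def get_page(self, space_id, page_no):
        key = (space_id, page_no)
        frame = self.page_hash.get(key)
        if frame is None:
            # Miss: take a free buffer page and load the disk page into it.
            if not self.free:
                raise RuntimeError("no free buffer page; eviction is not modelled")
            frame = self.free.pop(0)
            self.frames[frame] = self.read_page(space_id, page_no)
            self.page_hash[key] = frame
        return self.frames[frame]


disk_reads = []


def fake_read(space_id, page_no):
    """Pretend disk read that records each call."""
    disk_reads.append((space_id, page_no))
    return f"page {space_id}:{page_no}"


pool = BufferPool(2, fake_read)
pool.get_page(1, 7)  # miss: read from "disk"
pool.get_page(1, 7)  # hit: served from the pool
```

Only the first call touches the fake disk; the second is answered from the hash table.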


(5) Flush linked list

       If we modify the data of a buffer page in the buffer pool, the page no longer matches its copy on disk. Such a page is called a dirty page. Flushing to disk immediately after every modification would hurt performance, so dirty pages are instead recorded in a linked list, called the flush list, and written back to disk later.
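
A small sketch of the idea, under simplifying assumptions: the names `Page`, `FlushList`, and `write_page` are illustrative, a Python list stands in for the linked list, and the whole list is flushed in one call.

```python
class Page:
    def __init__(self, page_no, data):
        self.page_no = page_no
        self.data = data
        self.dirty = False


class FlushList:
    def __init__(self, write_page):
        self.pages = []               # dirty pages waiting to be written
        self.write_page = write_page  # hypothetical disk-write callback

    def modify(self, page, data):
        """Change a page in memory; queue it only on its first modification."""
        page.data = data
        if not page.dirty:
            page.dirty = True
            self.pages.append(page)

    def flush(self):
        """Write every queued page to disk and return how many were written."""
        flushed = 0
        for page in self.pages:
            self.write_page(page.page_no, page.data)
            page.dirty = False
            flushed += 1
        self.pages.clear()
        return flushed


disk = {}


def write_to_disk(page_no, data):
    disk[page_no] = data


flush_list = FlushList(write_to_disk)
p = Page(3, "v0")
flush_list.modify(p, "v1")
flush_list.modify(p, "v2")  # already dirty, so it is not queued again
```

Two modifications cost a single disk write once `flush_list.flush()` runs, which is the performance benefit the paragraph describes.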


(6) LRU linked list

1. Simple LRU

       The size of the buffer pool is limited. If the pages that need to be cached take up more space than the buffer pool has, some old buffer pages must be removed to make room for new ones. To keep the hit rate of the buffer pool high, we want to evict the buffer pages that have been used least recently. This is done by maintaining a linked list and evicting pages on the least-recently-used principle, so the list is called the LRU (Least Recently Used) list.

       When a page needs to be accessed and is not in the buffer pool, it is loaded from disk into a buffer page, and the control block of that buffer page is inserted as a node at the head of the LRU list. The tail of the LRU list therefore holds the least recently used buffer pages. Once the free buffer pages in the buffer pool are used up, buffer pages at the tail of the LRU list can be evicted.
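
A minimal sketch of this scheme, using Python's `OrderedDict` in place of a linked list. It assumes, as in a textbook LRU, that a hit on a cached page also moves it to the head; the class name `SimpleLRU` is illustrative.

```python
from collections import OrderedDict


class SimpleLRU:
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # first key = head, last key = tail

    def access(self, page_id):
        """Return True on a hit, False if the page had to be loaded."""
        if page_id in self.pages:
            self.pages.move_to_end(page_id, last=False)  # back to the head
            return True
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=True)                # evict from the tail
        self.pages[page_id] = None
        self.pages.move_to_end(page_id, last=False)      # new page at the head
        return False


lru = SimpleLRU(3)
for page_id in [1, 2, 3, 1, 4]:
    lru.access(page_id)

print(list(lru.pages))  # head first
```

In this run page 2 is the one evicted when page 4 arrives, because page 1 was touched again and moved back to the head.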


2. LRU divided into areas

       In some cases a simple LRU hurts performance. For example, a full table scan of a very large table can replace every buffer page in the buffer pool, evicting the pages that are actually used frequently. Pages may also be loaded into the buffer pool without ever being used afterwards (for example, pages fetched by read-ahead). Both cases lower the hit rate of the buffer pool.

       Therefore, the LRU list is divided into two parts. One part holds frequently used buffer pages, called hot data; the other holds rarely used buffer pages, called cold data. The split is not half and half: the share of the cold area is given by the system variable innodb_old_blocks_pct, which defaults to 37 (percent). When a page on disk is loaded into a buffer page for the first time, the control block of that buffer page is placed at the head of the cold data area, so the frequently used buffer pages in the hot data area are not disturbed. If the buffer page is accessed again, its control block is moved to the head of the hot data area. (InnoDB additionally requires that this later access come more than innodb_old_blocks_time milliseconds after the first one, 1000 by default, so that a scan touching the same page several times in quick succession does not promote it.)

       Hot data may be accessed very often, and moving a node to the head on every access is unnecessary. A node in the hot data area is moved to the head of the LRU list only when it lies beyond the first 1/4 of the hot data area.
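
The two-area scheme can be sketched as below. This is a simplified model under stated assumptions: two Python lists stand in for the two parts of the linked list, the class name `MidpointLRU` and the `_rebalance` helper are illustrative, the hot area is kept within its share by demoting its tail to the cold head, and the time window that InnoDB also applies before promoting a page (innodb_old_blocks_time) is left out.

```python
class MidpointLRU:
    def __init__(self, capacity, old_pct=37):
        self.capacity = capacity
        self.old_cap = max(1, capacity * old_pct // 100)  # share of the cold area
        self.young = []  # hot area, index 0 = head of the whole list
        self.old = []    # cold area, index 0 = its head, last item = list tail

    def access(self, page_id):
        if page_id in self.young:
            pos = self.young.index(page_id)
            # Move to the head only if the page is beyond the first quarter.
            if pos >= len(self.young) / 4:
                self.young.remove(page_id)
                self.young.insert(0, page_id)
        elif page_id in self.old:
            # Accessed again while in the cold area: promote to the hot head.
            self.old.remove(page_id)
            self.young.insert(0, page_id)
            self._rebalance()
        else:
            # First load: evict from the cold tail if full, enter at the cold head.
            if len(self.young) + len(self.old) >= self.capacity:
                self.old.pop()
            self.old.insert(0, page_id)

    def _rebalance(self):
        """Demote hot-area tail pages so the hot area stays within its share."""
        while len(self.young) > self.capacity - self.old_cap:
            self.old.insert(0, self.young.pop())


lru = MidpointLRU(8)
for page_id in [1, 2, 1, 2]:      # pages 1 and 2 are each accessed twice
    lru.access(page_id)
for page_id in range(100, 120):   # a "table scan" touching 20 pages once each
    lru.access(page_id)

print(lru.young, lru.old)
```

In this run the scanned pages only churn through the cold area, while pages 1 and 2 stay in the hot area, which is the protection the split is meant to provide.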

Origin blog.csdn.net/Mrwxxxx/article/details/113865992