MySQL big queries: will a full table scan exhaust memory?

We often run into questions like this: my host has only 100GB of memory, and I want to do a full table scan on a 200GB table. Will the database host run out of memory? This is worth worrying about: being killed by the system's OOM (out of memory) mechanism is no joke.

But think about it the other way around. A logical backup is essentially a scan of the whole database; if that could eat up all the memory, logical backups would have broken long ago. So a full table scan on a big table seems to be fine. But what exactly does the process look like?

The impact of a full table scan on the server layer

We already know that InnoDB stores table data in the primary key index, so a full table scan actually scans the primary key index of table t directly. Every row found is put into the result set and returned to the client; in fact, the server never needs to hold a complete result set in memory. The process of fetching and sending data works as follows (a simplified sketch follows the list):

  • Get a row and write it into net_buffer. The size of this buffer is defined by the parameter net_buffer_length; the default is 16k.
  • Keep fetching rows until net_buffer is full, then call the network interface to send it out.
  • If the send succeeds, clear net_buffer, then continue fetching the next row and writing it into net_buffer.
  • If the send function returns EAGAIN or WSAEWOULDBLOCK, it means the local network stack (the socket send buffer) is full, so the thread waits until the network stack is writable again, then continues sending.
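To make the loop concrete, here is a minimal sketch of the idea in Python. It is not MySQL's actual code: NET_BUFFER_LENGTH and send_to_client are illustrative stand-ins for net_buffer_length and the socket write path.

    # A simplified model of the server's fetch-and-send loop, not real MySQL code.
    NET_BUFFER_LENGTH = 16 * 1024          # default net_buffer_length is 16k

    def send_to_client(chunk: bytes) -> None:
        # Stand-in for writing to the socket send buffer. If the client reads
        # too slowly, a real send() would return EAGAIN/WSAEWOULDBLOCK and the
        # server thread would wait here until the buffer is writable again.
        pass

    def stream_rows(rows):
        net_buffer = bytearray()
        for row in rows:                               # fetch one row at a time
            net_buffer.extend(row)                     # write it into net_buffer
            if len(net_buffer) >= NET_BUFFER_LENGTH:   # buffer full:
                send_to_client(bytes(net_buffer))      # send it out,
                net_buffer.clear()                     # then keep fetching rows
        if net_buffer:
            send_to_client(bytes(net_buffer))          # flush whatever is left

The point of the sketch is that at most one net_buffer's worth of rows is held on the server at a time, which is why a 200GB scan does not need 200GB of server memory.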

In other words, MySQL reads and sends as it goes, and this concept is very important. It means that if the client is slow to receive, the MySQL server cannot send the results out, and the statement (and its transaction) takes longer to execute.

For example, in the experiment below we deliberately stop the client from reading the content of its socket receive buffer, and then look at the result of show processlist on the server.

    (Figure: show processlist output while the client stops reading; the State column shows "Sending to client".)

If you see that the value of State is always "Sending to client", it means that the server-side network stack is full.
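If you want to reproduce the experiment, any client that streams the result and then stops reading will do. A possible sketch with pymysql (the connection parameters, the table t, and the 10-minute pause are all assumptions of this example):

    import time
    import pymysql
    import pymysql.cursors

    # Stream the result (no client-side buffering), read one row, then stop.
    # Rows pile up in the socket buffers, and on the server this thread's
    # State in `show processlist` should switch to "Sending to client".
    conn = pymysql.connect(host="127.0.0.1", user="root", password="...",
                           database="test",
                           cursorclass=pymysql.cursors.SSCursor)
    with conn.cursor() as cur:
        cur.execute("select * from t")   # t is assumed to be a large table
        print(cur.fetchone())            # read a single row...
        time.sleep(600)                  # ...then stop reading for 10 minutes
    conn.close()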

If the client uses the --quick option, the mysql_use_result method is used, which reads one row and processes it before fetching the next. You can imagine that if the per-row business logic is complicated and slow, the client will take a long time to fetch the next row, and the situation above can occur.

For normal online business, if a query does not return many results, I suggest using the mysql_store_result interface and saving the whole query result in the client's local memory.
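In a Python connector such as pymysql, the same contrast shows up as buffered versus unbuffered cursors (an analogy to the C API, not the C API itself; the connection details and process_row are placeholders):

    import pymysql
    import pymysql.cursors

    def process_row(row):
        pass                             # hypothetical, possibly slow, business logic

    conn = pymysql.connect(host="127.0.0.1", user="root",
                           password="...", database="test")

    # Buffered (like mysql_store_result): the whole result set is pulled into
    # client memory at once, so the server is done sending quickly.
    with conn.cursor() as cur:
        cur.execute("select * from t limit 1000")
        rows = cur.fetchall()

    # Unbuffered (like mysql_use_result / --quick): rows are fetched one by one.
    # If process_row() is slow, the server can sit in "Sending to client".
    with conn.cursor(pymysql.cursors.SSCursor) as cur:
        cur.execute("select * from t")
        for row in cur:
            process_row(row)
    conn.close()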

If you see many threads in the "Sending to client" state, it means you should ask the business developers to optimize the queries and evaluate whether returning so many results is reasonable. If you want to quickly reduce the number of threads in this state, setting net_buffer_length to a larger value is an option.
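A quick way to check is to count the threads in that state via information_schema.processlist; raising net_buffer_length is then a single SET statement (the 1MB value below is only an example, and the connection details are placeholders):

    import pymysql

    conn = pymysql.connect(host="127.0.0.1", user="root", password="...")
    with conn.cursor() as cur:
        # How many threads are waiting for their clients to read results?
        cur.execute("select count(*) from information_schema.processlist "
                    "where state = 'Sending to client'")
        print("threads in 'Sending to client':", cur.fetchone()[0])

        # Give each connection a bigger send buffer (example value: 1MB).
        cur.execute("set global net_buffer_length = 1048576")
    conn.close()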

If the client is under too much pressure to receive the results in time, MySQL cannot send them and the execution of the statement is affected; but that is not the worst part. The worst part is the resulting "long transaction". To understand the impact of long transactions, recall the locking and MVCC discussed in earlier articles (a small row-lock demonstration follows the list):

  • If an earlier statement in the transaction was an update, the transaction is still holding row locks, which blocks updates from other statements;
  • A read-only transaction is also a problem: the undo logs it may still need cannot be purged, which causes the rollback segment to keep growing.
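The row-lock half of this is easy to demonstrate with two connections. A minimal sketch (table t, its columns id and c, and the connection details are all assumptions; the second update simply blocks until the first transaction ends or innodb_lock_wait_timeout fires):

    import pymysql

    def connect():
        return pymysql.connect(host="127.0.0.1", user="root", password="...",
                               database="test", autocommit=False)

    # Session 1: update a row and deliberately leave the transaction open,
    # so the row lock stays held.
    s1 = connect()
    with s1.cursor() as cur:
        cur.execute("update t set c = c + 1 where id = 1")

    # Session 2: the same-row update now waits on session 1's row lock.
    s2 = connect()
    with s2.cursor() as cur:
        cur.execute("update t set c = c + 1 where id = 1")   # blocks here

    s1.rollback(); s2.rollback()
    s1.close(); s2.close()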

A state that looks very similar to "Sending to client" is "Sending data", and it is often misunderstood. In fact, the state of a query statement changes like this (unrelated states omitted):

  • After a MySQL query statement enters the execution phase, its state is first set to "Sending data";
  • Then the column metadata of the result is sent to the client;
  • Then the statement continues executing;
  • After execution completes, the state is set to an empty string.

In other words, "Sending data" does not necessarily mean "data is being sent", but may be at any stage in the actuator process .

Therefore, a thread shows "Sending to client" only when it is waiting for the client to receive the result; if it shows "Sending data", that just means "executing".
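So when diagnosing, it helps to look at the distribution of states rather than a single one. An illustrative query (connection details are placeholders):

    import pymysql

    conn = pymysql.connect(host="127.0.0.1", user="root", password="...")
    with conn.cursor() as cur:
        # "Sending data" mostly just means "executing";
        # "Sending to client" means waiting for the client to read the result.
        cur.execute("select state, count(*) from information_schema.processlist "
                    "group by state order by count(*) desc")
        for state, cnt in cur.fetchall():
            print(state, cnt)
    conn.close()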

The impact of a full table scan on InnoDB

Data pages in memory are managed in the Buffer Pool (BP). Under WAL, the Buffer Pool accelerates updates; but it has an even more important role, which is to speed up queries. How much the Buffer Pool speeds up queries depends on one key metric: the memory hit rate. Generally, for a stable online service to meet its response-time requirements, the memory hit rate should be above 99%.

Execute show engine innodb status and you will see a line containing "Buffer pool hit rate", which shows the current hit rate.
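For example, you can pull that line out of the status text programmatically (the parsing below just searches for the literal words; connection details are placeholders):

    import pymysql

    conn = pymysql.connect(host="127.0.0.1", user="root", password="...")
    with conn.cursor() as cur:
        cur.execute("show engine innodb status")
        status_text = cur.fetchone()[2]        # columns: (Type, Name, Status)
        for line in status_text.splitlines():
            if "Buffer pool hit rate" in line:
                print(line.strip())            # e.g. "Buffer pool hit rate 992 / 1000, ..."
    conn.close()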

The size of the InnoDB Buffer Pool is set by the parameter innodb_buffer_pool_size, and it is generally recommended to set it to 60%~80% of the available physical memory. If the Buffer Pool is full and a data page must be read from disk, an existing data page has to be evicted. InnoDB manages this memory with a Least Recently Used (LRU) algorithm, whose core idea is to evict the data that has gone unused the longest.
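A small sanity check of the current setting against the host's physical memory might look like this (os.sysconf works on Linux; the 60%~80% band is the rule of thumb above, and the connection details are placeholders):

    import os
    import pymysql

    # Total physical memory on a Linux host.
    phys_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")

    conn = pymysql.connect(host="127.0.0.1", user="root", password="...")
    with conn.cursor() as cur:
        cur.execute("select @@innodb_buffer_pool_size")
        bp_bytes = cur.fetchone()[0]
    conn.close()

    print(f"buffer pool is {bp_bytes / phys_bytes:.0%} of physical memory "
          f"(rule of thumb: 60%~80%)")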

InnoDB's LRU algorithm for the Buffer Pool is implemented with a linked list: the data page accessed most recently is moved to the head of the list. If a page being accessed is not in the list, a new data page must be allocated in the Buffer Pool; but since the memory is already full, no new memory can be allocated. Instead, the data page at the tail of the list is evicted, its memory is reused to store the new page's content, and that page is placed at the head of the list.
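The unoptimized policy is just textbook LRU. A toy sketch with an OrderedDict (page ids stand in for data pages, POOL_SIZE is arbitrary, and the end of the dict plays the part of the list head):

    from collections import OrderedDict

    POOL_SIZE = 4                          # tiny pool, just for illustration

    class NaiveLRU:
        def __init__(self):
            self.pages = OrderedDict()     # most recently used at the end

        def access(self, page_id):
            if page_id in self.pages:          # hit: move to the "head"
                self.pages.move_to_end(page_id)
            else:                              # miss: evict the coldest page
                if len(self.pages) >= POOL_SIZE:
                    self.pages.popitem(last=False)
                self.pages[page_id] = True

    pool = NaiveLRU()
    for p in [1, 2, 3, 4, 3, 5]:           # accessing page 5 evicts page 1
        pool.access(p)
    print(list(pool.pages))                # [2, 4, 3, 5]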

At first glance the algorithm seems fine. But think about what happens during a full table scan: following this algorithm, all the data currently in the Buffer Pool is evicted and replaced by the pages accessed during the scan. In other words, the Buffer Pool ends up holding mainly the data of this historical table. For an instance that is serving live business traffic, this is bad: you will see the Buffer Pool hit rate drop sharply, disk pressure increase, and SQL response times slow down.

In fact, InnoDB has improved the LRU algorithm:

    (Figure: the improved LRU algorithm; the linked list is split into a young area and an old area at LRU_old.)

In InnoDB's implementation, the whole LRU list is divided into a young area and an old area at a ratio of 5:3. In the figure, LRU_old points to the first position of the old area, at the 5/8 point of the list. In other words, the 5/8 of the list near the head is the young area, and the 3/8 near the tail is the old area. The improved LRU algorithm then executes as follows (a toy sketch follows the list):

  1. In state 1, data page P3 is to be accessed. Since P3 is in the young area, it is moved to the head of the list exactly as in the unoptimized LRU algorithm, giving state 2.
  2. Then a data page not currently in the list is accessed. The page at the tail (Pm) is still evicted, but the newly inserted page Px is placed at LRU_old instead of at the head.
  3. A data page in the old area is checked on every access:
  • If the data page has been in the LRU list for more than 1 second, move it to the head of the list;
  • If it has been in the list for less than 1 second, its position stays unchanged. The 1-second threshold is controlled by the parameter innodb_old_blocks_time, whose default value is 1000 milliseconds.
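A toy version of the improved policy, just to make the 5:3 split and the 1-second rule concrete (POOL_SIZE, the clock handling and the eviction details are simplifications; real InnoDB is far more involved):

    import time
    from collections import OrderedDict

    POOL_SIZE = 8
    YOUNG_SIZE = POOL_SIZE * 5 // 8        # young : old = 5 : 3
    OLD_BLOCKS_TIME = 1.0                  # innodb_old_blocks_time = 1000 ms

    class SplitLRU:
        """Toy model: the end of each OrderedDict plays the part of the list head."""

        def __init__(self):
            self.young = OrderedDict()     # page_id -> True
            self.old = OrderedDict()       # page_id -> time of first access

        def access(self, page_id, now=None):
            now = time.monotonic() if now is None else now
            if page_id in self.young:
                self.young.move_to_end(page_id)        # classic LRU move
            elif page_id in self.old:
                # Promote to young only after it has lived in the list > 1 second;
                # otherwise leave it where it is.
                if now - self.old[page_id] > OLD_BLOCKS_TIME:
                    del self.old[page_id]
                    self._insert_young(page_id, now)
            else:
                # Miss: evict from the tail of the old area, then place the new
                # page at LRU_old instead of at the head of the whole list.
                if len(self.young) + len(self.old) >= POOL_SIZE:
                    victim = self.old if self.old else self.young
                    victim.popitem(last=False)
                self.old[page_id] = now

        def _insert_young(self, page_id, now):
            if len(self.young) >= YOUNG_SIZE:          # demote the coldest young page
                demoted, _ = self.young.popitem(last=False)
                self.old[demoted] = now
            self.young[page_id] = True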

This strategy is tailored for operations like full table scans. During a sequential scan, a newly read page is accessed several times in quick succession (once for each row on that page) and then never again; because all of those accesses happen within 1 second, the page is never moved out of the old area and is soon evicted. The biggest benefit is therefore that, while the big table is being scanned, the Buffer Pool is still used but the young area is untouched, so the Buffer Pool hit rate for normal business queries is preserved.

Content source: Lin Xiaobin, "45 Lectures on MySQL in Practice"

Origin: blog.csdn.net/qq_24436765/article/details/112846556