MySQL queries are slow. Besides missing indexes, what else could it be?

I've been writing CRUD code with Ctrl+C and Ctrl+V for many years.

Why is a MySQL query slow? This question comes up often in day-to-day development, and it is also a high-frequency interview question.

When we run into this kind of problem, we usually assume it is an index issue.

In addition to indexes, what other factors can cause database queries to slow down?

And what operations can improve MySQL's query performance?

In today's article, let's walk through the scenarios that make database queries slow, along with the reasons and the fixes.

Database query process

Let's first look at the process a query statement goes through.

Suppose we have a database table:

CREATE TABLE `user` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `name` varchar(100) NOT NULL DEFAULT '' COMMENT 'name',
  `age` int(11) NOT NULL DEFAULT '0' COMMENT 'age',
  `gender` int(8) NOT NULL DEFAULT '0' COMMENT 'gender',
  PRIMARY KEY (`id`),
  KEY `idx_age` (`age`),
  KEY `idx_gender` (`gender`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

The application code we write (in Go or C++, say) plays the role of the client here.

Under the hood, the client establishes a long-lived TCP connection to MySQL, authenticating with a username and password.

MySQL's connection management module manages this connection.

Once the connection is established, the client executes a SQL query, for example:

select * from user where gender = 1 and age = 100;

The client sends this SQL statement to MySQL over the network.

After MySQL receives the statement, the analyzer first checks it for syntax errors. For example, if you drop a letter from select and write slect, you get the error You have an error in your SQL syntax;. An error all too familiar to a clumsy typist like me.

Next comes the optimizer, which chooses which index to use according to certain rules.

After that, the executor calls the storage engine's interface functions.


MySQL architecture

The storage engine is a pluggable component: it is where MySQL actually fetches row-by-row data and returns it. It can be swapped out, for example MyISAM, which does not support transactions, or InnoDB, which does. You specify it when creating the table, for example:

CREATE TABLE `user` (
  ...
) ENGINE=InnoDB;

The most commonly used one today is InnoDB.

Let's focus on this.

In InnoDB, because operating on disk directly is slow, a layer of memory called the buffer pool is added to speed things up. It holds many 16KB memory pages: some contain the table's row-by-row data, others contain index information.


Buffer pool and disk

When the query reaches InnoDB, MySQL uses the index chosen earlier by the optimizer to find the corresponding index pages; if they are not in the buffer pool, they are loaded from disk. The index pages then speed up the lookup of where the data pages live, and if those data pages are not in the buffer pool either, they too are loaded from disk.

This way we get the row-by-row data we want.


The relationship between index pages and disk pages

Finally, the obtained data result is returned to the client.

Slow query analysis

If any step of the above process is slow, we can find out which one by turning on profiling:

mysql> set profiling=ON;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> show variables like 'profiling';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| profiling     | ON    |
+---------------+-------+
1 row in set (0.00 sec)

Then execute the sql statement normally.

The execution time of these SQL statements is recorded. To see which statements were recorded, run show profiles;

mysql> show profiles;
+----------+------------+---------------------------------------------------+
| Query_ID | Duration   | Query                                             |
+----------+------------+---------------------------------------------------+
|        1 | 0.06811025 | select * from user where age>=60                  |
|        2 | 0.00151375 | select * from user where gender = 2 and age = 80  |
|        3 | 0.00230425 | select * from user where gender = 2 and age = 60  |
|        4 | 0.00070400 | select * from user where gender = 2 and age = 100 |
|        5 | 0.07797650 | select * from user where age!=60                  |
+----------+------------+---------------------------------------------------+
5 rows in set, 1 warning (0.00 sec)

Note the Query_ID column above; for example, select * from user where age>=60 has Query_ID 1. To see exactly where that statement's time went, execute the following command.

mysql> show profile for query 1;
+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.000074 |
| checking permissions | 0.000010 |
| Opening tables       | 0.000034 |
| init                 | 0.000032 |
| System lock          | 0.000027 |
| optimizing           | 0.000020 |
| statistics           | 0.000058 |
| preparing            | 0.000018 |
| executing            | 0.000013 |
| Sending data         | 0.067701 |
| end                  | 0.000021 |
| query end            | 0.000015 |
| closing tables       | 0.000014 |
| freeing items        | 0.000047 |
| cleaning up          | 0.000027 |
+----------------------+----------+
15 rows in set, 1 warning (0.00 sec)

These items show where the time actually went. For example, you can see that Sending data took the most time; this covers the time the executor spends reading rows and sending them to the client. Since my table has tens of thousands of matching rows, it is expected that this stage dominates.

Normally, during development most of the time goes to the Sending data stage, and if that stage is slow, the most likely cause is index related.

Index-related reasons

Index-related problems can generally be analyzed with the explain command, which shows which indexes are used and roughly how many rows will be scanned.

In the optimizer stage, MySQL chooses the index it thinks will make the query fastest.

Several factors are generally considered, such as:

  • How many rows to scan to select this index

  • How many 16KB pages must be read to fetch those rows

  • A secondary index requires an extra lookup back to the primary key (a table lookup), while the primary key index does not; how expensive are those lookups?

Back to the SQL from show profile above: let's analyze select * from user where age>=60 with explain.


explain sql

For this statement, type is ALL, which means a full table scan. possible_keys lists the indexes that might be used; here that is the secondary index on age. But key, the column showing the index actually used, is NULL. In other words, this SQL did not use an index at all; it did a full table scan.

That is because too many rows in the table match the condition (rows). If the age index were used, the matching entries would be read from it, and since age is a secondary index, each entry would require a table lookup back to the primary key to find the data page. With that many rows, it is cheaper to scan the primary key directly, so MySQL chose a full table scan.
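The original screenshot didn't survive the transfer, but the explain output for such a statement looks roughly like this (the rows value is illustrative, not from a real run):

```sql
mysql> explain select * from user where age >= 60;
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
|  1 | SIMPLE      | user  | ALL  | idx_age       | NULL | NULL    | NULL | 90000 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
```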

Of course, the above is just one example. In practice, MySQL often ends up using no index at all, or an index other than the one we expect. There are many index-failure scenarios, such as using an inequality operator or an implicit type conversion; you have probably memorized plenty of these for interviews, so I won't repeat them here.

Let’s talk about two problems that are easy to encounter in production.

Index is not as expected

In real development, some situations are special. A table may start out with little data and few indexes, and your SQL uses exactly the index you expect. But over time more people work on the project, the data grows, redundant indexes get added, and MySQL may start picking an index you did not expect. The result: the query is suddenly slow.

This kind of problem is easy to fix by specifying the index with force index. For example:


force index specifies the index

From explain you can see that after adding force index, the SQL uses the idx_age index.
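Since the screenshot is lost, here is a sketch of the statement, using the user table and idx_age index from earlier:

```sql
-- force the optimizer to use idx_age instead of letting it choose
explain select * from user force index (idx_age) where age >= 60;
-- in the explain output, key now shows idx_age instead of NULL
```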

The index is still very slow

Some SQL clearly uses an index according to explain, yet is still very slow. There are generally two situations:

The first is that the index's selectivity is too low. For example, suppose you index the full URL of web pages: at a glance they all share the same domain, so if the prefix index is not long enough, walking it is little better than a full table scan. The right approach is to make the index more selective, for example by stripping the domain and indexing only the URI part.


Index prefix selectivity is too low
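Before creating a prefix index, you can check whether a given prefix length is selective enough. A sketch, assuming a hypothetical table t with a url column (neither is from the example schema above):

```sql
-- fraction of rows distinguished by a 16-character prefix (closer to 1 is better)
select count(distinct left(url, 16)) / count(*) from t;

-- if the prefix is selective enough, index only the first 16 characters
alter table t add key idx_url (url(16));
```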

The second is that the index matches too much data. In that case, pay attention to the rows field in explain.

It estimates the number of rows this query needs to examine. It is not exact, but it reflects the rough order of magnitude.

When it is large, the following situations are generally common.

  • If the field should be unique, such as a phone number, there should not be many duplicates. A large rows value may mean your code is doing repeated inserts; check the code logic, or add a unique index to enforce uniqueness.

  • If this field genuinely matches a lot of data, do you need all of it? If not, add a limit. If you really need all of it, don't fetch it in one query: your data may be small today, but as it grows, pulling tens of thousands of rows at once becomes a problem. Fetch in batches instead: sort with order by id, then use the largest id of each batch as the starting position for fetching the next batch.
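The batching idea above (use the largest id of the previous batch as the next starting point, i.e. keyset pagination) can be sketched in Go. Here fetchBatch is a stand-in for a real query like select * from user where id > ? order by id limit ?, simulated with an in-memory slice so the sketch is self-contained:

```go
package main

import "fmt"

type row struct{ id int }

// all rows, already ordered by id, standing in for the user table
var table = []row{{1}, {3}, {7}, {8}, {12}, {20}, {21}}

// fetchBatch simulates: select * from user where id > ? order by id limit ?
func fetchBatch(afterID, limit int) []row {
	var out []row
	for _, r := range table {
		if r.id > afterID {
			out = append(out, r)
			if len(out) == limit {
				break
			}
		}
	}
	return out
}

func main() {
	lastID := 0 // start before the smallest id
	for {
		batch := fetchBatch(lastID, 3)
		if len(batch) == 0 {
			break // no more rows
		}
		fmt.Println("batch:", batch)
		lastID = batch[len(batch)-1].id // largest id in this batch
	}
}
```

Each iteration resumes exactly where the previous one stopped, so no offset scanning is needed no matter how deep into the table you are.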

Too few connections

That covers the index-related reasons. Now let's look at what else, besides indexes, can limit query speed.

As we saw, MySQL's server layer has a connection management module whose job is to manage the long-lived connections between clients and MySQL.

Normally, if the client has only one connection to the server layer, then after sending a query it can only block and wait for the result. If a large number of concurrent queries arrive, each request has to wait for the previous one to finish before it can start.


Too few connections will cause sql to block

That is why our applications, in Go or Java, sometimes log SQL statements taking minutes, while the same statement run on its own finishes in milliseconds: those statements were simply waiting for earlier SQL to complete.

How to solve it?

If we create a few more connections, requests can execute concurrently, and later queries no longer wait as long.


Adding connections can speed up the execution of sql

The number of connections can be limited on both the database side and the client side.

The number of database connections is too small

MySQL's maximum number of connections defaults to 100, and can be raised to at most 16384.

You can change it by setting MySQL's max_connections parameter:

mysql> set global max_connections= 500;
Query OK, 0 rows affected (0.00 sec)

mysql> show variables like 'max_connections';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| max_connections | 500   |
+-----------------+-------+
1 row in set (0.00 sec)

The above operation changed the maximum number of connections to 500.

The number of connections on the application side is too small

You have raised the database's connection limit, but nothing seems to change? SQL statements still take minutes, or even time out?

That may be because the number of connections on the application side (your Go or Java program, i.e. the MySQL client) is too small.

The connection between the application and MySQL is a long-lived TCP connection, and TCP needs a three-way handshake to establish a connection and a four-way handshake to tear it down. If you opened a new connection for every SQL statement, all that handshaking and waving would waste time. So applications generally maintain a connection pool: after a connection is used, it goes back into the pool, and the next time you want to execute SQL you just fish a connection out of it and reuse it. Very environmentally friendly.


Connection pool principle

When we write code, we usually operate the database through a third-party ORM library, and any mature ORM library will absolutely, ten-million-percent, have a connection pool.

This pool has a size, which controls your maximum number of connections. If your pool is smaller than the database's limit, raising the database's maximum number of connections has no effect.

Normally you can check your ORM library's documentation for how to set the pool size; it is just a few lines of code. For example, in Go, with gorm, it looks like this:

func Init() {
  db, err := gorm.Open(mysql.Open(conn), config)
  if err != nil {
    panic(err)
  }
  sqlDB, err := db.DB()
  if err != nil {
    panic(err)
  }
  // SetMaxIdleConns sets the maximum number of idle connections in the pool
  sqlDB.SetMaxIdleConns(200)
  // SetMaxOpenConns sets the maximum number of open database connections
  sqlDB.SetMaxOpenConns(1000)
}

buffer pool too small

The number of connections has gone up and the speed has also increased.

I have encountered interviewers who will ask, is there any other way to make it faster?

At this point you have to frown, pretend to think, and say: yes.

In the database query process earlier, we mentioned that once a query enters InnoDB, there is a layer of memory, the buffer pool, that caches disk data pages as memory pages; having to go to disk IO instead is slow.

In other words, the bigger the buffer pool, the more data pages it can hold, the more likely a SQL query hits the buffer pool, and the faster queries naturally get.

The buffer pool size can be queried with the following command; the unit is bytes:

mysql> show global variables like 'innodb_buffer_pool_size';
+-------------------------+-----------+
| Variable_name           | Value     |
+-------------------------+-----------+
| innodb_buffer_pool_size | 134217728 |
+-------------------------+-----------+
1 row in set (0.01 sec)

That is 128MB.

To make it bigger, you can execute:

mysql> set global innodb_buffer_pool_size = 536870912;
Query OK, 0 rows affected (0.01 sec)

mysql> show global variables like 'innodb_buffer_pool_size';
+-------------------------+-----------+
| Variable_name           | Value     |
+-------------------------+-----------+
| innodb_buffer_pool_size | 536870912 |
+-------------------------+-----------+
1 row in set (0.01 sec)

This raises the buffer pool to 512MB.

However, if the buffer pool size is fine and the query is slow for some other reason, resizing the buffer pool is pointless.

But here comes the problem.

How do you know if the buffer pool is too small?

For that, we can look at the buffer pool's cache hit rate.


View buffer pool hit rate

Running show status like 'Innodb_buffer_pool_%'; shows some buffer pool statistics.

Innodb_buffer_pool_read_requests is the number of read requests.

Innodb_buffer_pool_reads is the number of reads that had to go to the physical disk.

So the hit rate of the buffer pool can be obtained like this:

buffer pool hit rate = (1 - Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests) * 100%

For example, with my numbers above: 1 - 405/2278354 ≈ 99.98%. A very high hit rate.

Normally the buffer pool hit rate is above 99%; if it is lower than that, consider increasing innodb_buffer_pool_size.
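The formula above is trivial to wire into monitoring. A minimal sketch in Go, fed with the counter values from my example (the function name is mine, not a MySQL API):

```go
package main

import "fmt"

// hitRate computes the buffer pool hit rate from the two status counters:
// 1 - Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests
func hitRate(physicalReads, readRequests float64) float64 {
	if readRequests == 0 {
		return 1 // no requests yet; treat as fully hit
	}
	return 1 - physicalReads/readRequests
}

func main() {
	// counter values from the example above
	fmt.Printf("%.2f%%\n", hitRate(405, 2278354)*100) // prints 99.98%
}
```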

Of course, you can also hook this hit rate into monitoring, so that when SQL slows down in the middle of the night, you can locate the cause when you get to work in the morning. Very comfortable.

Any other tricks?

As mentioned above, the storage engine layer adds a buffer pool to cache memory pages, which speeds up queries.

By the same logic, the server layer could add a cache too, directly caching the result of the first query so that the next identical query returns immediately. Sounds great.

In theory, a cache hit does speed up the query. But this feature is of very limited use. The biggest problem: as soon as a table is updated, all cached results for that table are invalidated. Frequently updated tables mean constant cache invalidation, so the feature only suits tables that are rarely updated.

Moreover, this feature was removed in version 8.0. So it is fine material for chat; there is no need to actually use it in production.


The query cache was removed

Summary

  • Slow queries are usually an index problem: either the wrong index was chosen, or too many rows are examined.

  • Too few connections, on the client side or the database side, limits how many SQL queries can run concurrently; increasing the connection count improves speed.

  • InnoDB has a memory layer, the buffer pool, that improves query speed. Its hit rate is normally above 99%; if it is lower, consider increasing the buffer pool size, which can also improve speed.

  • The query cache can indeed speed up queries, but it is generally not recommended because its restrictions are severe, and the feature was removed in MySQL 8.0.

Finally

Lately the readership of my original posts has been steadily declining, and I've been tossing and turning at night thinking it over.

I have an immature request.

It's been a long time since I left Guangdong, and it's been ages since anyone called me "pretty boy".

Could you call me a pretty boy in the comments?

Can such a simple, kind wish of mine be granted?

If you really can't bring yourself to say it, could you help me tap "like" and "looking" in the lower right corner?

Enough said; let's go choke together in the ocean of knowledge.

Follow the official account: [Xiaobai debug]

Origin blog.csdn.net/ilini/article/details/124139560