MySQL paging performance optimization for a single table with millions of rows (LIMIT optimization)

On one of my own websites, a single table has grown to about a million rows, which has made data access very slow. The Google Analytics backend often reports timeouts, especially for pages with large page numbers.

Test environment:

First, let's use a basic SQL statement to look at the basic information of the table we are going to test:

use information_schema
SELECT * FROM TABLES WHERE TABLE_SCHEMA = 'dbname' AND TABLE_NAME = 'product'

Query result:

From the result we can see the basic information of the table:

Number of table rows: 866633
Average data length per row: 5133 bytes
Single table size: 4448700632 bytes
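
These three figures come from the TABLE_ROWS, AVG_ROW_LENGTH, and DATA_LENGTH columns of information_schema.TABLES, so if you only need them, a narrower query works as well (a minimal sketch, with the same dbname placeholder as above):

SELECT TABLE_ROWS, AVG_ROW_LENGTH, DATA_LENGTH FROM TABLES WHERE TABLE_SCHEMA = 'dbname' AND TABLE_NAME = 'product'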

Row length and table size are given in bytes. A quick calculation tells us:
Average row length: about 5 KB
Total size of the single table: about 4.1 GB
The table contains fields of various types, such as varchar, datetime, and text. The id field is the primary key.
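
As a quick sanity check, these figures follow directly from the raw numbers above (a minimal arithmetic sketch using the values from the query result):

SELECT 4448700632 / 866633                 -- ≈ 5133 bytes per row, about 5 KB
SELECT 4448700632 / 1024 / 1024 / 1024     -- ≈ 4.14, about 4.1 GB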

Test experiment

1. Directly use the limit start, count paging statement, which is also the method used in my program:

select * from product limit start, count
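
As a reminder of how the paging parameters work: with count rows per page, page N corresponds to start = (N - 1) * count. For example, with 20 rows per page (the page size used throughout this article), page 10 would be fetched like this (a minimal sketch):

select * from product limit 180, 20   -- page 10: start = (10 - 1) * 20 = 180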
When the starting offset is small, the query has no performance problems. Let's look at the execution times for paging starting from offsets 10, 100, 1000, and 10000 (20 rows per page):

select * from product limit 10, 20      0.016 seconds
select * from product limit 100, 20     0.016 seconds
select * from product limit 1000, 20    0.047 seconds
select * from product limit 10000, 20   0.094 seconds

We can see that as the starting record increases, the time increases as well, which shows that the execution time of the LIMIT paging statement is closely related to the starting offset. So let's move the starting record to 400,000, roughly the middle of the table:

select * from product limit 400000, 20   3.229 seconds

Now look at how long it takes to fetch the last page of records:

select * from product limit 866613, 20   37.44 seconds

It is no wonder that search engines often report timeouts when crawling our pages; query times like this are clearly unacceptable.
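
The underlying reason is that with limit start, count MySQL still has to generate and step over the first start rows before it can return the count rows we actually want. You can see this in the execution plan of the slow query (a minimal sketch; with no WHERE clause, the plan is essentially a scan over the whole table):

explain select * from product limit 866613, 20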

From this, we can also conclude two things:
1) The query time of a LIMIT statement is proportional to the position of the starting record.
2) MySQL's LIMIT statement is very convenient, but it is not suitable for direct use on tables with many records.

2. Performance optimization for the LIMIT paging problem

Use a covering index on the table to speed up the paging query.
As we all know, if a query only needs columns that are contained in the index it uses (a covering index), it will run very fast.

This is because index lookups are optimized and the required data sits right in the index, so there is no need to go back and look up the actual row, which saves a lot of time. In addition, MySQL also caches index data, and making better use of that cache helps under high concurrency.

In our example, we know that the id field is the primary key, so it is naturally covered by the default primary key index. Now let's see how a query that uses the covering index performs:

This time we query the last page of data directly (using the covering index, selecting only the id column), as follows:

select id from product limit 866613, 20   0.2 seconds

Compared with the 37.44 seconds it took to query all columns, this is more than 100 times faster.

So what if we also need all the columns? There are two approaches: one uses the id >= form, and the other uses a JOIN. Let's look at the actual results:

SELECT * FROM product WHERE ID >= (select id from product limit 866613, 1) limit 20

The query time is 0.2 seconds, a real qualitative leap, haha.

The other way to write it:
SELECT * FROM product a JOIN (select id from product limit 866613, 20) b ON a.ID = b.id

The query time is also very short. Nice!

In fact, both use the same principle, so the results are about the same.
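
Either pattern generalizes to any page: compute the offset in application code and plug it into the inner, id-only subquery, so the outer query never has to scan full rows just to skip them. A minimal sketch (the offset 400000 is simply the 400,000-record example from earlier; in practice it would be (page - 1) * 20 computed by the application):

SELECT * FROM product WHERE ID >= (select id from product limit 400000, 1) limit 20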
