MySQL pagination query performance optimization: LIMIT paging on a single table with one million records

MySQL LIMIT paging query performance optimization

 

Paging queries in MySQL are very simple to write, but once the table holds a large amount of data, the ordinary paging approach breaks down.

Traditional paging query: SELECT c1, c2, cn ... FROM table LIMIT n, m

The way LIMIT works is that MySQL reads the first n records, discards them, and then reads the m records that are actually wanted. So the larger the offset n, the worse the performance.
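For example, annotated with what MySQL actually does (using the placeholder table and column names from above):

  -- With n = 20000 and m = 10, MySQL scans 20010 rows in total:
  -- it reads and discards the first 20000 rows, then returns the 10 wanted ones.
  SELECT c1, c2 FROM table LIMIT 20000, 10;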

Recommended paging query methods:

1. Give an approximate range for the query whenever possible:

  SELECT c1,c2,cn... FROM table WHERE id >= 20000 LIMIT 10;

2. Subquery method:

  SELECT c1,c2,cn... FROM table WHERE id >=
  (
      SELECT id FROM table LIMIT 20000,1
  )
  LIMIT 10;

3. The covering-index (index-only read) method mentioned in the book High Performance MySQL

SQL before optimization:

  SELECT c1,c2,cn... FROM member ORDER BY last_active LIMIT 50,5

Optimized SQL:

  SELECT c1,c2,cn...
  FROM member
  INNER JOIN (SELECT member_id FROM member ORDER BY last_active LIMIT 50,5) AS t
  USING (member_id)

The difference is that the SQL before optimization wastes I/O: it reads the index, then reads the full data rows, and then discards the rows it does not need. In the optimized SQL, the subquery reads only the (covering) index, and the needed columns are then fetched by member_id.
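For the optimized subquery to stay index-only, last_active needs an index; in InnoDB a secondary index also stores the primary key columns, so an index on last_active alone covers the (last_active, member_id) subquery, assuming member_id is the primary key. A minimal sketch (the index name is made up):

  ALTER TABLE member ADD INDEX idx_last_active (last_active);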

4. First read the IDs in the application, then fetch the desired records with IN

The application first reads the IDs, then fetches the rows:

  SELECT id FROM table LIMIT 20000, 10;
  SELECT c1,c2,cn... FROM table WHERE id IN (id1, id2, idn...);
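For instance, the two steps might run like this (the id values returned by the first query are hypothetical):

  SELECT id FROM table LIMIT 20000, 3;
  -- Suppose it returns the ids 20001, 20005, 20012; the second query then becomes:
  SELECT c1, c2 FROM table WHERE id IN (20001, 20005, 20012);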

 

 

==============

 

 

Analysis and optimization of MySQL LIMIT paging query performance

First, LIMIT usage

When querying, we often need to return just the first few rows, or a few rows from the middle of the result set. How do we do that? No need to worry: MySQL already provides this capability for us.

SELECT * FROM table LIMIT [offset,] rows | rows OFFSET offset

For example:

SELECT * FROM table WHERE condition1 = 0 AND condition2 = 0 AND condition3 = -1 AND condition4 = -1 ORDER BY id ASC LIMIT 2000 OFFSET 50000

The LIMIT clause can be used to force a SELECT statement to return only a specified number of records. LIMIT takes one or two numeric arguments, which must be integer constants. With two arguments, the first specifies the offset of the first row to return and the second specifies the maximum number of rows to return. The offset of the initial row is 0 (not 1). For compatibility with PostgreSQL, MySQL also supports the syntax LIMIT # OFFSET #.

mysql> SELECT * FROM table LIMIT 5,10; // returns rows 6-15

// To retrieve all rows from a given offset to the end of the result set, use some very large number as the second argument (MySQL does not accept -1 here):

mysql> SELECT * FROM table LIMIT 95,18446744073709551615; // returns rows 96 through the last

// With only one argument, it specifies the maximum number of rows to return:
mysql> SELECT * FROM table LIMIT 5; // returns the first 5 rows
// In other words, LIMIT n is equivalent to LIMIT 0,n.

Second, analysis of MySQL paging query performance

Compared with MS SQL Server's TOP syntax, MySQL's LIMIT syntax looks a lot more elegant for paging SQL statements. Using it for pagination is the most natural thing.

The most basic paging method:

SELECT ... FROM ... WHERE ... ORDER BY ... LIMIT ... 

With small amounts of data, such SQL is entirely sufficient; the only problem to watch out for is making sure the query uses an index.
For example, if the actual SQL statement looks like the one below, it is better to build a composite index on the two columns category_id and id (a sketch of the index follows the query):

SELECT * FROM articles WHERE category_id = 123 ORDER BY id LIMIT 50, 10 
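A minimal sketch of that composite index (the index name is made up):

ALTER TABLE articles ADD INDEX idx_category_id (category_id, id);

With this index in place, both the WHERE filter and the ORDER BY can be served by the index, avoiding a filesort.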

Subquery pagination:

As the amount of data grows and there are more and more pages, the SQL for viewing a page far into the list might look like:

SELECT * FROM articles WHERE category_id = 123 ORDER BY id LIMIT 10000, 10

In a nutshell: the further back you page, the larger the offset in the LIMIT statement becomes, and the slower the query gets.
At this point, we can improve paging efficiency with a subquery, as follows:

SELECT * FROM articles WHERE id >= (SELECT id FROM articles WHERE category_id = 123 ORDER BY id LIMIT 10000, 1) LIMIT 10 

JOIN pagination:

SELECT * FROM `content` AS t1 JOIN (SELECT id FROM `content` ORDER BY id desc LIMIT ".($page-1)*$pagesize.", 1) AS t2 WHERE t1.id <= t2.id ORDER BY t1.id desc LIMIT $pagesize; 
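The statement above is assembled in PHP; with hypothetical values $page = 3 and $pagesize = 20, the generated SQL would be:

SELECT * FROM `content` AS t1 JOIN (SELECT id FROM `content` ORDER BY id DESC LIMIT 40, 1) AS t2 WHERE t1.id <= t2.id ORDER BY t1.id DESC LIMIT 20;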

In my tests, JOIN pagination and subquery pagination are roughly on the same level of efficiency; both consume basically the same amount of time.
EXPLAIN output for the JOIN statement:

id  select_type  table       type    possible_keys  key      key_len  ref   rows   Extra
1   PRIMARY      <derived2>  system  NULL           NULL     NULL     NULL  1
1   PRIMARY      t1          range   PRIMARY        PRIMARY  4        NULL  6264   Using where
2   DERIVED      content     index   NULL           PRIMARY  4        NULL  27085  Using index

Why is this? Because the subquery runs entirely on the index, while an ordinary query runs on the data file; generally speaking, the index file is much smaller than the data file, so operating on it is more efficient.

In practice, you can handle paging with a hybrid strategy: for example, if the requested page is within the first hundred pages, use the most basic paging method; beyond one hundred pages, use subquery pagination.

Third, for MySQL tables holding a large amount of data, paging with LIMIT has very serious performance problems.

Querying the 30 records after the first 1,000,000:

SQL statement 1 (average time 6.6 seconds):

SELECT * FROM `cdb_posts` ORDER BY pid LIMIT 1000000, 30

SQL statement 2 (average time 0.6 seconds):

SELECT * FROM `cdb_posts` WHERE pid >= (SELECT pid FROM `cdb_posts` ORDER BY pid LIMIT 1000000, 1) LIMIT 30

Because the first statement fetches the contents of all fields, it has to cross and discard a huge number of data blocks, whereas the second essentially locates the position via the index field first and only then fetches the corresponding rows, which naturally improves efficiency greatly. In this LIMIT optimization, LIMIT is not applied directly; instead, the id at the offset is obtained first, and then LIMIT size is used to fetch the data.

As can be seen, the further back the page, the larger the LIMIT statement's offset, and the more obvious the speed gap between the two becomes.


Optimization idea: avoid scanning too many records when the data volume is large.

To keep the index column continuous, an auto-increment field can be added to each table, with an index on it.
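A minimal sketch of this idea, assuming a table t that does not already have an auto-increment column (MySQL allows only one per table; all names here are made up):

ALTER TABLE t
    ADD COLUMN seq INT NOT NULL AUTO_INCREMENT,
    ADD UNIQUE KEY idx_seq (seq);

-- With a continuous seq column, a page can be fetched by seeking
-- directly instead of scanning and discarding offset rows:
SELECT * FROM t WHERE seq > 1000000 ORDER BY seq LIMIT 30;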

Reference: "MySQL paging with a too-large offset: SQL optimization experience"

 

 

========

 

 

MySQL paging performance optimization on a single table with one million records

Background:

I run a website whose records have grown to nearly one million in a single table, which makes data access very slow. Analysis of the backend shows Google regularly reporting crawl timeouts, especially on pages with large page numbers.

Test environment:

Let's first get familiar with a basic SQL statement and look at the basic information of the table we are going to test:

use information_schema
SELECT * FROM TABLES WHERE TABLE_SCHEMA = 'dbname' AND TABLE_NAME = 'product'

Query result:

From the result we can see the table's basic information:

Table rows: 866,633
Average row length: 5,133 bytes
Single table size: 4,448,700,632 bytes

The row length and table size above are in bytes; after converting, we get:
Average row length: about 5 KB
Total single-table size: about 4.1 GB
The table's fields include varchar, datetime, text, etc.; id is the primary key.

Tests

1. Paging directly with a LIMIT start, count statement, which is the method I used in my program:

select * from product limit start, count

When the starting offset is small, there is no query performance problem. Let's look at the execution time when paging starts from offsets 10, 100, 1000 and 10000 respectively (taking 20 records per page):

select * from product limit 10, 20      0.016 seconds
select * from product limit 100, 20     0.016 seconds
select * from product limit 1000, 20    0.047 seconds
select * from product limit 10000, 20   0.094 seconds

We can see that as the starting record grows, the time grows as well, which shows that the LIMIT paging statement is strongly tied to the starting offset. So let's move the starting record to 400,000 (roughly the middle of the records):

select * from product limit 400000, 20   3.229 seconds

Now let's look at the time for the last page of records:

select * from product limit 866613, 20   37.44 seconds

No wonder the search engines crawling our pages so often report timeouts. For the pages with the largest page numbers, this kind of time is obviously intolerable.

From this we can draw two conclusions:
1) The query time of a LIMIT statement is proportional to the position of the starting record.
2) MySQL's LIMIT statement is very convenient, but it is not suitable for direct use on tables with many records.

2. Optimizing the performance of LIMIT pagination

Using a covering index to speed up paging queries

As we all know, if a query uses an index and the statement touches only the indexed columns (a covering index), the query will be fast.

This is because the index lookup uses optimized search algorithms and the data lives in the index itself, so the query does not need to go and fetch the actual rows, which saves a lot of time. In addition, MySQL has an index cache, and under high concurrency the cache is put to even better use.

In our example, we know that the id field is the primary key, so it is naturally covered by the default primary key index. Now let's see how a covering-index query performs:

Querying the data of the last page (using a covering index, i.e. selecting only the id column):

select id from product limit 866613, 20   0.2 seconds

Compared with the 37.44 seconds for querying all columns, this is roughly a 100-fold speedup.
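To verify that such a query really is served from the index alone, one can EXPLAIN it; the exact plan depends on the MySQL version and data, but for an index-only scan the Extra column is expected to show "Using index":

EXPLAIN SELECT id FROM product LIMIT 866613, 20;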

So if we have to query all the columns, there are two approaches: one is the id >= form, and the other uses a JOIN. Let's look at how they actually perform:

SELECT * FROM product WHERE id >= (select id from product limit 866613, 1) limit 20

The query time is 0.2 seconds, which is simply a qualitative leap. Haha!

The other approach:

SELECT * FROM product a JOIN (select id from product limit 866613, 20) b ON a.id = b.id

The query time is also very short. Nice!

In fact, both approaches use the same principle in the same place, so their results are almost identical.



Origin: www.cnblogs.com/xiami2046/p/12630374.html