Paging queries are slow on large data sets: how to optimize them?

When a query has to return tens of thousands of records from a table, fetching all of them at once becomes very slow, and the slowdown becomes more obvious as the amount of data grows. That is when a paging query is needed. There are many ways to page through database results, and many places to optimize them.

The following briefly describes some of the methods I know.

Preparation

To test the optimizations listed below, the following existing table is used.

  • Table name: orders_history

  • Description: order history table of a business

  • Main fields: id (unsigned int), type (tinyint(4))

  • Fields: the table has 37 fields in total, none of them large text types; the longest is varchar(500). The id field is indexed and auto-incrementing.

  • Data volume: 5,709,294

  • MySQL version: 5.7.16

Finding a test table with millions of rows is not easy. If you want to test this yourself, you can write a shell script that inserts test data. The execution environment was the same for all of the SQL statements below. The baseline test results are as follows:

select count(*) from orders_history;

Return: 5709294

The times for three runs were:

  • 8903 ms

  • 8323 ms

  • 8401 ms

General paging query

A general paging query can be implemented with a simple LIMIT clause. The LIMIT clause is declared as follows:

SELECT * FROM table LIMIT [offset,] rows | rows OFFSET offset

The LIMIT clause specifies the number of records returned by a SELECT statement. Note the following:

  • The first parameter specifies the offset of the first row to return; note that offsets start from 0

  • The second parameter specifies the maximum number of rows to return

  • If only one parameter is given, it is the maximum number of rows to return

  • Passing -1 as the second parameter to fetch every row from the offset to the end of the result set is not accepted by current MySQL versions, which require nonnegative integer arguments; use a very large number as the second parameter instead

  • An offset of 0 refers to the first row (not 1)
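The offset rules above are easy to get off by one when converting a page number into a LIMIT clause. The following is a minimal sketch in Python; `limit_clause` is a hypothetical helper that does not appear in the original article, and it assumes pages are numbered from 1.

```python
def limit_clause(page: int, page_size: int) -> str:
    """Build a LIMIT clause for a 1-based page number (hypothetical helper)."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    # Offsets start at 0, so page 1 maps to offset 0.
    offset = (page - 1) * page_size
    return f"LIMIT {offset}, {page_size}"


print(limit_clause(1, 10))    # LIMIT 0, 10
print(limit_clause(101, 10))  # LIMIT 1000, 10
```

With 10 rows per page, page 101 maps to `LIMIT 1000, 10`, which is the form used in the example that follows.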

The following is an application example:

select * from orders_history where type=8 limit 1000,10;

This statement queries the 10 rows after offset 1000 in the orders_history table, that is, rows 1001 through 1010 of the matching rows (their ids are 1001 to 1010 only if ids are contiguous and every row matches the filter).

Rows in the table are sorted by the primary key (usually id) by default, so a query like the one above corresponds to adding an explicit order by id:

select * from orders_history where type=8 order by id limit 10000,10;

The times for three runs were:

  • 3040 ms

  • 3063 ms

  • 3018 ms

For this kind of query, the following tests measure how the number of records fetched affects query time:

select * from orders_history where type=8 limit 10000,1;
select * from orders_history where type=8 limit 10000,10;
select * from orders_history where type=8 limit 10000,100;
select * from orders_history where type=8 limit 10000,1000;
select * from orders_history where type=8 limit 10000,10000;

The times for three runs of each query were:

  • 1 record: 3072ms 3092ms 3002ms

  • 10 records: 3081ms 3077ms 3032ms

  • 100 records: 3118ms 3200ms 3128ms

  • 1000 records: 3412ms 3468ms 3394ms

  • 10000 records: 3749ms 3802ms 3696ms

I also ran these queries a dozen more times. Judging from the timings, it is fairly clear that when fewer than 100 records are fetched, query time barely changes; as the number of records fetched grows, the time taken grows as well.

Test for the query offset:

select * from orders_history where type=8 limit 100,100;
select * from orders_history where type=8 limit 1000,100;
select * from orders_history where type=8 limit 10000,100;
select * from orders_history where type=8 limit 100000,100;
select * from orders_history where type=8 limit 1000000,100;

The times for three runs of each query were:

  • Offset 100: 25ms 24ms 24ms

  • Offset 1000: 78ms 76ms 77ms

  • Offset 10000: 3092ms 3212ms 3128ms

  • Offset 100000: 3878ms 3812ms 3798ms

  • Offset 1000000: 14608ms 14062ms 14700ms

As the query offset increases, and especially once it exceeds 100,000, query time rises sharply.

This kind of paging query starts scanning from the first record in the database, so the further back the page is, the slower the query; and the more records are fetched, the slower the query becomes overall.

Optimizing with a subquery

This approach first locates the id at the offset position and then queries onward from that id. It applies when id is incrementing.

select * from orders_history where type=8 limit 100000,1;

select id from orders_history where type=8 limit 100000,1;

select * from orders_history where type=8 and
id>=(select id from orders_history where type=8 limit 100000,1)
limit 100;

select * from orders_history where type=8 limit 100000,100;

The query times for the 4 statements were:

  • Statement 1: 3674ms

  • Statement 2: 1315ms

  • Statement 3: 1327ms

  • Statement 4: 3710ms

Notes on the queries above:

  • Statement 1 versus statement 2: using select id instead of select * is roughly 3 times faster

  • Statement 2 versus statement 3: the difference is only about ten milliseconds

  • Statement 3 versus statement 4: thanks to the faster select id, statement 3 is roughly 3 times faster

Compared with the original general query method, this approach is several times faster.
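To check that the subquery form returns the same page as plain offset paging, here is a small sketch in Python. It rests on several assumptions that are not in the original article: it uses an in-memory SQLite database as a stand-in for MySQL, the rows are synthetic, and an explicit ORDER BY id is added so that both queries are deterministic. It only demonstrates equivalence of results, not the timings reported above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in; the article's tests ran on MySQL
conn.execute("CREATE TABLE orders_history (id INTEGER PRIMARY KEY, type INTEGER)")
# Synthetic rows: ids 1..5000, with type cycling over 0..9 (made-up data).
conn.executemany(
    "INSERT INTO orders_history (id, type) VALUES (?, ?)",
    [(i, i % 10) for i in range(1, 5001)],
)

offset, page_size = 200, 5

# Plain offset paging: the engine walks past `offset` matching rows first.
plain = conn.execute(
    "SELECT id FROM orders_history WHERE type = 8 ORDER BY id LIMIT ? OFFSET ?",
    (page_size, offset),
).fetchall()

# Subquery paging: locate the id at the offset, then read forward from it.
via_subquery = conn.execute(
    """
    SELECT id FROM orders_history
    WHERE type = 8
      AND id >= (SELECT id FROM orders_history
                 WHERE type = 8 ORDER BY id LIMIT 1 OFFSET ?)
    ORDER BY id
    LIMIT ?
    """,
    (offset, page_size),
).fetchall()

print(plain == via_subquery)
```

The inner query touches only the id column, which is why the article finds it cheaper than selecting every column of the skipped rows.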

Optimizing by restricting the id range

This approach assumes that the table's id increases continuously. We can then compute the id range to query from the page number and the number of records per page, and query it with id between ... and ...:

select * from orders_history where type=2
and id between 1000000 and 1000100 limit 100;

Query time: 15ms 12ms 9ms

This kind of query greatly improves speed and generally completes within a few tens of milliseconds. The only restriction is that it can be used only when the id is known; since tables are generally created with a basic id field, this is a great convenience for paging queries.
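The range calculation described above can be sketched as follows. `id_range` is a hypothetical helper, not part of the original article, and it assumes ids are contiguous, start at `first_id`, and that pages are numbered from 1.

```python
from typing import Tuple


def id_range(page: int, page_size: int, first_id: int = 1) -> Tuple[int, int]:
    """Return the inclusive (low, high) id bounds for a 1-based page.

    Assumes ids are contiguous starting at first_id (hypothetical helper).
    """
    low = first_id + (page - 1) * page_size
    high = low + page_size - 1  # BETWEEN is inclusive on both ends
    return low, high


low, high = id_range(page=10001, page_size=100)
print(f"... WHERE id BETWEEN {low} AND {high}")
```

Because BETWEEN includes both endpoints, a 100-row page spans `low` to `low + 99`.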

It can also be written another way:

select * from orders_history where id >= 1000001 limit 100;
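The `id >= ...` form extends naturally to walking through pages by remembering the last id of the previous page, a pattern often called keyset or seek pagination. The sketch below makes assumptions that are not in the original article: it uses an in-memory SQLite database as a stand-in for MySQL, synthetic rows, a hypothetical `fetch_page` helper, and a strict `id > last_id` comparison so that the last row of one page is not repeated on the next.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for the article's MySQL table
conn.execute("CREATE TABLE orders_history (id INTEGER PRIMARY KEY, type INTEGER)")
# Synthetic data with gaps in id (only multiples of 3) to show gaps do not matter here.
conn.executemany(
    "INSERT INTO orders_history (id, type) VALUES (?, ?)",
    [(i, 8) for i in range(3, 31, 3)],
)


def fetch_page(last_id: int, page_size: int):
    """Return the rows that follow last_id, in id order (hypothetical helper)."""
    return conn.execute(
        "SELECT id, type FROM orders_history WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, page_size),
    ).fetchall()


pages = []
last_id = 0  # start before the smallest id
while True:
    rows = fetch_page(last_id, page_size=4)
    if not rows:
        break
    pages.append([row[0] for row in rows])
    last_id = rows[-1][0]  # remember where this page ended

print(pages)
```

Unlike the between form, this does not require the ids to be contiguous, but it only supports moving from one page to the next rather than jumping to an arbitrary page number.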

Of course, you can also query with in. This is often used when querying across several associated tables: the set of ids returned by a query on another table drives the query:

select * from orders_history where id in
(select order_id from trade_2 where goods = 'pen')
limit 100;

Note for this kind of in query: some MySQL versions do not support using limit inside the subquery of an IN clause.

Optimizing with a temporary table

Strictly speaking, this approach is not query optimization; it is mentioned here in passing.

The id-range optimization requires that id increase continuously. In some scenarios, such as when a history table is used or when data has gone missing, consider using a temporary table to record the ids for paging, and then query with in on the ids of the requested page. This can greatly speed up traditional paging queries, especially when the data volume is in the tens of millions.
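One way to read the temporary-table idea is sketched below. This is an interpretation, not the author's implementation: the `page_ids` table, its `seq` and `order_id` columns, and the `fetch_page` helper are all hypothetical, the data is synthetic, and an in-memory SQLite database stands in for MySQL (whose temporary-table syntax differs in places).

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in; MySQL syntax differs in places
conn.execute("CREATE TABLE orders_history (id INTEGER PRIMARY KEY, type INTEGER)")
# Synthetic data with gaps in id: only odd ids exist.
conn.executemany(
    "INSERT INTO orders_history (id, type) VALUES (?, ?)",
    [(i, 8 if i % 3 == 0 else 2) for i in range(1, 200, 2)],
)

# Hypothetical temporary table: seq is a gap-free position, order_id the real id.
conn.execute("CREATE TEMP TABLE page_ids (seq INTEGER PRIMARY KEY, order_id INTEGER)")
conn.execute(
    "INSERT INTO page_ids (order_id) "
    "SELECT id FROM orders_history WHERE type = 8 ORDER BY id"
)


def fetch_page(page: int, page_size: int):
    """Look up the page's ids by seq range, then fetch the rows with IN."""
    low = (page - 1) * page_size + 1
    high = page * page_size
    return conn.execute(
        """
        SELECT id, type FROM orders_history
        WHERE id IN (SELECT order_id FROM page_ids WHERE seq BETWEEN ? AND ?)
        ORDER BY id
        """,
        (low, high),
    ).fetchall()


print([row[0] for row in fetch_page(page=2, page_size=5)])
```

The gap-free `seq` column restores the "continuous id" property that the range technique needs, at the cost of building and maintaining the extra table.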

Notes on the table id

In general, when tables are created in a database, every table is required to have an auto-incrementing id field, which makes querying easier.

When a database holds a very large amount of data, as with orders, it is usually split across multiple databases and tables. In that case, using the database id as the unique identifier is not recommended; instead, use a distributed, high-concurrency unique id generator and store the generated identifier in an additional field of the table.

Using a range query to locate the id (or index entry) first, and then using the index to locate the data, can make queries several times faster. That is, select id first, then select *.

Reference: https://blog.csdn.net/youanyyou/article/details/95558514

 


Origin www.cnblogs.com/fengzifengfeng/p/11531969.html