[MySQL] Paging query tests on a single table with 100 million rows

Table of contents

1. Create test data

1.1 Create the table

1.2 Insert the test data

2. Basic paging queries: LIMIT

2.1 About LIMIT

2.2 Examples

2.3 Test

1. Effect of the number of rows fetched on query time

2. Effect of the offset on query time

3. Optimizing LIMIT with a subquery

4. Optimizing by restricting the id range

5. Optimizing with a temporary table

6. Notes on the table id


1. Create test data

1.1 Create the table

CREATE TABLE `t_user` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `c_user_id` varchar(36) NOT NULL DEFAULT '' COMMENT 'User ID',
  `c_name` varchar(22) NOT NULL DEFAULT '' COMMENT 'User name',
  `c_province_id` int(11) NOT NULL COMMENT 'Province ID',
  `c_city_id` int(11) NOT NULL COMMENT 'City ID',
  `create_time` datetime NOT NULL COMMENT 'Creation time',
  PRIMARY KEY (`id`),
  KEY `idx_user_id` (`c_user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=100000001 DEFAULT CHARSET=utf8mb4;

1.2 Insert the test data
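
The original post does not show its insert script. As a rough sketch of one way to load test rows in batches, the following uses Python's built-in sqlite3 module as an in-memory stand-in for MySQL. The helper names create_table and insert_users, the batch size, the generated values and the fixed timestamp are all assumptions made for illustration, not the author's method.

```python
import sqlite3
import uuid
from datetime import datetime


def create_table(conn):
    # SQLite approximation of t_user; AUTOINCREMENT stands in for AUTO_INCREMENT.
    conn.execute("""
        CREATE TABLE t_user (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            c_user_id TEXT NOT NULL DEFAULT '',
            c_name TEXT NOT NULL DEFAULT '',
            c_province_id INTEGER NOT NULL,
            c_city_id INTEGER NOT NULL,
            create_time TEXT NOT NULL
        )""")
    conn.execute("CREATE INDEX idx_user_id ON t_user (c_user_id)")


def insert_users(conn, total, batch_size=1000):
    """Insert `total` rows in batches, committing once per batch."""
    sql = ("INSERT INTO t_user "
           "(c_user_id, c_name, c_province_id, c_city_id, create_time) "
           "VALUES (?, ?, ?, ?, ?)")
    # Placeholder timestamp; any valid datetime would do.
    now = datetime(2019, 8, 1).isoformat(sep=" ")
    inserted = 0
    while inserted < total:
        n = min(batch_size, total - inserted)
        rows = [(str(uuid.uuid4()),             # 36-character user id
                 f"user{inserted + i + 1}",     # made-up user name
                 (inserted + i) % 34 + 1,       # made-up province id
                 (inserted + i) % 300 + 1,      # made-up city id
                 now)
                for i in range(n)]
        with conn:  # one transaction per batch
            conn.executemany(sql, rows)
        inserted += n
    return inserted


conn = sqlite3.connect(":memory:")
create_table(conn)
count = insert_users(conn, 2500)
```

Batching keeps each transaction small; for a table of 100 million rows the batch size and the way rows are generated would need tuning against the real server.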

2. Basic paging queries: LIMIT

2.1 About LIMIT

A basic paging query can be written with a simple LIMIT clause. The syntax of the clause is:

SELECT *
FROM table 
LIMIT [offset,] rows | rows OFFSET offset;

The LIMIT clause specifies how many records a SELECT statement returns. Note the following:

  • The first argument specifies the offset of the first row to return, counting from 0;
  • The second argument specifies the maximum number of rows to return;
  • If only one argument is given, it is the maximum number of rows to return;
  • To retrieve every row from a given offset to the end of the result set, older references pass -1 as the second argument; current MySQL versions require non-negative integers, so a very large number is used instead;
  • The offset of the first row is 0 (not 1).
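
The rules above can be checked on a tiny table. This is a minimal sketch that assumes Python's sqlite3 as a stand-in, since SQLite accepts the same two LIMIT spellings; the 20-row table and the helper ids are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_user (id INTEGER PRIMARY KEY, c_name TEXT)")
conn.executemany("INSERT INTO t_user VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(1, 21)])


def ids(sql):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql)]


# "LIMIT offset, rows" and "LIMIT rows OFFSET offset" select the same rows.
comma_form = ids("SELECT id FROM t_user ORDER BY id LIMIT 5, 3")
offset_form = ids("SELECT id FROM t_user ORDER BY id LIMIT 3 OFFSET 5")

# A single argument is the row count; offset 0 is the first row.
rows_only = ids("SELECT id FROM t_user ORDER BY id LIMIT 3")
from_zero = ids("SELECT id FROM t_user ORDER BY id LIMIT 0, 3")

print(comma_form, offset_form, rows_only, from_zero)
```

Offset 5 skips ids 1 to 5, so both spellings start at id 6.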

2.2 Examples

select * from t_user limit 1000, 10;

This statement returns 10 rows from t_user starting after offset 1000, that is, rows 1001 through 1010 (1001 <= id <= 1010, assuming the ids are contiguous and start at 1).

Without an ORDER BY, a query like this typically returns rows in primary key (usually id) order, although that order is not guaranteed. The statement above then corresponds to:

select * from t_user order by id limit 1000, 10;

2.3 Test

1. Effect of the number of rows fetched on query time

No screenshots here; the results are listed directly:

Test statement                                  Time for three runs (ms)
select * from t_user limit 1000, 1;             15 -- 15 -- 14
select * from t_user limit 1000, 10;            14 -- 15 -- 15
select * from t_user limit 1000, 100;           15 -- 15 -- 15
select * from t_user limit 1000, 1000;          17 -- 16 -- 16
select * from t_user limit 1000, 10000;         168 -- 158 -- 158
select * from t_user limit 1000, 100000;        1159 -- 1559 -- 1479

After several more runs the pattern is fairly clear: when fewer than about 1,000 rows are fetched, query time barely changes; as the number of rows fetched grows beyond that, the time taken grows with it.

2. Effect of the offset on query time

Test statement                                Time for three runs (ms)
select * from t_user limit 100, 100;          16 -- 16 -- 15
select * from t_user limit 1000, 100;         15 -- 17 -- 14
select * from t_user limit 10000, 100;        18 -- 17 -- 17
select * from t_user limit 100000, 100;       42 -- 42 -- 42
select * from t_user limit 1000000, 100;      761 -- 727 -- 762

As the offset grows, and especially once it exceeds 100,000, query time rises sharply.

This style of paging query scans from the first record in the table, so the further back the page, the slower the query; and the more rows fetched, the slower the query overall.
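
Measurements in the "three runs" format used above can be collected with a small harness. The sketch below is an assumption about how one might do it, not the tooling used in the original tests: time_query is a hypothetical helper, and it runs against an in-memory SQLite table of 50,000 rows, so the absolute numbers will not match the MySQL figures in the tables.

```python
import sqlite3
import time


def time_query(conn, sql, runs=3):
    """Run `sql` `runs` times, fetching all rows; return elapsed times in ms."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        conn.execute(sql).fetchall()
        timings.append((time.perf_counter() - start) * 1000)
    return timings


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_user (id INTEGER PRIMARY KEY, c_name TEXT)")
conn.executemany("INSERT INTO t_user VALUES (?, ?)",
                 ((i, f"user{i}") for i in range(1, 50001)))

# Same page size, growing offset, as in the second test above.
results = {}
for offset in (100, 1000, 10000):
    sql = f"SELECT * FROM t_user LIMIT {offset}, 100"
    results[offset] = time_query(conn, sql)
    print(sql, " -- ".join(f"{t:.2f}" for t in results[offset]))
```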

3. Optimizing LIMIT with a subquery

(1) select * from t_user limit 100000, 1;

(2) select id from t_user limit 100000, 1;

(3) select * from t_user where id >= (select id from t_user limit 100000, 1) limit 100;

(4) select * from t_user limit 100000, 100;

The query times (ms) for three runs of each of the four statements above were:

(1) 43 -- 43 -- 41
(2) 27 -- 26 -- 25
(3) 27 -- 27 -- 26
(4) 44 -- 41 -- 43

Points to note about the queries above:

  • Statements 1 and 2: selecting id instead of * cuts the query time from roughly 42 ms to roughly 26 ms in these runs

  • Statements 2 and 3: the query times are about the same

  • Statements 3 and 4: because the select id step is faster, statement 3 is also faster than statement 4

Compared with the plain LIMIT query, this approach speeds up data retrieval.
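
To check that the subquery form returns the same page as the plain LIMIT query, here is a minimal sketch using sqlite3 as a stand-in. Two things in it are not from the original: an explicit ORDER BY id is added so the comparison is deterministic, and a join against a derived table of ids is shown as a related variant of the same idea. The table size, OFFSET and ROWS are arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_user (id INTEGER PRIMARY KEY, c_name TEXT)")
conn.executemany("INSERT INTO t_user VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(1, 2001)])

OFFSET, ROWS = 1000, 10

# Plain paging query, like statement (4).
plain = conn.execute(
    "SELECT * FROM t_user ORDER BY id LIMIT ?, ?", (OFFSET, ROWS)).fetchall()

# Statement (3): find the first id of the page, then read rows from there.
subquery = conn.execute(
    """SELECT * FROM t_user
       WHERE id >= (SELECT id FROM t_user ORDER BY id LIMIT ?, 1)
       ORDER BY id LIMIT ?""", (OFFSET, ROWS)).fetchall()

# Variant (not in the original): page through ids only, then join back.
deferred_join = conn.execute(
    """SELECT u.* FROM t_user AS u
       JOIN (SELECT id FROM t_user ORDER BY id LIMIT ?, ?) AS page
         ON page.id = u.id
       ORDER BY u.id""", (OFFSET, ROWS)).fetchall()

print(plain == subquery == deferred_join)
```

All three forms skip the same number of rows; the difference is that the second and third only read the id column while skipping.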

4. Optimizing by restricting the id range

This approach assumes that the table's id increases continuously. We can then compute the id range to query from the page number and the number of rows per page, and query with id between ... and ...:

Test statement                                                                      Time (ms)
(optimized) select * from t_user where id between 100000 and 1000100 limit 100;     14 -- 14 -- 14

(original)  select * from t_user limit 100000, 100;                                 42 -- 41 -- 41

For comparison, with different values:
(optimized) select * from t_user where id between 10000000 and 100000100 limit 100; 16 -- 13 -- 14

(original)  select * from t_user limit 10000000, 100;                               4605 -- 4632 -- 4622

This kind of query greatly improves speed and generally completes within a few tens of milliseconds. The one restriction is that it only works when the id values are known precisely; however, most tables are created with a basic id column, which makes paging queries much more convenient.

There is another way to write:

select * from t_user where id >= 10000001 limit 100;
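
A minimal sketch of both spellings, again on sqlite3 as a stand-in. The helpers id_range, page_by_range and page_after are hypothetical names. id_range assumes contiguous ids starting at first_id, so for 100 rows per page the page after offset 100000 would be ids 100001 to 100100. page_after only needs the last id already seen, so it keeps working when the ids have gaps.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_user (id INTEGER PRIMARY KEY, c_name TEXT)")
conn.executemany("INSERT INTO t_user VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(1, 1001)])


def id_range(page, page_size, first_id=1):
    """First and last id of a 1-based page, assuming contiguous ids."""
    low = first_id + (page - 1) * page_size
    return low, low + page_size - 1


def page_by_range(page, page_size):
    """BETWEEN form: valid only while ids are contiguous."""
    low, high = id_range(page, page_size)
    return conn.execute(
        "SELECT id FROM t_user WHERE id BETWEEN ? AND ? ORDER BY id",
        (low, high)).fetchall()


def page_after(last_id, page_size):
    """'id > last seen id' form: tolerates gaps in the id sequence."""
    return conn.execute(
        "SELECT id FROM t_user WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, page_size)).fetchall()


range_page = [r[0] for r in page_by_range(3, 100)]
# Delete some rows to create a gap right after page 3.
conn.execute("DELETE FROM t_user WHERE id BETWEEN 301 AND 350")
next_page = [r[0] for r in page_after(range_page[-1], 100)]
print(range_page[0], range_page[-1], next_page[0], next_page[-1])
```

After the delete, a BETWEEN 301 AND 400 query would return only 50 rows, while page_after still returns a full page of 100.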

You can also query with IN. This is often used in multi-table queries, where the set of ids returned by a query on another table drives the query:

e.g. select * from t_user where id in (select id from t_user2 where city='Beijing') limit 100;

Note for this kind of query: some MySQL versions do not support LIMIT inside an IN subquery.
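
Where LIMIT directly inside an IN subquery is rejected, a commonly used workaround is to wrap the limited subquery in a derived table. The sketch below only shows the shape of that query; it runs on sqlite3, which accepts both forms, so it does not reproduce the MySQL restriction itself. The t_user2 schema with a city column is assumed from the example above, and the alias picked is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_user (id INTEGER PRIMARY KEY, c_name TEXT)")
conn.execute("CREATE TABLE t_user2 (id INTEGER PRIMARY KEY, city TEXT)")
conn.executemany("INSERT INTO t_user VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(1, 101)])
# Even ids live in Beijing, odd ids in Shanghai (made-up data).
conn.executemany("INSERT INTO t_user2 VALUES (?, ?)",
                 [(i, "Beijing" if i % 2 == 0 else "Shanghai")
                  for i in range(1, 101)])

# The LIMIT sits one level deeper, inside a derived table named `picked`.
WRAPPED_IN = """
SELECT id FROM t_user
WHERE id IN (
    SELECT id FROM (
        SELECT id FROM t_user2 WHERE city = 'Beijing' ORDER BY id LIMIT 5
    ) AS picked
)
ORDER BY id
"""
wrapped_ids = [r[0] for r in conn.execute(WRAPPED_IN)]
print(wrapped_ids)
```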

5. Optimizing with a temporary table

This approach is not really query optimization; it is mentioned here in passing.

The id-range optimization requires continuously increasing ids. In some scenarios, such as when a history table is used or when rows are missing, consider using a temporary table to record the ids for paging, and then run an IN query with the ids of the requested page. This can greatly speed up traditional paging queries, especially when the data volume reaches tens of millions of rows. (This still needs to be tested.)
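
One way this idea could look, as an untested-at-scale sketch on sqlite3: a temporary table tmp_page_ids (a made-up name) stores every id together with a contiguous sequence number seq, and pages are then cut by seq instead of by id. The table contents and the helper fetch_page are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_user (id INTEGER PRIMARY KEY, c_name TEXT)")
# Ids with gaps: only multiples of 3 exist.
conn.executemany("INSERT INTO t_user VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(3, 3001, 3)])

# seq is assigned 1, 2, 3, ... in insertion order, so it has no gaps.
conn.execute("""CREATE TEMP TABLE tmp_page_ids (
                    seq INTEGER PRIMARY KEY AUTOINCREMENT,
                    id INTEGER NOT NULL)""")
conn.execute("INSERT INTO tmp_page_ids (id) SELECT id FROM t_user ORDER BY id")


def fetch_page(page, page_size):
    """Look up the ids of a 1-based page by seq, then fetch the rows with IN."""
    low = (page - 1) * page_size + 1
    high = low + page_size - 1
    return conn.execute(
        """SELECT u.id, u.c_name FROM t_user AS u
           WHERE u.id IN (SELECT id FROM tmp_page_ids
                          WHERE seq BETWEEN ? AND ?)
           ORDER BY u.id""", (low, high)).fetchall()


page3 = fetch_page(3, 10)
print([row[0] for row in page3])
```

The temporary table has to be rebuilt or maintained when rows are added or removed, which is part of why the approach needs testing before use.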

6. Notes on the table id

Normally, when tables are created in a database, an auto-increment id column is added to every table as a matter of course, which makes querying easier.

If the data volume is very large, as with an orders database, the data is usually sharded across databases and tables. In that case the database id is not recommended as the unique identifier; instead, use a high-concurrency distributed ID generator to produce unique ids, and store that identifier in a separate column of the table.

First using a range query to locate the ids (or index entries), and then using the index to locate the rows, can make queries several times faster. That is, select id first, then select *.

 

Reference:

https://mp.weixin.qq.com/s/nW3KQ3DoumlXHvuHe1WczQ

Origin blog.csdn.net/tiankong_12345/article/details/99670241