MySQL paging optimization



How fast can MySQL paging be at the million-row level? (MySQL million-row fast paging)

Sharing my experience below.
I ran into this problem in an interview and searched online for ways to optimize it. The article below is well written, so I am sharing it here.

When you first start learning SQL, you will generally write paging like this:

SELECT * FROM table ORDER BY id LIMIT 1000, 10;

But once the table reaches millions of rows, the same statement becomes painfully slow:

SELECT * FROM table ORDER BY id LIMIT 1000000, 10;

It may take dozens of seconds. Many of the optimization methods found online look like this:

SELECT * FROM table WHERE id >= (SELECT id FROM table ORDER BY id LIMIT 1000000, 1) ORDER BY id LIMIT 10;

Yes, the speed improves to 0.x seconds, which looks acceptable. But it is still not ideal. The following statement does better:

SELECT * FROM table WHERE id BETWEEN 1000000 AND 1000010;

This runs 5 to 10 times faster than the previous statement. Two caveats: BETWEEN is inclusive on both ends, so this range actually returns 11 rows, not 10; and the trick only works when the ids are contiguous, with no gaps from deleted rows.
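A closely related formulation (my addition, not from the original article) is the "seek" style, which remembers the last id of the previous page and tolerates gaps in the id sequence:

SELECT * FROM table WHERE id > 1000000 ORDER BY id LIMIT 10;
-- Assumes the client passes in the last id it saw; works even when ids are not contiguous.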

If the ids you need do not form one contiguous range, the best approach is to find the ids first and then fetch the rows with an IN query:

SELECT * FROM table WHERE id IN (10000, 100000, 1000000, ...);


One more tip: when a query has to match a long string column, add a companion column when designing the table. For example, when looking up rows by a stored URL, do not compare the string directly; that is inefficient. Instead, store a hash of the string, such as CRC32 or MD5, index the hash, and query against it.
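A minimal sketch of that idea (the column and index names here are illustrative, not from the original article):

-- Add a hash column for the long url string, and index the hash instead of the url.
ALTER TABLE table ADD COLUMN url_crc INT UNSIGNED NOT NULL DEFAULT 0;
UPDATE table SET url_crc = CRC32(url);
ALTER TABLE table ADD INDEX idx_url_crc (url_crc);

-- Look up by the short hash; keep the url comparison to filter out CRC32 collisions.
SELECT * FROM table
WHERE url_crc = CRC32('http://example.com/page')
AND url = 'http://example.com/page';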

Optimizing MySQL fast paging at tens of millions of rows

LIMIT does have real performance problems once the data set is large; the various workarounds all come down to rewriting the offset as a "where id >= XX" condition so the index on id can be used. By: jack
MySQL LIMIT: a solution to slow paging (MySQL LIMIT optimization, fast paging for millions to tens of millions of records)

How high can MySQL performance go? I had been using PHP for more than half a year, but only the day before yesterday did I really think about this question in depth. There was pain and despair along the way, but now I am full of confidence! MySQL is a database that definitely rewards DBA-level mastery. A small system with 10,000 news records can be written any way you like, with some xx framework for rapid development. But when the data grows to 100,000, a million, ten million, can its performance still be that high? One small mistake can force the whole system to be rewritten, or even stop it from running at all! Enough talk; let the facts speak. Consider an example:
The data table collect(id, title, info, vtype) has these 4 fields: title is fixed length, info is text, id is auto-increment, and vtype is a tinyint with an index on it. This is a simple model of a basic news system. Now fill it with data: 100,000 news rows. The final collect table holds 100,000 records and occupies 1.6G on disk. OK, now look at the following SQL statements:
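A minimal sketch of that schema (exact column sizes and engine are my assumptions; the article only names the fields and their types):

CREATE TABLE collect (
  id    INT UNSIGNED NOT NULL AUTO_INCREMENT,  -- auto-increment primary key
  title CHAR(100)    NOT NULL,                 -- fixed-length title
  info  TEXT,                                  -- news body
  vtype TINYINT      NOT NULL,                 -- category, indexed
  PRIMARY KEY (id),
  KEY idx_vtype (vtype)
);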

select id, title from collect limit 1000,10; is very fast, basically done in 0.01 seconds. Now look at this:
select id, title from collect limit 90000,10; pages from row 90,000. The result? 8-9 seconds to complete. My god, what is going on???? In fact, the answer for optimizing this can be found online. Look at the following statement:
select id from collect order by id limit 90000,10; runs very fast, about 0.04 seconds. Why? Because scanning only the id primary key index is naturally faster. The rewrite posted online is:
select id, title from collect where id>=(select id from collect order by id limit 90000,1) limit 10;
This is the result of seeking through the id index first. But make the problem only slightly more complicated and it falls apart. Look at the following statement:
select id from collect where vtype=1 order by id limit 90000,10; Very slow: it took 8-9 seconds!
At this point I believe many people will feel, as I did, close to collapse. Isn't vtype indexed? How can it be slow? An index on vtype alone is fine as far as it goes: select id from collect where vtype=1 limit 1000,10; is fast, basically 0.05 seconds. But starting from 90,000 is 90 times further in, so roughly 0.05 * 90 = 4.5 seconds, the same order of magnitude as the measured 8-9 seconds. At this point someone proposed splitting tables, the same approach the Discuz! forum takes. The idea is as follows:
Build an index table t(id, title, vtype) with fixed-length rows, do the paging against it, and then go to collect for the info of just those ten rows. Is it feasible? Let's find out by experiment.
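A sketch of that index table (the title length and engine are my assumptions; fixed-length rows under MyISAM require avoiding VARCHAR, TEXT, and BLOB columns):

-- Narrow, fixed-length "index table": only the columns needed for paging.
CREATE TABLE t (
  id    INT UNSIGNED NOT NULL,
  title CHAR(100)    NOT NULL,
  vtype TINYINT      NOT NULL,
  PRIMARY KEY (id),
  KEY idx_vtype (vtype)
) ENGINE=MyISAM;

INSERT INTO t (id, title, vtype) SELECT id, title, vtype FROM collect;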
With 100,000 records stored in t(id, title, vtype), the table is about 20M. select id from t where vtype=1 order by id limit 90000,10; runs very fast, basically 0.1-0.2 seconds. Why? I suspect that because collect carries so much more data per row, paging there has to travel a long way, while here the cost of limit is bounded entirely by the size of the table. In fact this is still a full table scan; it is fast only because the data is small, just 100,000 rows. OK, time for a crazy experiment: grow it to 1 million rows and test the performance again.
After adding 10 times the data, the t table immediately grew past 200M, still with fixed-length rows. The same query as before still finished in 0.1-0.2 seconds! So the split-table performance is fine? Wrong! Our limit still started at 90,000; that is why it was fast. Start at 900,000 instead:
select id from t where vtype=1 order by id limit 900000,10; Look at the result: 1-2 seconds!
Why?? Even with the split table it takes that long, which is very depressing. Some people claim fixed-length rows improve limit performance; at first I believed that too, since with fixed-length records mysql should be able to compute the position of row 900,000 directly. But we overestimated mysql's intelligence; it is not that kind of commercial database, and it turns out fixed-length versus variable-length makes little difference to limit. No wonder people say Discuz! becomes very slow once it reaches 1 million records. I now believe it, and it comes down to database design!
Can't MySQL break the 1 million barrier??? Does paging really hit its limit at 1 million rows???
The answer is: NO!!! The reason it cannot get past 1 million is simply that the schema was not designed for it. Now for the non-split-table method: one crazy test! One table handling 1 million records, a 10G database, and fast paging!
Back to the collect table for testing. The conclusion so far: at 300,000 rows the split-table approach is workable; beyond 300,000 it becomes unbearably slow. Of course, split tables plus my method would be absolutely perfect; but my method alone solves the problem perfectly, with no table splitting at all!
The answer is: a composite index! Once, while designing a mysql index, I noticed by accident that you can name an index whatever you like and include several fields in it. What is that good for? select id from collect order by id limit 90000,10; is fast precisely because it walks the index, but adding a where clause stops it from using that index. On a whim I added an index search(vtype, id). Then I tested:
select id from collect where vtype=1 limit 90000,10; very fast! Done in 0.04 seconds!
Test again: select id, title from collect where vtype=1 limit 90000,10; Very unfortunately, 8-9 seconds: it did not use the search index!
Retest with search(id, vtype), still selecting only id: also a pity, 0.5 seconds.
To sum up: if there is a where condition and you want limit to use an index, you must design a composite index with the where column in first place and the primary key that limit orders by in second place, and you can select only the primary key!
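A sketch of the composite index described above:

-- The where column first, the column limit orders by second.
ALTER TABLE collect ADD INDEX search (vtype, id);

-- Selecting only id keeps the query covered by the index, so even the
-- 90,000-row offset is walked inside the index, never touching table rows.
SELECT id FROM collect WHERE vtype = 1 ORDER BY id LIMIT 90000, 10;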
That solves the paging problem perfectly. If you can return the ids quickly, there is hope of optimizing limit; by this logic, a limit in the millions should finish in 0.0x seconds. Statement optimization and indexing in mysql clearly matter a great deal!
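A single-statement variant of the same id-first idea, often called a deferred join (my addition, not from the original article):

-- Find the page of ids through the covering index, then join back for the wide columns.
SELECT c.id, c.title
FROM collect c
JOIN (SELECT id FROM collect WHERE vtype = 1 ORDER BY id LIMIT 90000, 10) AS page
ON c.id = page.id;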
Well, back to the original topic: how do you apply this research to development quickly and successfully? With compound queries, my lightweight framework was useless; I would have to write the paging strings by hand. How much trouble is that? Then another example showed the way:
select * from collect where id in (9000,12,50,7000); returns in effectively 0 seconds!
My god, mysql's indexes work for in statements too! It seems the claim circulating online that in cannot use an index is wrong!
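You can check this with EXPLAIN (the expected output noted below is standard MySQL behavior, not captured from the original test; details vary by version):

EXPLAIN SELECT * FROM collect WHERE id IN (9000, 12, 50, 7000);
-- Expect type = range and key = PRIMARY: the IN list is resolved through the primary key index.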
With this conclusion, it can easily be applied to a lightweight framework. The code is as follows:

$db = dblink();
$db->pagesize = 20;
$sql = "select id from collect where vtype=$vtype";
$db->execute($sql);
$strpage = $db->strpage(); // save the paging string in a temporary variable for output later
$strid = '';
while ($rs = $db->fetch_array()) {
    $strid .= $rs['id'] . ',';
}
$strid = substr($strid, 0, strlen($strid) - 1); // build the comma-separated id string
$db->pagesize = 0; // important: clear paging without destroying the object, so the single database connection is reused
$db->execute("select id,title,url,sTime,gTime,vtype,tag from collect where id in ($strid)");
?>
<table>
<?php while ($rs = $db->fetch_array()): ?>
<tr>
<td><?php echo $rs['id']; ?></td>
<td><?php echo $rs['url']; ?></td>
<td><?php echo $rs['sTime']; ?></td>
<td><?php echo $rs['gTime']; ?></td>
<td><?php echo $rs['vtype']; ?></td>
<td><a href="?act=show&id=<?php echo $rs['id']; ?>" target="_blank"><?php echo $rs['title']; ?></a></td>
<td><?php echo $rs['tag']; ?></td>
</tr>
<?php endwhile; ?>
</table>
<?php
echo $strpage;

The transformation is simple, and the underlying idea simpler still: 1) use the optimized index to find only the ids, and join them into a string such as "123,90000,12000"; 2) run a second query to fetch the actual rows.
A small index plus a small code change lets mysql support efficient paging at millions or even tens of millions of rows!
The examples here left me with a reflection: for large systems, PHP must not be wrapped in frameworks, especially frameworks that will not even let you see the SQL statements! My lightweight framework nearly collapsed at the start of this exercise. Frameworks are only suitable for rapid development of small applications; for ERP, OA, and large websites, nothing from the data layer up through the logic layer should be left to a framework. If programmers lose control of the SQL statements, the risk of the project grows exponentially! Especially with mysql, which truly needs a professional dba to reach its best performance: a single index can make a performance difference of thousands of times!
PS: In a further real-world test, at 1 million rows, then 1.6 million rows, a 15G table with a 190M index, even when the index is used, limit still takes 0.49 seconds. So it is best not to let users page arbitrarily deep into the data in the first place.
