Optimizing pagination on a large MySQL table that is sorted by a time field, where the time field contains duplicate values and does not follow the order of the primary key id.

Once a table holds hundreds of thousands of rows, paging with LIMIT becomes a problem. Queries for the early pages are fine, but queries for later pages (large offsets) become very slow.

Requirement: there is a large table with millions of rows that must be read page by page in the order of its time field. The order of that field is not consistent with the order of the primary key id, and its values can repeat. The approach usually recommended online for large tables pages by the primary key id, relying on the fact that id increases monotonically, as demonstrated below:

select * from table_name order by id asc limit 500000,10; -- ordinary paging statement

-- rewritten form; this statement runs very fast
select * from table_name where id >= (select id from table_name order by id asc limit 500000,1) order by id asc limit 10;

   However, this trick clearly cannot be used for this requirement, because the sort field contains duplicate values and does not follow the id order.
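To make the point above concrete, here is a minimal sketch. It assumes an in-memory SQLite database standing in for MySQL, and the table layout and sample data are made up for illustration. It checks that the id-based rewrite returns the same page as the ordinary LIMIT query, and then counts how many rows share the create_time value at the page boundary, which is why a single time value cannot mark a position in the ordering.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; table and data are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (id INTEGER PRIMARY KEY, create_time INTEGER)")
# 1000 rows; create_time has only 50 distinct values and does not follow id order.
conn.executemany(
    "INSERT INTO table_name (id, create_time) VALUES (?, ?)",
    [(i, (i * 7) % 50) for i in range(1, 1001)],
)

offset, page_size = 500, 10

# Ordinary paging: the engine has to walk past `offset` rows first.
offset_page = conn.execute(
    "SELECT id FROM table_name ORDER BY id LIMIT ?, ?", (offset, page_size)
).fetchall()

# Rewritten form: find the first id of the page, then range-scan the primary key.
seek_page = conn.execute(
    "SELECT id FROM table_name "
    "WHERE id >= (SELECT id FROM table_name ORDER BY id LIMIT ?, 1) "
    "ORDER BY id LIMIT ?",
    (offset, page_size),
).fetchall()

# The create_time value found at the same offset is shared by many rows,
# so "create_time >= boundary" does not pin down where the page starts.
boundary = conn.execute(
    "SELECT create_time FROM table_name ORDER BY create_time, id LIMIT ?, 1",
    (offset,),
).fetchone()[0]
ties = conn.execute(
    "SELECT COUNT(*) FROM table_name WHERE create_time = ?", (boundary,)
).fetchone()[0]

print("same page via id seek:", offset_page == seek_page)
print("rows sharing the boundary create_time:", ties)
```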

 

Solution:

  1. First, add an index on the time field (assume it is named create_time). With the index in place, a limit 0,10 query is very fast, but a limit 500000,10 query for later rows is still painfully slow, usually taking several seconds. The EXPLAIN query plan shows that for the limit 500000,10 statement MySQL has to scan the first 500,000 rows before it can read the rows it actually needs, which is why it is so slow.
  2. Continue optimizing on top of step 1.
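    -- for tables with a few hundred thousand rows this generally returns within 1s
    select * from table_name where id in (SELECT id from (select id from table_name order by create_time LIMIT 500000,10) as tmp) order by create_time;
    
    // 1. The cumbersome form (SELECT id from (select id from table_name order by create_time LIMIT 500000,10) as tmp)
         is used because mysql 5.6 does not support LIMIT directly inside an IN subquery. You can also split the
         statement in two: run the subquery first to get the set of ids, then fetch the final rows with id in (that set of ids).
    // 2. Most of the time is spent in the subquery (SELECT id from (select id from table_name order by create_time LIMIT 500000,10) as tmp);
         the id in lookup itself is very fast, only a few hundredths of a second in my own tests.
    
    // 3. Finally, note that the examples above are simplified. Select only the columns you need and leave out the rest,
         because fetching unneeded columns increases the data transfer time and other overhead, especially when there are
         many large unused columns. Adjust the select * to your business needs; the time lost here can be
         significant, so take it seriously.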
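The step 2 rewrite and its two-statement variant can be sketched as follows. This is only an illustration under assumptions: in-memory SQLite stands in for MySQL, the payload column and sample data are invented, and id is added to the ORDER BY as a tie-breaker so that rows with equal create_time come back in a repeatable order. It checks that both forms return the same page as the plain LIMIT query; it says nothing about timings on a real MySQL server.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; schema and data are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_name (id INTEGER PRIMARY KEY, create_time INTEGER, payload TEXT)"
)
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?, ?)",
    [(i, (i * 7) % 50, "row-%d" % i) for i in range(1, 1001)],
)
# Step 1 of the solution: index the time field.
conn.execute("CREATE INDEX idx_create_time ON table_name (create_time)")

offset, page_size = 500, 10
order = "ORDER BY create_time, id"  # id is a tie-breaker added for this sketch

# Baseline: ordinary paging that reads full rows while skipping `offset` rows.
plain = conn.execute(
    f"SELECT * FROM table_name {order} LIMIT ?, ?", (offset, page_size)
).fetchall()

# Step 2, single statement: page over ids only, then fetch full rows by primary key.
deferred = conn.execute(
    "SELECT * FROM table_name WHERE id IN ("
    f"SELECT id FROM (SELECT id FROM table_name {order} LIMIT ?, ?) AS tmp) "
    f"{order}",
    (offset, page_size),
).fetchall()

# Step 2, split in two: fetch the ids first, then look the rows up with id IN (...).
ids = [row[0] for row in conn.execute(
    f"SELECT id FROM table_name {order} LIMIT ?, ?", (offset, page_size)
)]
placeholders = ",".join("?" * len(ids))
two_step = conn.execute(
    f"SELECT * FROM table_name WHERE id IN ({placeholders}) {order}", ids
).fetchall()

print("single statement matches plain paging:", deferred == plain)
print("two statements match plain paging:", two_step == plain)
```

The outer ORDER BY matters in both variants: id in (...) by itself gives no guarantee about the order in which the matching rows are returned.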

     

   A nice-to-have extra (using MySQL's built-in query cache, which can be enabled for specific statements only)

    When to use it: (if you are already using the MySQL query cache, you can skip this part.) If the statements you want to optimize are mostly identical from one request to the next, for example queries for public data such as your application's home page, then the result only needs to be computed and cached once, and later identical queries are served straight from the cache, which is very fast. MySQL happens to provide a query cache for exactly this. What if you only want to cache a few specific statements, and do not want every statement in the application to go through the query cache just to optimize one of them? Bear in mind that the query cache has a cost of its own: for some statements it does not help and can even slow things down. MySQL also supports caching only designated statements while leaving all other statements unaffected. Note that the query cache was deprecated in MySQL 5.7.20 and removed in MySQL 8.0, so this part applies only to earlier versions.

    A first look at MySQL's query cache:

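 MySQL's query cache is controlled by the query_cache_type system variable, which takes the following values:
   OFF: the query cache is disabled
   ON: the query cache is enabled for all statements, unless a statement begins with SELECT SQL_NO_CACHE to state explicitly that it must not be cached
   DEMAND: only statements that begin with SELECT SQL_CACHE are cached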

     Code

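1. Find your MySQL configuration file and add the following under the [mysqld] block (2 corresponds to DEMAND)
  [mysqld]
  query_cache_type=2
  
2. Then restart your MySQL service, open the mysql command line, and run the following commands
   show variables like "query_cache_type"; -- confirm that the query cache is enabled
   show variables like "query_cache_size"; -- check the query cache size; it must be greater than 0
   set global query_cache_size = 1048576; -- set the query cache size in bytes; the value must be a multiple of 1024 and larger than 40K
3. Start each select statement you want cached with SELECT SQL_CACHE, for example:
   SELECT SQL_CACHE * from table_name order by create_time desc limit 50000, 10;
   When MySQL sees a statement that begins with SELECT SQL_CACHE, it first checks the query cache for a cached result; if there is none, it executes the statement and caches the result for later identical queries.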

 


Origin blog.csdn.net/weixin_37281289/article/details/103681635