Does SQL IN always use the index?

Summary

Does IN always use the index? You might think so; otherwise, would it scan everything? I seem to have read discussions about whether EXISTS and IN use an index, but that was long ago and I have forgotten the details. If you have also forgotten how IN queries behave in MySQL, let's review it together.

Problem

Let's start from the problem in the previous post: counting the number of followers for each shop.

SELECT shop_id, count(user_Id) AS attentionNumber
FROM shop_attention
WHERE shop_id IN
<foreach collection="shopIds" item="shopId" separator="," open="(" close=")">
    #{shopId}
</foreach>
GROUP BY shop_id

The previous post analyzed how to optimize this from the caching perspective. If you are interested, see that post on how to do caching after moving to microservices.

With the query consolidated and cached on the application side, there was no real problem at first. But as the number of shop followers grew, this statement became a slow query.

In our business, a SQL query that takes 100 ms counts as a slow query and needs to be optimized. If a query cannot be optimized, its frequency must be limited. In addition, any database operation that runs longer than 5 s is killed, to keep the whole database from collapsing and dragging down the applications that depend on it.

This SQL took several hundred milliseconds to execute, so it had to be optimized. The Alibaba Cloud diagnostic report for this SQL says:

  1. The ratio of rows scanned to rows returned exceeds 100
  2. The query uses GROUP BY; check whether the GROUP BY uses an index

Analysis

First, it is certain that the GROUP BY column shop_id is indexed. So why is the ratio of rows scanned to rows returned so large?

Before analyzing, let's review the three measures of a query:

  1. Response time: self-explanatory.
  2. Rows scanned: how many rows were examined during the whole query.
  3. Rows returned: how many rows the query result contains.

Ideally the number of rows scanned equals the number of rows returned, but that is rarely the case. Join queries, range queries, and sorting can all make the number of rows scanned larger than the number of rows returned. Generally this ratio should be kept below 10; otherwise there may be performance problems.

By the way, I have always felt that MySQL's EXPLAIN output is not as intuitive as Mongo's. Mongo indexes work the same way as MySQL's; if you are interested, see the post on Mongo index analysis.

So the question now is why the ratio of rows scanned to rows returned is so large.

Let's run EXPLAIN and see.

Experiment 1

SELECT shop_id, count(user_Id) AS attentionNumber
FROM shop_attention
WHERE shop_id IN (1,2,3)
GROUP BY shop_id

type   possible_keys  key       key_len  ref   rows   Extra
range  idx_shop       idx_shop  8        null  16000  Using index condition

As expected, the type is range and the query uses the idx_shop index on shop_id, so no problem there. Then why is the ratio of rows scanned to rows returned so large?

Experiment 2

Try again with a larger IN list.

SELECT shop_id, count(user_Id) AS attentionNumber
FROM shop_attention
WHERE shop_id IN (1,2,3,4,5,6,7,8,9)
GROUP BY shop_id

type   possible_keys  key       key_len  ref   rows    Extra
index  idx_shop       idx_shop  8        null  303000  Using where

The result is different: the type is now index, which means MySQL did not do a range scan but a full scan of the index.

Experiment 3

Force the index.

SELECT shop_id, count(user_Id) AS attentionNumber
FROM shop_attention FORCE INDEX (idx_shop)
WHERE shop_id IN (1,2,3,4,5,6,7,8,9)
GROUP BY shop_id

type   possible_keys  key       key_len  ref   rows   Extra
range  idx_shop       idx_shop  8        null  29000  Using index condition

This time MySQL does a range scan rather than a full index scan. But you will find that the execution time with the forced index is no shorter than without it.

For this query, the MySQL optimizer chooses a full index scan over a range scan. The more values the IN condition contains, the more rows are scanned and the longer the query takes.

So the way to optimize this is to split the work on the application side and query in batches. Each query takes only N shop IDs, so that every individual query stays fast.
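The batching idea can be sketched in Java roughly as follows. This is only a minimal sketch and not the code from the original project: the class InBatcher, the methods partition and countInBatches, and the queryOneBatch parameter are hypothetical names. queryOneBatch stands in for whatever mapper call runs the IN query for one batch and returns shop_id to follower count. The batch size N is left to the caller and should be tuned against EXPLAIN and the slow query log.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper for illustration; not part of the original post.
final class InBatcher {

    // Split ids into consecutive chunks of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            // Copy the view so each batch is independent of the input list.
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }

    // Run queryOneBatch once per chunk and merge the per-shop counts.
    // Assumes shopIds has no duplicates, so each shop appears in one batch only.
    static Map<Long, Long> countInBatches(
            List<Long> shopIds,
            int batchSize,
            Function<List<Long>, Map<Long, Long>> queryOneBatch) {
        Map<Long, Long> merged = new HashMap<>();
        for (List<Long> batch : partition(shopIds, batchSize)) {
            merged.putAll(queryOneBatch.apply(batch));
        }
        return merged;
    }
}
```

Because the query groups by shop_id, each batch returns at most one row per shop, so the partial results can be merged with a plain putAll as long as the input IDs are distinct.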

To sum up

Depending on the actual situation, the size of the IN list needs to be controlled, for the following reasons:

  1. With too many IN values, the optimizer may give up the range scan and fall back to a full index scan
  2. With too many IN values, a lot of data is returned, which may exhaust the application's heap memory

So keep the number of values in an IN query under control.


Origin www.cnblogs.com/stoneFang/p/11032746.html