Is using the RAND() function in a subquery a bug?

Environment information:

System version: CentOS release 6.10

Configuration: virtual machine, 64 GB memory, 16 logical CPU cores

MySQL version: 5.7.12-log MySQL Community Server (GPL)

pt-query-digest mysql-slow.log --since '2020-11-01 00:01:00' --until '2020-11-09 10:00:00' >slow_report1.log

# Attribute    pct   total     min     max     avg     95%  stddev  median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count          5     302
# Exec time     11  37083s      5s    673s    123s    246s     88s    107s
# Lock time      1    78ms   153us   586us   257us   366us    62us   247us
# Rows sent      0     302       1       1       1       1       0       1
# Rows examine  91 162.88G  23.16M   2.96G 552.29M   1.08G 395.64M 483.92M
# Query size    17  84.94k     288     288     288     288       0     288
# Query_time distribution
SELECT
            *
            FROM
            t_test_info
            WHERE id >=
            (SELECT
            FLOOR(MAX(id) * RAND())
            FROM
            t_test_info)
            AND area_code = '5400'
            ORDER BY id
            LIMIT 1\G

The slow SQL above has an average execution time of 123 s, although it looks very simple. Its logic is: take the largest id in t_test_info, multiply it by a random number and round down, then select the rows whose id is greater than or equal to that value and whose area_code is 5400, sort them by id, and return the first row. t_test_info holds a little over 60,000 rows, which is not large.

The table structure is as follows:

CREATE TABLE `t_test_info` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `con_id` bigint(20) DEFAULT NULL COMMENT 'connection id',
  ........,
  `area_code` varchar(8) DEFAULT NULL COMMENT 'area code',
  `ctime` datetime DEFAULT NULL COMMENT 'creation date',
  ........,
  PRIMARY KEY (`id`) USING BTREE,
  ........,
  KEY `idx_con_id` (`con_id`) USING BTREE,
  KEY `idx_ctime` (`ctime`)
) ENGINE=InnoDB AUTO_INCREMENT=94681 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci ROW_FORMAT=DYNAMIC;

The corresponding execution plan:

id  select_type           table        partitions  type   possible_keys  key        key_len  rows   filtered  Extra
1   PRIMARY               t_test_info              index                 PRIMARY    8        1      10        Using where
2   UNCACHEABLE SUBQUERY  t_test_info              index                 idx_ctime  6        63713  100       Using index

UNCACHEABLE SUBQUERY: a subquery whose result cannot be cached and must be re-evaluated for each row of the outer query (a time-consuming operation).
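As a rough cross-check of that per-row cost, the "Rows examine" figures from the pt-query-digest report can be divided by the number of rows one subquery execution scans. The Python sketch below is only back-of-the-envelope arithmetic. It assumes that every subquery execution reads all 63713 index entries shown in the plan, and that the M and G suffixes in the report are powers of 1024; both are assumptions, not facts taken from the server.

```python
# Back-of-the-envelope check: how many subquery executions would account for
# the "Rows examine" figures in the pt-query-digest report?

ROWS_PER_SUBQUERY_SCAN = 63713  # "rows" for select #2 in the EXPLAIN output

# Assumption: the report's suffixes are binary (M = 1024**2, G = 1024**3).
M = 1024 ** 2
G = 1024 ** 3

rows_examined = {
    "min": 23.16 * M,
    "avg": 552.29 * M,
    "max": 2.96 * G,
}


def estimated_subquery_runs(rows: float) -> int:
    """Rows examined divided by the rows a single subquery execution scans."""
    return round(rows / ROWS_PER_SUBQUERY_SCAN)


estimates = {name: estimated_subquery_runs(rows) for name, rows in rows_examined.items()}

for name, runs in estimates.items():
    print(f"{name}: about {runs} subquery executions")
```

Under these assumptions the average run corresponds to thousands of subquery executions, and even the largest figure stays below the table's row count. That would be consistent with the subquery being re-executed once for each outer row examined.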

Something looks wrong with this execution plan: the subquery uses idx_ctime, an index on the ctime column, yet ctime does not appear anywhere in the SQL. Did the optimizer choose the wrong index? I tried IGNORE INDEX(idx_ctime) to intervene, and the optimizer then chose idx_con_id; ignoring that index as well only made it pick the next one. (Every InnoDB secondary index also stores the primary key, so any of them can serve as a covering index for id.) So although the plan uses an index, filtered is 100%: the whole index is read, which is no different from a full table scan.

I also considered that SELECT * returns every column, so I replaced it with SELECT id to reduce the time spent in the Sending data stage. The execution time was the same; it made no difference.

Looking at the FORMAT=JSON execution plan to check the estimated cost, the resource consumption looks normal for the logic of this SQL (the outer query runs first over the table's 63713 rows, and each of its rows is then matched against the result of the subquery):

{
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "13801.60"
    },
    "ordering_operation": {
      "using_filesort": false,
      "table": {
        "table_name": "t_test_info",
        "access_type": "index",
        "key": "PRIMARY",
        "used_key_parts": [
          "id"
        ],
        "key_length": "8",
        "rows_examined_per_scan": 1,
        "rows_produced_per_join": 6371,
        "filtered": "10.00",
        "cost_info": {
          "read_cost": "12527.34",
          "eval_cost": "1274.26",
          "prefix_cost": "13801.60",
          "data_read_per_join": "12M"
        },
        "used_columns": [
          "id",
          "con_id",
          .......
          "area_code",
          "ctime",
          .......
        ],
        "attached_condition": "((`zw`.`t_test_info`.`id` >= (/* select#2 */ select floor((max(`zw`.`t_test_info`.`id`) * rand())) from `zw`.`t_test_info`)) and (`zw`.`t_test_info`.`area_code` = '5400'))",
        "attached_subqueries": [
          {
            "dependent": false,
            "cacheable": false,
            "query_block": {
              "select_id": 2,
              "cost_info": {
                "query_cost": "13801.60"
              },
              "table": {
                "table_name": "t_test_info",
                "access_type": "index",
                "key": "idx_ctime",
                "used_key_parts": [
                  "gmt_created"
                ],
                "key_length": "6",
                "rows_examined_per_scan": 63713,
                "rows_produced_per_join": 63713,
                "filtered": "100.00",
                "using_index": true,
                "cost_info": {
                  "read_cost": "1059.00",
                  "eval_cost": "12742.60",
                  "prefix_cost": "13801.60",
                  "data_read_per_join": "126M"
                },
                "used_columns": [
                  "id"
                ]

Turn on optimizer_trace to trace and analyze how the SQL is executed:

set optimizer_trace='enabled=on';
set optimizer_trace_max_mem_size=10000;
SELECT *  FROM
            t_test_info
            WHERE id >=
            (SELECT
            FLOOR(MAX(id) * RAND())
            FROM
            t_test_info)
            AND area_code = '5400'
            ORDER BY id
            LIMIT 1;
select * from information_schema.optimizer_trace;
Partial result:
    "join_execution": {
        "select#": 1,
        "steps": [
          {
            "subselect_execution": {
              "select#": 2,
              "steps": [
                {
                  "join_execution": { 
                    "select#": 2, 
                    "steps": [ 
          }, 
          
        - The same block appears 3039 times in the trace. This looks abnormal: the subquery is executed over and over, as if stuck in a loop.
        
          { 
            "subselect_execution": { 
              "select#": 2, 
              "steps": [ 
                { 
                  "join_execution": { 
                    "select#": 2, 
                    "steps": [ 
          }

At the same time, use pstack to record stack information:

# Sample the thread's stack repeatedly until interrupted with Ctrl-C
while true
do
pstack 19175 >> /tmp/pstack_1110.log
done

Here, 19175 is the THREAD_OS_ID of the thread running the SQL, looked up in performance_schema.threads.

Summarize the stack information with pt-pmp: pt-pmp /tmp/pstack_1110.log

Part of /tmp/pstack_1110.log:

Thread 1 (process 19175):
#0  0x00000000010afbf7 in row_search_mvcc(unsigned char*, page_cur_mode_t, row_prebuilt_t*, unsigned long, unsigned long) ()
#1  0x0000000000fc357e in ha_innobase::general_fetch(unsigned char*, unsigned int, unsigned int) ()
#2  0x0000000000815551 in handler::ha_index_next(unsigned char*) ()
#3  0x0000000000cd811c in join_read_next(READ_RECORD*) ()
#4  0x0000000000cdcc06 in sub_select(JOIN*, QEP_TAB*, bool) ()
#5  0x0000000000cdbb7a in JOIN::exec() ()
#6  0x0000000000c0e6b0 in subselect_single_select_engine::exec() ()
#7  0x0000000000c10696 in Item_subselect::exec() ()
#8  0x0000000000c0cd69 in Item_singlerow_subselect::val_real() ()
#9  0x0000000000841080 in Arg_comparator::compare_real_fixed() ()
#10 0x0000000000849592 in Item_func_ge::val_int() ()
#11 0x000000000082592c in Item::val_bool() ()
#12 0x000000000084364a in Item_cond_and::val_int() ()
#13 0x0000000000cdb274 in evaluate_join_record(JOIN*, QEP_TAB*) ()
#14 0x0000000000cdcc53 in sub_select(JOIN*, QEP_TAB*, bool) ()
#15 0x0000000000cdbb7a in JOIN::exec() ()
#16 0x0000000000d45db0 in handle_query(THD*, LEX*, Query_result*, unsigned long long, unsigned long long) ()
#17 0x0000000000d06e93 in execute_sqlcom_select(THD*, TABLE_LIST*) ()
#18 0x0000000000d0a82a in mysql_execute_command(THD*, bool) ()
#19 0x0000000000d0c52d in mysql_parse(THD*, Parser_state*) ()
#20 0x0000000000d0d6be in dispatch_command(THD*, COM_DATA const*, enum_server_command) ()
#21 0x0000000000d0e324 in do_command(THD*) ()
#22 0x0000000000dded7c in handle_connection ()
#23 0x00000000012436d4 in pfs_spawn_thread ()
#24 0x0000003f65407aa1 in start_thread () from /lib64/libpthread.so.0
#25 0x0000003f650e8c4d in clone () from /lib64/libc.so.6

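According to the pt-pmp statistics, the stack above was sampled as many as 237 times. Frames #6 to #13 show the subquery (subselect_single_select_engine::exec) being executed from inside evaluate_join_record, that is, while the WHERE condition of an outer row is being evaluated. My feeling is that execution is trapped in an internal loop that evaluates the condition many times before returning. Finding the root cause would require reading the source code. My abilities are limited, so I am recording it here first; if you are interested, you can study it further.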
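One way to picture what the stack suggests is a nested loop: for every outer row read in primary-key order, the condition is evaluated, and evaluating it runs the whole subquery again. The Python sketch below is a toy model of that reading, not MySQL's actual code. The function run_query, the table contents, and the share of rows with area_code 5400 are all invented for illustration.

```python
import math
import random


def run_query(rows, target_area, rng, cache_subquery):
    """Toy model of:
         WHERE id >= (SELECT FLOOR(MAX(id) * RAND()) ...) AND area_code = ...
         ORDER BY id LIMIT 1
    Returns (matching id or None, index entries scanned by the subquery)."""
    scanned = 0
    cached_threshold = None

    def subquery():
        nonlocal scanned
        scanned += len(rows)  # one full index scan to find MAX(id)
        max_id = max(row_id for row_id, _ in rows)
        return math.floor(max_id * rng.random())

    for row_id, area in rows:  # outer scan in primary-key order
        if not cache_subquery:
            threshold = subquery()  # re-run for every outer row
        else:
            if cached_threshold is None:
                cached_threshold = subquery()  # run once, then reuse
            threshold = cached_threshold
        if row_id >= threshold and area == target_area:
            return row_id, scanned  # LIMIT 1: stop at the first match
    return None, scanned


# Hypothetical data: 2000 rows, every 50th row has area_code '5400'.
rows = [(i, "5400" if i % 50 == 0 else "1100") for i in range(1, 2001)]

_, scanned_uncached = run_query(rows, "5400", random.Random(1), cache_subquery=False)
_, scanned_cached = run_query(rows, "5400", random.Random(1), cache_subquery=True)

print("uncached subquery, entries scanned:", scanned_uncached)
print("cached subquery, entries scanned:  ", scanned_cached)
```

In this model the cached variant scans the index once, while the uncached variant scans it once per outer row visited, so the work grows with the product of the two row counts. Note also that the uncached variant compares each row against a different random threshold.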

At this point, I tried replacing the RAND() function with a constant. The execution plan turned out to be different, and the execution time dropped to milliseconds.

EXPLAIN SELECT
*  FROM  t_test_info   WHERE  id >= (SELECT  FLOOR(MAX(id) * 0.02)  FROM  t_test_info)
AND area_code = '5400'  ORDER BY  id  LIMIT 1;

id  select_type  table        partitions  type   possible_keys  key      key_len  rows   filtered  Extra
1   PRIMARY      t_test_info              range  PRIMARY        PRIMARY  8        31856  10        Using where
2   SUBQUERY                                                                                       Select tables optimized away

To test further the guess that using RAND() in a subquery triggers a bug (it may instead be an UNCACHEABLE SUBQUERY problem, which I think is more likely; my ability is limited and this is still inconclusive, so corrections from experts are welcome), I assumed the problem might exist in the current version 5.7.12 but be fixed in other versions. So I found an 8.0.20 instance in the test environment, created the same table structure, and loaded the same amount of data.

I first compared the execution plans and found them identical. The execution plan on 8.0.20:

id  select_type           table        partitions  type   possible_keys  key        key_len  rows   filtered  Extra
1   PRIMARY               t_test_info              index                 PRIMARY    8        1      10        Using where
2   UNCACHEABLE SUBQUERY  t_test_info              index                 idx_ctime  6        63713  100       Using index

The difference is that in the 8.0.20 environment the execution time is about 0.03 seconds, thousands of times faster than the 123 s average on 5.7.12. The gap is huge.

Going back to the 5.7.12 environment, I rewrote the statement, keeping its logic but avoiding RAND() in a subquery. The first idea is to rewrite the subquery as a join:

SELECT 
a.*
FROM
t_test_info a join
(
SELECT
FLOOR(MAX(id) * RAND()) xx
FROM
t_test_info
) b
where  a.id>=b.xx
AND a.area_code = '5400'
ORDER BY
a.id
LIMIT 1;

The SQL above returns results in milliseconds in the 5.7.12 environment. However, it joins t_test_info to the derived table b with only a range predicate, which is formally a Cartesian product. Here b holds a single row, so the product stays small, but with a large amount of data the performance of this pattern could also degrade, and I did not consider it a suitable solution.

So, following the idea of keeping RAND() out of the subquery, I split the statement into two: first obtain the random value, then pass it to the query as a parameter. This is also fairly easy to implement in the application layer.

SELECT
FLOOR(MAX(id) * RAND()) into @str
FROM
t_test_info;

SELECT
*
FROM
t_test_info
WHERE
id >= @str
AND area_code = '5400'
ORDER BY
id
LIMIT 1;
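For the application-layer variant of the two statements above, a minimal Python sketch could look like the following. It is an illustration under stated assumptions, not the production code: the helper pick_random_row is hypothetical, sqlite3 in memory stands in for the MySQL server only so that the sketch runs anywhere, the demo rows are invented, and the random value is computed in Python rather than with RAND(). With a real MySQL driver the placeholder syntax may differ.

```python
import math
import random
import sqlite3


def pick_random_row(conn, area_code, rng=random):
    """Two-step version: compute the random threshold once in the application,
    then pass it to the main query as a bind parameter."""
    (max_id,) = conn.execute("SELECT MAX(id) FROM t_test_info").fetchone()
    if max_id is None:  # empty table
        return None
    threshold = math.floor(max_id * rng.random())
    return conn.execute(
        "SELECT id, area_code FROM t_test_info "
        "WHERE id >= ? AND area_code = ? ORDER BY id LIMIT 1",
        (threshold, area_code),
    ).fetchone()


# Stand-in data (assumption): a two-column table with invented rows,
# where every 10th id has area_code '5400'.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_test_info (id INTEGER PRIMARY KEY, area_code TEXT)")
conn.executemany(
    "INSERT INTO t_test_info (id, area_code) VALUES (?, ?)",
    [(i, "5400" if i % 10 == 0 else "1100") for i in range(1, 1001)],
)

row = pick_random_row(conn, "5400", random.Random(7))
print(row)
```

Because the threshold reaches the second query as a plain constant, the server no longer has a non-deterministic subquery to re-evaluate, which is the same effect as the @str user variable.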

In version 8.0 and above, you can also consider rewriting the query with WITH ... AS (a common table expression). 5.7 does not support this syntax.

WITH cc AS (
SELECT
floor(max(id) * rand())
FROM
t_test_info
) SELECT
*
FROM
t_test_info
WHERE
area_code = '5400'
AND id >= (SELECT * FROM cc)
ORDER BY
id
LIMIT 1;

Postscript: after consulting the official documentation, I am more inclined to think the cause is the UNCACHEABLE SUBQUERY in the execution plan.

https://dev.mysql.com/doc/refman/5.7/en/explain-output.html

https://dev.mysql.com/doc/refman/5.7/en/query-cache-operation.html

Also refer to "MySQL: The Influence of the Number of Query Fields on Query Efficiency"
