Execution Plan Cache: The Secret Behind the Prepared Statement Performance Jump

Abstract: Let's take a look at how GaussDB (for MySQL) caches execution plans and speeds up Prepared Statements.

This article is shared from the HUAWEI CLOUD community: "Execution Plan Cache, The Secret of Prepared Statement Performance Jump", author: GaussDB database.

Introduction

In a database system, an SQL (Structured Query Language) statement generally passes through four stages after it is submitted: lexical and syntax analysis (parse), rewriting (resolve), optimization (optimize), and execution (execute). The first three stages produce the statement's execution plan. When a statement has several candidate plans, the optimizer picks the one it estimates to be best (usually the one that consumes the fewest system resources, such as CPU and I/O) and hands it to the executor to run. Generating an execution plan can take considerable time, especially when there are many candidate plans.

Figure 1: SQL statement execution

A Prepared Statement replaces the literal values in an SQL statement with placeholders, so it can be regarded as a templated, or parameterized, SQL statement. When a PREPARE statement is executed, traditional MySQL parses and rewrites the specified statement, shown as ① and ② in the figure above. This phase is called the precompilation phase. The advantage of a Prepared Statement is that it is compiled once and run many times, so the cost of precompilation is paid only once. When an EXECUTE command is issued later, MySQL optimizes the structure produced by the precompilation phase (③ in the figure above), generates the corresponding execution plan, executes it, and returns the result to the client. For example:

PREPARE stmt FROM 'SELECT * FROM t WHERE t.a = ?';
SET @var = 2;
EXECUTE stmt USING @var;

A Prepared Statement in traditional MySQL saves only the time needed to parse and rewrite the SQL statement. As noted at the beginning of this article, however, optimizing a statement and generating its execution plan also consumes significant resources and time. If the final execution plan of a Prepared Statement is cached, EXECUTE can use the cached plan directly, skip plan generation entirely, and so improve the statement's execution performance. To this end, GaussDB (for MySQL) provides the Prepared Statement execution plan cache feature.

Next, let's look at how GaussDB (for MySQL) caches execution plans and speeds up Prepared Statements.

How execution plan caching works

The basic principle and flow by which GaussDB (for MySQL) caches Prepared Statement execution plans are as follows:

  • The query is executed in response to EXECUTE.
  • The is_plan_cached routine checks whether an execution plan for the current query has already been cached.
  • If a plan is cached, the optimizer initializes the cached plan for the current query, restores it from its saved context, and execution continues with the restored plan.
  • If no plan is cached, the query is optimized and an execution plan is generated; the is_query_cachable routine then checks whether that plan can be cached.
  • If the caching conditions are met, the plan is cached (by calling cache_JOIN_plan) so that later EXECUTE statements can reuse it.
  • If the plan cannot be cached, the EXECUTE statement follows the traditional MySQL path: optimize, generate an execution plan, then execute it.
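
The flow above can be sketched as a small toy model in Python. This is only an illustration, not GaussDB (for MySQL) internals: the names is_plan_cached, is_query_cachable and cache_JOIN_plan come from the steps above, while PlanCache, optimize, select_only and demo are hypothetical helpers, and a plan is represented by a plain string.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PlanCache:
    """Toy per-statement plan cache (hypothetical, for illustration only)."""
    plans: Dict[str, str] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)

    def is_plan_cached(self, stmt: str) -> bool:
        return stmt in self.plans

    def execute(self, stmt: str,
                optimize: Callable[[str], str],
                is_query_cachable: Callable[[str], bool]) -> str:
        if self.is_plan_cached(stmt):
            # Cache hit: reuse the saved plan and skip optimization.
            self.log.append("hit")
            return self.plans[stmt]
        # Cache miss: optimize as traditional MySQL would.
        plan = optimize(stmt)
        if is_query_cachable(stmt):
            # Stands in for the cache_JOIN_plan call.
            self.plans[stmt] = plan
            self.log.append("cached")
        else:
            self.log.append("not cached")
        return plan


def optimize(stmt: str) -> str:
    # Hypothetical optimizer: a real one would build a plan tree.
    return f"plan({stmt})"


def select_only(stmt: str) -> bool:
    # Hypothetical cachability rule: only SELECT statements qualify.
    return stmt.lstrip().upper().startswith("SELECT")


def demo() -> List[str]:
    cache = PlanCache()
    for _ in range(3):
        cache.execute("SELECT * FROM t WHERE a = ?", optimize, select_only)
    cache.execute("UPDATE t SET a = ? WHERE b = ?", optimize, select_only)
    return cache.log
```

In this sketch, the first EXECUTE of the SELECT caches its plan, the next two reuse it, and the UPDATE always takes the traditional path.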

Execution plan cache management

  • Execution plan cache function switch

GaussDB (for MySQL) introduces a new system parameter rds_plan_cache to switch the Prepared Statement execution plan cache function.

rds_plan_cache: This parameter can be set to ON or OFF, which enables or disables the execution plan cache respectively. It is a Session/Global level parameter.

  • View execution plan cache

GaussDB (for MySQL) provides two status variables for users to view or verify whether the Prepared Statement execution plan is cached, and whether the cached execution plan is hit during execution.

  • cached_plan_count: Shows how many Prepared Statements have cached execution plans. This is a Global level status variable.
  • cached_plan_hits: Shows how many times a cached execution plan was hit during EXECUTE. This is a Session/Global level status variable.

Let's take an example to see how Prepared Statement utilizes the execution plan cache feature:

SET @a = 'two';
SET @b = 3;
PREPARE stmt FROM "SELECT * FROM t1 WHERE b = ? AND c = ?";
EXECUTE stmt USING @a,@b;

The execution results are as follows:

a b c
6 two 3

Execute the Prepared Statement again:

EXECUTE stmt USING @a,@b;
a b c
6 two 3

Execute the Prepared Statement for the third time:

EXECUTE stmt USING @a,@b;
a b c
6 two 3

Use cached_plan_count and cached_plan_hits to check whether the execution plan of stmt was cached and whether the cached plan was hit during execution.

SHOW SESSION STATUS LIKE "cached_plan%";

The display results are as follows:

Variable_name Value
Cached_plan_count 1
Cached_plan_hits 2

The results show that the first EXECUTE caused the Prepared Statement's execution plan to be cached, so Cached_plan_count is 1. The second and third EXECUTE statements both hit the cached plan, so Cached_plan_hits is 2.

How cached execution plans are invalidated

In order to keep the current cached execution plan as optimal as possible, GaussDB (for MySQL) defines the following rules to invalidate the current cached plan and regenerate the execution plan:

  • The number of records in a table used by the execution plan has changed by more than 20% of the total number of records.
    That is, if more than 20% of the records in the table have been inserted or deleted, the cached plan is invalidated, and the plan is re-cached after the query is optimized again. Note: the number of records is estimated from table statistics, so it is best to analyze the table first.
  • The table definition has changed.
    For example, DDL performed on a table used by the execution plan invalidates the cached plan, which is re-cached after optimization.
  • An option in the optimizer_switch system variable that affects execution plan generation has changed. The cached plan is invalidated and re-cached after optimization.
  • The system character set has changed and differs from the one the plan was cached with. The cached plan is invalidated and re-cached after optimization.
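
As a rough illustration of these four rules, the check might look like the Python sketch below. Everything here is an assumption made for the example: PlanContext, its fields and is_plan_stale are hypothetical names, the row-count rule is modeled as a net change in the estimated row count, and the whole optimizer_switch string is compared rather than only the options that affect plan generation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanContext:
    """Snapshot of the conditions a plan was cached under (hypothetical)."""
    row_estimate: int       # row count estimated from table statistics
    table_version: int      # assumed to be bumped by any DDL on the table
    optimizer_switch: str   # optimizer_switch value at caching time
    charset: str            # character set at caching time


def is_plan_stale(cached: PlanContext, now: PlanContext,
                  threshold: float = 0.20) -> bool:
    """Return True if any of the four invalidation rules fires."""
    changed = abs(now.row_estimate - cached.row_estimate)
    if changed > threshold * cached.row_estimate:
        return True   # rule 1: row count changed by more than 20%
    if now.table_version != cached.table_version:
        return True   # rule 2: table definition changed
    if now.optimizer_switch != cached.optimizer_switch:
        return True   # rule 3: optimizer_switch changed (simplified)
    return now.charset != cached.charset   # rule 4: character set changed
```

A stale plan would be dropped and the next EXECUTE would go through optimization again, after which the new plan is cached.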

Some current limitations of the execution plan caching feature

The purpose of the Prepared Statement execution plan cache in GaussDB (for MySQL) is to save query optimization time. Large queries that are optimized through parallel query, that is, queries over relatively large amounts of data, spend most of their time in the execution phase. For such queries the optimization time is negligible compared with the execution time, so GaussDB (for MySQL) does not cache parallel query plans. In addition, GaussDB (for MySQL) is gradually extending the range of Prepared Statements whose plans can be cached. At present it supports only single-table SELECT statements, and UNION operations are not yet supported.
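
The limitations listed here amount to a simple predicate. The Python sketch below is a guess at its shape, not the actual implementation: is_query_cachable is the routine name used earlier in this article, but QueryInfo and its fields are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryInfo:
    """Minimal description of a prepared query (hypothetical fields)."""
    statement_type: str        # e.g. "SELECT" or "UPDATE"
    table_count: int           # number of tables the query references
    has_union: bool            # True if the query contains UNION
    uses_parallel_query: bool  # True if planned as a parallel query


def is_query_cachable(q: QueryInfo) -> bool:
    # Only single-table SELECT statements without UNION qualify,
    # and parallel query plans are never cached.
    return (q.statement_type == "SELECT"
            and q.table_count == 1
            and not q.has_union
            and not q.uses_parallel_query)
```

As more query types become supported, a predicate like this would simply accept more cases.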

Execution plan cache performance test results

Performance was compared with the execution plan cache enabled and disabled, using the Sysbench test suite. The results show that performance improves across the tested workloads once the execution plan cache is enabled. NOTE: These tests represent relative numbers only and do not represent actual performance.

The test environment configuration is as follows:

Dataset: 8 tables, 10 million rows per table
Test server: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz, 2 physical CPUs, 56 logical processors, 460 GB memory

Summary

GaussDB (for MySQL) improves Prepared Statement performance by caching execution plans. For the Range Scan tests in particular, performance improves by about 2x. More query scenarios will be supported in the future, bringing further performance gains.

 


Origin my.oschina.net/u/4526289/blog/9803537