AI features for frontier technology exploration: slow SQL discovery

SQLdiag: Slow SQL Discovery

SQLdiag is a tool in openGauss for predicting the execution time of SQL statements. Existing prediction techniques are mostly based on the execution plan, but such schemes only apply to OLAP scenarios where an execution plan can be obtained, and they are of little value for short, simple queries such as those in OLTP or HTAP workloads. Unlike those solutions, SQLdiag focuses on the database's historical SQL statements: it summarizes the execution performance of historical statements and then uses that knowledge to infer the behavior of new, unknown workloads. Because the execution time of a SQL statement rarely changes much over a short period, SQLdiag can find result sets in the historical data that are similar to statements already executed and, based on SQL vectorization technology and a template-based method, predict the execution time of new SQL statements. This tool has the following advantages:

  • Execution plans of SQL statements are not required, so the tool has no impact on database performance.
  • It is applicable to a wide range of scenarios. Many algorithms in the industry are rather limited, for example applying only to OLTP or only to OLAP, whereas SQLdiag covers both.
  • The framework is easy to understand, and a custom prediction model can be trained with a few simple operations.

A typical application scenario of this tool is to examine a batch of SQL statements before they go live and identify risks in advance.

Overview

SQLdiag is a tool for predicting the execution time of SQL statements. Using a template-based method or a deep learning method, it predicts the execution time of a SQL statement from the logical similarity of statements and from historical execution records, without requiring the statement's execution plan, and thereby discovers abnormal SQL statements.
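
To make the idea concrete, here is a minimal, self-contained sketch of a template-style predictor. It only illustrates the principle and is not the algorithm actually implemented in SQLdiag: statements are reduced to templates by masking literals, and a new statement is estimated from the average execution time of historical statements sharing the same template.

    import re
    from collections import defaultdict

    def to_template(sql):
        """Reduce a SQL statement to a coarse template by masking literals."""
        sql = sql.strip().lower()
        sql = re.sub(r"'[^']*'", "?", sql)           # mask string literals
        sql = re.sub(r"\b\d+(\.\d+)?\b", "?", sql)   # mask numeric literals
        return re.sub(r"\s+", " ", sql)

    class TemplatePredictor:
        """Toy predictor: average historical duration per template."""
        def __init__(self):
            self.history = defaultdict(list)

        def train(self, samples):
            # samples: iterable of (sql, execution_time) pairs
            for sql, duration in samples:
                self.history[to_template(sql)].append(float(duration))

        def predict(self, sql):
            durations = self.history.get(to_template(sql))
            if not durations:
                return None  # unseen template: no estimate available
            return sum(durations) / len(durations)

    model = TemplatePredictor()
    model.train([("select * from t where id = 1", 0.12),
                 ("select * from t where id = 42", 0.10)])
    print(model.predict("select * from t where id = 7"))  # approximately 0.11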

User guide

Prerequisites

  • The user must provide training data.
  • If the training data is collected with the provided tool, the WDR function must be enabled. The parameters involved are track_stmt_stat_level and log_min_duration_statement; see the description below for details.
  • To ensure prediction accuracy, the historical statement logs provided by the user should be as comprehensive and representative as possible.
  • A Python 3.6+ environment and its dependencies must be configured as required.

Environment configuration

This feature requires Python 3.6 or later. The required third-party dependencies are listed in the requirements.txt file and can be installed with pip, for example:

pip install -r requirements.txt

SQL statement collection

This tool requires the user to prepare data in advance. The training data has the following format, with one sample per line:

SQL,EXECUTION_TIME

The prediction data format is as follows:

SQL

SQL represents the text of the SQL statement, and EXECUTION_TIME represents the execution time of the SQL statement. For sample data, see train.csv and predict.csv in sample_data.
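
If you build the data files yourself, a small helper such as the one below can write them in the expected layout. The statements and timings are made-up examples, and the execution-time unit should match the one used in your own logs:

    import csv

    # Hypothetical training samples: (SQL statement, execution time)
    train_samples = [
        ("select count(*) from bmsql_item", 0.25),
        ("select * from bmsql_customer where c_id = 10", 0.01),
    ]
    # Hypothetical statements to predict: SQL only
    predict_samples = [
        "select count(*) from bmsql_item",
    ]

    # Training data: SQL,EXECUTION_TIME
    with open("train.csv", "w", newline="") as f:
        csv.writer(f).writerows(train_samples)

    # Prediction data: SQL
    with open("predict.csv", "w", newline="") as f:
        csv.writer(f).writerows([stmt] for stmt in predict_samples)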

Users can collect training data themselves in the required format. The tool also provides a collection script (load_sql_from_wdr.py) that obtains SQL information from the WDR report. The parameters involved are log_min_duration_statement and track_stmt_stat_level:

  • log_min_duration_statement is the slow SQL threshold in milliseconds; a value of 0 collects all statements.
  • track_stmt_stat_level controls the level of information capture; the recommended setting is track_stmt_stat_level='L0,L0'. Once this parameter is enabled it consumes some system resources, but usually not many. Under sustained high concurrency the performance loss stays below 5%; when database concurrency is low, the loss is negligible. A quick way to check the current values of both parameters is sketched after this list.
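
As mentioned above, you can check the current values of both parameters before starting collection. The snippet below is a minimal sketch that assumes a Python driver able to connect to your instance (openGauss ships an adapted psycopg2 package); the host, port, database, and credentials are placeholders:

    import psycopg2  # use the psycopg2 build provided for openGauss

    conn = psycopg2.connect(host="127.0.0.1", port=5432,
                            dbname="postgres", user="dbuser",
                            password="********")  # placeholder credentials
    with conn, conn.cursor() as cur:
        for guc in ("log_min_duration_statement", "track_stmt_stat_level"):
            cur.execute("SHOW " + guc)
            print(guc, "=", cur.fetchone()[0])
    conn.close()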

How to obtain the training set using the script:
load_sql_from_wdr.py [-h] --port PORT --start_time START_TIME
                            --finish_time FINISH_TIME [--save_path SAVE_PATH]
For example:
    python load_sql_from_wdr.py --start_time "2021-04-25 00:00:00" --finish_time "2021-04-26 14:00:00" --port 5432  --save_path ./data.csv
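
If collection needs to run on a schedule, the same command can also be issued from Python. The sketch below simply wraps the documented arguments of load_sql_from_wdr.py; the port, time window, and save path are example values:

    import subprocess

    def collect_wdr_sql(port, start_time, finish_time, save_path="./data.csv"):
        """Invoke load_sql_from_wdr.py with the documented arguments."""
        cmd = ["python", "load_sql_from_wdr.py",
               "--port", str(port),
               "--start_time", start_time,
               "--finish_time", finish_time,
               "--save_path", save_path]
        subprocess.run(cmd, check=True)

    collect_wdr_sql(5432, "2021-04-25 00:00:00", "2021-04-26 14:00:00")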

Steps

  1. Provide historical logs for model training
  2. Perform training and prediction operations:
Training and prediction based on the template method:
    python main.py [train, predict] -f FILE --model template --model-path template_model_path 
Training and prediction based on DNN:
    python main.py [train, predict] -f FILE --model dnn --model-path dnn_model_path

Usage examples

Execute the following commands in the root directory of this tool to perform the corresponding functions.

Use the provided test data for templated training:

python main.py train -f ./sample_data/train.csv --model template --model-path ./template 

Use the provided test data to make templated predictions:

python main.py predict -f ./sample_data/predict.csv --model template --model-path ./template --predicted-file ./result/t_result

Use the provided test data for templated model updates:

python main.py finetune -f ./sample_data/train.csv --model template --model-path ./template 

DNN training using the provided test data:

python main.py train -f ./sample_data/train.csv --model dnn --model-path ./dnn_model 

DNN predictions using the provided test data:

python main.py predict -f ./sample_data/predict.csv --model dnn --model-path ./dnn_model --predicted-file 

DNN model update using provided test data:

python main.py finetune -f ./sample_data/train.csv --model dnn --model-path ./dnn_model
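
To run the whole workflow unattended, the commands above can be chained in a small driver script. The sketch below assumes it is executed from the tool's root directory and reuses the paths from the examples:

    import subprocess

    MODEL = "template"         # or "dnn"
    MODEL_PATH = "./template"  # "./dnn_model" for the DNN model

    def run(mode, data_file, extra=()):
        """Call main.py with the documented command-line arguments."""
        cmd = ["python", "main.py", mode, "-f", data_file,
               "--model", MODEL, "--model-path", MODEL_PATH, *extra]
        subprocess.run(cmd, check=True)

    # Train, then predict, then update the model with new history.
    run("train", "./sample_data/train.csv")
    run("predict", "./sample_data/predict.csv",
        extra=("--predicted-file", "./result/t_result"))
    run("finetune", "./sample_data/train.csv")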

Obtaining help

Before using the SQLdiag tool, you can run the following command to view the help information.

python main.py --help 

The following help information is displayed:

usage: main.py [-h] [-f CSV_FILE] [--predicted-file PREDICTED_FILE]
               [--model {template,dnn}] --model-path MODEL_PATH
               [--config-file CONFIG_FILE]
               {train,predict,finetune}

SQLdiag integrated by openGauss.

positional arguments:
  {train,predict,finetune}
                        The training mode is to perform feature extraction and
                        model training based on historical SQL statements. The
                        prediction mode is to predict the execution time of a
                        new SQL statement through the trained model.

optional arguments:
  -h, --help            show this help message and exit
  -f CSV_FILE, --csv-file CSV_FILE
                        The data set for training or prediction. The file
                        format is CSV. If it is two columns, the format is
                        (SQL statement, duration time). If it is three
                        columns, the format is (timestamp of SQL statement
                        execution time, SQL statement, duration time).
  --predicted-file PREDICTED_FILE
                        The file path to save the predicted result.
  --model {template,dnn}
                        Choose the model to use.
  --model-path MODEL_PATH
                        The storage path of the model file, used to read or
                        save the model file.
  --config-file CONFIG_FILE

Command reference

Table 1 Description of command-line parameters

Parameter          Description                                      Value range
-f                 Location of the training or prediction file
--predicted-file   Location where the prediction result is saved
--model            Model selection                                  template, dnn
--model-path       Storage location of the trained model

FAQ

Failed to connect to the database instance: check the status of the database instance and whether the security permission configuration (configuration items in the pg_hba.conf file) is correct.
Failed to restart: check the health status of the database instance and make sure it is working properly.
Dependency installation failed: upgrade the pip package management tool first, using the command python -m pip install --upgrade pip.
Performance keeps degrading while running TPC-C jobs: stress tests in high-concurrency scenarios such as TPC-C are usually accompanied by a large number of data modifications. Because each run is not idempotent (the data volume of the TPC-C database grows, no vacuum full is performed to clean up dead tuples, the database does not trigger a checkpoint, no cache drop is performed, and so on), it is generally recommended that benchmarks such as TPC-C that write a lot of data re-import the data at regular intervals (depending on the concurrency and execution time). A simpler method is to back up the $PGDATA directory and restore it before each run.
The TPC-C driver script reports the exception "TypeError: float() argument must be a string or a number, not 'NoneType'" (None cannot be converted to float): this occurs because the TPC-C stress test did not return a result. There are many possible causes; first manually check whether TPC-C can run through and return results. If it can, it is recommended to increase the delay of the "sleep" command in the command list of the TPC-C driver script.
