Operators in a data warehouse are hard to observe in real time. Come and try operator-level monitoring.

This article is shared from the Huawei Cloud Community post "GaussDB (DWS) Monitoring Tool Guide (4): Operator-Level Monitoring [Bloom! GaussDB (DWS) Cloud Native Data Warehouse]", author: The little black claw behind the scenes.

As data volumes grow and data processing becomes more complex, database performance problems become more prominent. Applications also access the database more often and read more data each time. Optimizing database performance has therefore become an important task for database administrators and developers. SQL performance tuning improves response time and throughput, reduces resource consumption, and improves system stability and reliability, which in turn improves application performance and user experience. The existing explain tool of GaussDB (DWS) cannot meet users' need to locate problems in real time, so DWS has introduced operator-level monitoring to make running operators observable.

1. Requirement description

For example, after issuing a statement, a user cannot tell whether the execution plan generated for it is reasonable, how far execution has progressed, or how many resources it is consuming. As shown in the figure below, the user sees only how long the execution took. What happened after the statement was issued, how each operator ran, how the operators interacted, and whether the generated plan was reasonable all remain invisible.

cke_114.png

To this end, DWS provides explain performance for post-mortem analysis. However, explain performance shows results only after the statement has finished executing. For new business statements, it is often unknown how long they will run, or even whether they will finish and produce results, so they cannot be analyzed directly through explain performance.

Therefore, there is an urgent need for a means of observing the execution of statement operators in real time to determine the optimization points of the execution plan for SQL tuning.

2. Solution

To address this, GaussDB (DWS) introduced operator monitoring in version 8.2.1. Operator monitoring shows how a statement is actually being executed and tracks the progress and resource consumption of each individual operator. The usage steps are as follows:

1) Set the GUC parameter resource_track_level to operator_realtime, and then execute the statement;

2) Open a new session, connect to GaussDB, and either query pgxc_wlm_operator_statistics to see all statements in the cluster that have operator monitoring enabled, or call pg_stat_get_wlm_realtime_operator_info(queryid) to get the information for the statement with that queryid.

select * from pgxc_wlm_operator_statistics;

Note: This feature has some impact on performance. In baseline tests, performance dropped by up to about 2% under otherwise identical conditions. It is recommended to enable it only when tracking down performance issues.
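The two steps above can be sketched as follows. This is a minimal sketch and not a verified procedure: it assumes the parameter can be set at session level with SET, and both the table name my_big_table and the queryid value 12345 are hypothetical placeholders.

```sql
-- Session 1: enable real-time operator tracking, then run the statement.
-- Assumption: resource_track_level can be changed with a session-level SET.
SET resource_track_level = 'operator_realtime';
SELECT count(*) FROM my_big_table;  -- hypothetical statement under investigation

-- Session 2: list all statements in the cluster with operator monitoring enabled.
SELECT * FROM pgxc_wlm_operator_statistics;

-- Session 2: operators of one statement only (12345 is a placeholder queryid).
SELECT * FROM pg_stat_get_wlm_realtime_operator_info(12345);
```

The real queryid can be read from the queryid column of the first query in session 2.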

Like statement monitoring, operator monitoring records both static and dynamic (runtime) information about a statement.

1) Static information is generated by the optimizer before the statement is actually executed, such as the plan operator name (plan_node_name), queryid, and estimated row count. It can be used to analyze whether the generated execution plan is appropriate.

2) Dynamic information is the resource usage recorded while the statement runs in the executor, collected in real time from the different DNs: operator execution progress (progress), peak memory (peak_memory), data spilled to disk by the operator (spill_size), network traffic (net_size), disk I/O (read_bytes, write_bytes), CPU time (cpu_time), and so on. It can be used to analyze progress and resource consumption during execution and to see where the statement spends most of its time, which makes subsequent optimization easier.
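To look only at the fields described above rather than every column, a query along these lines may help. It is a sketch that assumes the view exposes columns under exactly the names used in this article; the actual view definition may differ between versions.

```sql
-- Assumption: the column names below are those mentioned in this article.
SELECT queryid,
       plan_node_name,   -- static: operator name from the plan
       progress,         -- dynamic: operator execution progress
       peak_memory,      -- dynamic: peak memory
       spill_size,       -- dynamic: data spilled to disk
       net_size,         -- dynamic: network traffic
       read_bytes,       -- dynamic: disk reads
       write_bytes,      -- dynamic: disk writes
       cpu_time          -- dynamic: CPU time
FROM pgxc_wlm_operator_statistics
ORDER BY queryid;
```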

3. Actual use

We issue a query and query the operator view in another session. The results are as follows:

cke_115.png

1) Current operator progress: The progress field shows the running progress of the current operator. For the first operator, this field shows the overall progress of the current statement.

2) Refresh the view repeatedly to follow the statement's execution status. An operator whose progress is strictly between 0 and 100 is currently running.

3) Observe the actual resource consumption of the current operator and determine the cause of possible blocking.
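The observations above can be combined into a single query that is re-run while the statement executes. This is a sketch under two assumptions: that progress is stored as a numeric value (if it is stored as text, the comparison needs a cast), and that the column names match those used in this article. The queryid 12345 is a placeholder.

```sql
-- Operators currently running for one statement, most CPU-intensive first.
-- Assumptions: progress is numeric; 12345 is a placeholder queryid.
SELECT plan_node_name, progress, cpu_time, peak_memory, spill_size
FROM pg_stat_get_wlm_realtime_operator_info(12345)
WHERE progress > 0 AND progress < 100
ORDER BY cpu_time DESC;
```

If the same operator stays at the top across refreshes while its progress barely moves, it is a natural first candidate for investigation.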

4. Summary

In the past, operator runtime information could only be obtained after explain performance had finished executing. Now it can be read from this view while the statement is running, and querying the view does not affect the statement's result set. The view lets users monitor statement operators in real time and reflects the execution of a statement more accurately. By looking for operators that run for a long time or consume many resources, you can judge whether the generated plan is reasonable, and by watching the progress field you can follow statement execution and locate SQL performance problems. This view can also be combined with other runtime views to determine the cause of slow SQL and decide how to tune it.



Origin my.oschina.net/u/4526289/blog/10149383