Finding statements with high CPU usage on nodes in a GaussDB(DWS) cluster

Abstract: This article uses an example to show how to find statements with high CPU usage using the gs_cpuwatcher.sh script.

This article is shared from the HUAWEI CLOUD community post "GaussDB(DWS) gs_cpuwatcher.sh script how to find high CPU usage statements", author: fightingman.

【Tool name】

gs_cpuwatcher

【Function description】

1. Find the statements that consume high CPU on nodes in the cluster

【Applicable scenarios】

  1. High system (sys) CPU usage
  2. The overall workload is slow

【Parameter Description】

None

【Instructions】

  1. Run the script directly in the background:

nohup sh gs_cpuwatcher.sh > cpuwatcher.log 2>&1 &

Notes before execution:

  • Run the script as the omm user (offline) or the Ruby user (online).
  • Place the script in a directory with sufficient disk space, preferably more than 20G free. Monitoring generates logs that occupy disk space and could otherwise fill the disk.
  • When monitoring is finished, kill the monitoring process; a forgotten script keeps writing monitoring logs. The script keeps logs for 3 days by default.
  • The script queries for high-CPU statements only when the CPU usage of the process exceeds 100% (summed across cores, so 100 corresponds to one fully used core).
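
The notes above can be wrapped into a small launcher so that the disk check and the final cleanup are not forgotten. This is a minimal sketch, not part of gs_cpuwatcher.sh: the function names and the cpuwatcher.pid file are hypothetical, and only the nohup command line and the 20G figure come from the article.

```shell
#!/bin/sh
# Hypothetical helpers around gs_cpuwatcher.sh; names are illustrative only.

# Return 0 if the filesystem holding DIR has at least MIN_GB gigabytes free.
has_free_space_gb() {
    dir=$1
    min_gb=$2
    # df -Pk prints one line per filesystem; column 4 is free space in KiB.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Start the watcher in the background (same command as in the article)
# and record its PID. Optional argument: required free space in GB (default 20).
start_watcher() {
    need_gb=${1:-20}
    if ! has_free_space_gb . "$need_gb"; then
        echo "less than ${need_gb}G free in $(pwd), not starting" >&2
        return 1
    fi
    nohup sh gs_cpuwatcher.sh > cpuwatcher.log 2>&1 &
    echo $! > cpuwatcher.pid
}

# Kill the watcher started by start_watcher and remove the PID file.
stop_watcher() {
    [ -f cpuwatcher.pid ] || return 1
    kill "$(cat cpuwatcher.pid)" 2>/dev/null
    rm -f cpuwatcher.pid
}
```

After sourcing these functions in the script directory, `start_watcher` begins monitoring and `stop_watcher` ends it. Recording the PID is only one way to find the process again; `ps -ef | grep gs_cpuwatcher` works as well.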

【Best Practice & Result Analysis】

After running the monitoring command, check the monitoring logs generated in the current directory.

The log cpu_watch_xxx.log records the statements with high CPU usage.

For example, the figure above shows a statement beginning with select * from pg_class a, pg_class. By default the script keeps only the first 50 characters of each SQL statement; to change the truncation length, modify the script.
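
The article does not show the query that gs_cpuwatcher.sh runs, so the following is only a sketch of how a 50-character truncation could be made configurable. It assumes the truncation is done with the SQL substr() function (suggested by the substr column in the log) against pg_stat_activity. The function name build_query and the variable SQL_PREFIX_LEN are hypothetical, and the lwtid and wait_status columns are left out.

```shell
#!/bin/sh
# Hypothetical sketch; the real query inside gs_cpuwatcher.sh may differ.

# Number of leading SQL characters to keep (the article's default is 50).
SQL_PREFIX_LEN=${SQL_PREFIX_LEN:-50}

# Print a query that lists active statements, longest-running first,
# keeping only the first LEN characters of each statement.
build_query() {
    len=$1
    cat <<EOF
select now() - query_start as dur, query_start as start, state_change,
       usename, datname, query_id, pid, client_addr, state,
       substr(query, 1, ${len})
from pg_stat_activity
where state = 'active'
order by dur desc;
EOF
}
```

A query built this way could be passed to gsql, for example `gsql -d postgres -p <port> -c "$(build_query 200)"`, where the database name and port depend on the cluster.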

Field explanation:

  1. dur: execution duration of the SQL statement
  2. start: start time of the SQL statement
  3. state_change: time of the last state change of the SQL statement
  4. usename: user name
  5. datname: name of the connected database
  6. query_id: unique ID of the SQL statement
  7. pid: thread ID
  8. client_addr: IP address of the client connection
  9. state: execution state of the SQL statement
  10. lwtid: lightweight thread ID
  11. wait_status: the wait status field from the wait view
  12. substr: the SQL text, truncated (first 50 characters by default)
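
To cross-check a logged lwtid against the operating system, the per-thread CPU time of a process can be read from /proc on Linux. The sketch below assumes that lwtid corresponds to the thread IDs listed under /proc/PID/task, which the article does not confirm; the function name thread_cpu_ticks is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list the threads of a process with their cumulative
# CPU time (user + system, in clock ticks), busiest thread first. Linux only.
thread_cpu_ticks() {
    pid=$1
    for stat in /proc/"$pid"/task/*/stat; do
        [ -r "$stat" ] || continue
        # The thread ID is the directory name just above "stat".
        tid=${stat%/stat}
        tid=${tid##*/}
        # Drop "pid (comm) " first because comm may contain spaces; in the
        # remainder, utime is field 12 and stime is field 13.
        ticks=$(sed 's/^.*) //' "$stat" 2>/dev/null | awk '{ print $12 + $13 }')
        if [ -n "$ticks" ]; then
            echo "$tid $ticks"
        fi
    done | sort -k2,2nr
}
```

Calling `thread_cpu_ticks <pid> | head` with the PID of the database process prints the threads that have accumulated the most CPU time. The values are totals since each thread started, so two samples taken a few seconds apart are needed to see which threads are busy right now.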

 


Origin my.oschina.net/u/4526289/blog/8631758