You have mastered the theory of data warehouse resource management and control; now it is time for hands-on practice

This article is shared from the HUAWEI CLOUD community post "Live Stream Review | You Have Mastered the Theory of Data Warehouse Resource Management and Control; Now It Is Time for Hands-On Practice", author: Hu Latang.

In a mixed-load scenario, how can you operate and maintain a database efficiently and keep the system from being overloaded? What does GaussDB (DWS) resource management contribute to stable and reliable database operation? In this live broadcast, "Data Warehouse Experts Teach You Resource Management and Operation and Maintenance Practices", we invited Lu Pengbo, Huawei Cloud GaussDB (DWS) technical evangelist, to explain the principles of GaussDB (DWS) resource management and system operation and maintenance practices, and to interact with developers.

GaussDB (DWS) software architecture and operation and maintenance challenges

The core components of GaussDB (DWS) are the CN and the DN, plus the auxiliary components GDS, OM, CM and GTM. The resource management and control discussed in this live broadcast focuses on the CN and the DN. The CN is the coordinator node of GaussDB (DWS); it parses and optimizes SQL statements and generates execution plans. The DN is the data storage node; it executes the plan and returns the result to the CN. For example, when a business application sends a SQL statement to a GaussDB (DWS) cluster, load balancing first distributes the statement at random to one of the CNs. That CN parses the statement, generates the corresponding execution plan, and sends the plan to the DNs for execution. When the DNs finish, they return their results to the CN, which returns the final result to the user.

GaussDB (DWS) adopts a shared-nothing architecture, which supports adding nodes to scale out and improve overall cluster performance. Using the data warehouse also brings some challenges. First, many SQL statements sent at the same time compete for execution resources, which can overload the database and cause large SQL statements to fail. Second, statement execution is a black box: users cannot see how statements are executing or what resources they use. The resource management and control functions address these two challenges.

GaussDB (DWS) resource control function panorama

Introduction to the principle of GaussDB (DWS) resource management and control functions

After a user sends a SQL statement, the statement first passes through load management and is then sent to the DNs for execution. During execution the job is subject to resource control, which here mainly means CPU control and space control. An auxiliary thread also collects data during execution, recording in real time the computing resources the statement occupies, including CPU, memory, disk I/O and network. GaussDB (DWS) also provides operation and maintenance tools, such as the TopSQL views and the resource monitoring views, for analyzing and locating problems. To prevent bad SQL from degrading the performance of the whole cluster, GaussDB (DWS) provides exception rules. The cluster has some exception rules by default, and users can define their own, for example on execution time and queuing time. A job that exceeds a limit is stopped or downgraded.

Within load management, a job first passes through a global concurrency queue and then enters the GaussDB (DWS) resource pool queue. Along the way, load management classifies jobs as short queries (also called simple queries) or long queries (also called complex queries). Enabling the short query function improves query performance and operating efficiency.
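The exception rules mentioned above can be pictured with a small Python sketch. It is only an illustration under assumptions: the class names, the metric names `queue_seconds` and `elapsed_seconds`, the thresholds and the action strings are all hypothetical and are not GaussDB (DWS) syntax or defaults. It shows the idea of checking a job against time limits and choosing to stop or downgrade it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExceptionRule:
    """A hypothetical rule: a time limit plus the action taken when exceeded."""
    metric: str           # "queue_seconds" or "elapsed_seconds" (illustrative names)
    limit_seconds: float
    action: str           # "terminate" or "downgrade" (illustrative names)


@dataclass
class Job:
    job_id: int
    queue_seconds: float    # time spent waiting in a queue
    elapsed_seconds: float  # time spent executing


def evaluate(job: Job, rules: list[ExceptionRule]) -> Optional[str]:
    """Return the action of the first rule the job exceeds, or None."""
    for rule in rules:
        if getattr(job, rule.metric) > rule.limit_seconds:
            return rule.action
    return None


# Assumed thresholds, chosen only for the example.
rules = [
    ExceptionRule("queue_seconds", 600, "terminate"),
    ExceptionRule("elapsed_seconds", 1800, "downgrade"),
]

jobs = [
    Job(1, queue_seconds=5, elapsed_seconds=12),
    Job(2, queue_seconds=900, elapsed_seconds=0),
    Job(3, queue_seconds=3, elapsed_seconds=4000),
]

actions = {job.job_id: evaluate(job, rules) for job in jobs}
print(actions)  # {1: None, 2: 'terminate', 3: 'downgrade'}
```

Rules are checked in order and the first match wins, which is a design choice of this sketch rather than documented product behavior.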

Load control: query scheduling, which includes concurrency-based scheduling and scheduling based on estimated memory. Query scheduling staggers query execution away from peaks, so that excessive concurrency does not cause severe resource contention and a backlog of queries.
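As a rough illustration of the two scheduling criteria, the following Python sketch admits a query only when a resource pool has both a free concurrency slot and enough estimated memory, and queues it otherwise. The `ResourcePool` and `Query` classes, the limits and the strict first-in-first-out policy are assumptions made for the example, not the GaussDB (DWS) implementation.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Query:
    name: str
    est_mem_mb: int  # memory estimate for the query (illustrative value)


@dataclass
class ResourcePool:
    max_concurrency: int
    mem_limit_mb: int
    running: list = field(default_factory=list)
    waiting: deque = field(default_factory=deque)

    def _fits(self, query: Query) -> bool:
        """True if a slot is free and the estimated memory still fits."""
        used = sum(q.est_mem_mb for q in self.running)
        return (len(self.running) < self.max_concurrency
                and used + query.est_mem_mb <= self.mem_limit_mb)

    def submit(self, query: Query) -> str:
        # Keep FIFO order: a new query never overtakes queued ones.
        if not self.waiting and self._fits(query):
            self.running.append(query)
            return "running"
        self.waiting.append(query)
        return "queued"

    def finish(self, query: Query) -> list:
        """Release the query's slot and memory, then admit queued queries in order."""
        self.running.remove(query)
        admitted = []
        while self.waiting and self._fits(self.waiting[0]):
            nxt = self.waiting.popleft()
            self.running.append(nxt)
            admitted.append(nxt.name)
        return admitted


pool = ResourcePool(max_concurrency=2, mem_limit_mb=1000)
a, b, c = Query("a", 600), Query("b", 500), Query("c", 300)

states = [pool.submit(a), pool.submit(b), pool.submit(c)]
print(states)            # ['running', 'queued', 'queued']

admitted = pool.finish(a)
print(admitted)          # ['b', 'c']
```

Query b waits because its estimate would push the pool over its memory limit even though a slot is free, which is how queuing staggers execution instead of letting everything run at once.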

CPU control: GaussDB (DWS) provides two CPU control methods for adjusting users' CPU usage: shared quotas and exclusive quotas. A shared quota assigns each resource pool a weight, expressed as a percentage, and does not restrict which CPU cores the pool may use. When a CPU is fully loaded, the resource pools running jobs on that CPU compete for its time slices in proportion to their weights, which lets higher-priority jobs run first. An exclusive quota allocates CPU cores to a resource pool by percentage, and complex jobs running in that pool can run only on the allocated cores. It suits scenarios where CPU resources are relatively plentiful and the business is sensitive to interference; in that case the resource pool's short query acceleration switch needs to be turned off. Each scheme has its own applicable scenarios, so the choice depends on the specific workload.
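The proportional behavior of shared quotas can be shown with a little arithmetic. The Python sketch below assumes three hypothetical resource pools and made-up weights. It computes the fraction of a saturated CPU each pool would receive if time slices are split strictly by weight among the pools that currently have work.

```python
def cpu_shares(weights: dict[str, int], active: set[str]) -> dict[str, float]:
    """Split a saturated CPU among the active pools in proportion to weight."""
    total = sum(weights[name] for name in active)
    if total == 0:
        return {name: 0.0 for name in weights}
    return {
        name: (weights[name] / total if name in active else 0.0)
        for name in weights
    }


# Hypothetical pools and shared-quota weights (percentages).
weights = {"report": 60, "etl": 30, "adhoc": 10}

all_busy = cpu_shares(weights, {"report", "etl", "adhoc"})
report_idle = cpu_shares(weights, {"etl", "adhoc"})

print(all_busy)     # {'report': 0.6, 'etl': 0.3, 'adhoc': 0.1}
print(report_idle)  # {'report': 0.0, 'etl': 0.75, 'adhoc': 0.25}
```

When the report pool is idle, its share is redistributed to the remaining pools in the same 3:1 ratio, which is why a shared quota acts as a weight rather than a hard cap.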

Memory control: GaussDB (DWS) provides two memory control methods. Users can set a memory ratio at the resource pool level according to business needs. To address the drawbacks of traditional memory management, GaussDB (DWS) has also designed and implemented memory self-adaptation, which removes the dependence on work_mem. The optimizer estimates the memory a query will use from statistics. While the executor runs the SQL, memory use beyond the estimate triggers a spill to disk. Resource management schedules and controls queries based on the memory estimated by the optimizer.
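To make the spill-to-disk idea concrete, here is a generic external merge sort in Python. It is a textbook technique used as a stand-in, not the GaussDB (DWS) executor: the row-count budget plays the role of the estimated memory, and the function name `sort_with_budget` and its parameters are hypothetical.

```python
import heapq
import pickle
import tempfile


def _spill(sorted_rows):
    """Write one sorted run to a temporary file and rewind it for reading."""
    run = tempfile.TemporaryFile()
    for row in sorted_rows:
        pickle.dump(row, run)
    run.seek(0)
    return run


def _read(run):
    """Yield rows back from a spilled run until the file is exhausted."""
    while True:
        try:
            yield pickle.load(run)
        except EOFError:
            return


def sort_with_budget(rows, budget_rows):
    """Sort rows while holding at most budget_rows in the in-memory buffer.

    Returns (sorted_rows, number_of_spilled_runs).
    """
    if budget_rows < 1:
        raise ValueError("budget_rows must be at least 1")
    runs, buffer = [], []
    for row in rows:
        if len(buffer) == budget_rows:
            # The next row would exceed the budget: spill the buffer to disk.
            runs.append(_spill(sorted(buffer)))
            buffer = []
        buffer.append(row)
    if not runs:
        return sorted(buffer), 0
    try:
        merged = list(heapq.merge(sorted(buffer), *(_read(r) for r in runs)))
    finally:
        for run in runs:
            run.close()
    return merged, len(runs)


in_memory = sort_with_budget([3, 1, 2], budget_rows=3)
spilled = sort_with_budget([5, 1, 4, 2, 3, 9, 7], budget_rows=3)
print(in_memory)  # ([1, 2, 3], 0)
print(spilled)    # ([1, 2, 3, 4, 5, 7, 9], 2)
```

The result is the same either way; only the path differs. Staying within the budget keeps everything in memory, while exceeding it trades speed for bounded memory use.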

Space control: multi-dimensional space control is supported, including user space control, schema space control, single-SQL space control and disk space control.

Network management and control: in a distributed environment, network quality has a crucial impact on query performance. GaussDB (DWS) therefore uses a DWRR algorithm that combines priority with query-weighted round-robin to control network traffic between nodes, and uses exception rules to restrict user queries so that network congestion does not affect the business. Network control allows network bandwidth to be configured sensibly, smooths data loading and avoids congestion. It can identify redundant, low-value data transfers and limit their bandwidth, while giving key queries higher priority for network resources. Network traffic control in the data warehouse helps to optimize data loading, improve query response and prevent failures.
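The article does not expand the acronym DWRR. Assuming it refers to the well-known deficit weighted round-robin scheduler, the weighting part can be sketched in Python as follows. The queue names, packet sizes and quanta are made up for the example, and the priority component mentioned in the text is not modeled here.

```python
from collections import deque


def dwrr(queues, quanta, max_rounds=1000):
    """Deficit weighted round-robin over packet queues.

    queues: dict mapping a queue name to a deque of packet sizes in bytes
    quanta: dict mapping a queue name to the credit added per round (its weight)
    Returns the send order as a list of (queue_name, packet_size).
    """
    deficit = {name: 0 for name in queues}
    order = []
    for _ in range(max_rounds):
        if not any(queues.values()):
            break  # every queue is drained
        for name, packets in queues.items():
            if not packets:
                deficit[name] = 0  # idle queues do not accumulate credit
                continue
            deficit[name] += quanta[name]
            # Send packets while the head of the queue fits in the credit.
            while packets and packets[0] <= deficit[name]:
                size = packets.popleft()
                deficit[name] -= size
                order.append((name, size))
            if not packets:
                deficit[name] = 0
    return order


# Hypothetical traffic: a key query and a low-value transfer, 300-byte packets.
queues = {
    "key_query": deque([300, 300, 300]),
    "bulk_copy": deque([300, 300, 300]),
}
quanta = {"key_query": 600, "bulk_copy": 300}

order = dwrr(queues, quanta)
print([name for name, _ in order])
# ['key_query', 'key_query', 'bulk_copy', 'key_query', 'bulk_copy', 'bulk_copy']
```

With twice the quantum, the key query sends two packets for each packet of the bulk transfer while both are backlogged, yet the bulk transfer is never starved.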

GaussDB (DWS) resource operation and maintenance tools

GaussDB (DWS) provides many operation and maintenance tools that make problems easier to locate and analyze. They cover the stages before, during and after a job runs. Before a job runs, explain performance can be used to analyze the execution plan. While a job is running, you can use pgxc_stat_activity to analyze active session information, pgxc_thread_wait_status to analyze thread wait information, and pgxc_wlm_session_statistics to analyze running jobs. After the job finishes, you can use pgxc_wlm_session_info to analyze the execution of historical statements, pgxc_respool_resource_history to analyze historical resource pool usage, and pgxc_wlm_user_resource_history to analyze resource usage per user.
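One way to keep these tools at hand is a small lookup from job stage to view name. The Python sketch below only builds query strings; it does not connect to a database. The view names come from the text above, while the stage labels, the helper function names and the use of `SELECT *` with a `LIMIT` are assumptions, since the article does not list the columns each view exposes.

```python
# View names are taken from the text; the grouping keys are illustrative.
VIEWS_BY_STAGE = {
    "during": [
        "pgxc_stat_activity",
        "pgxc_thread_wait_status",
        "pgxc_wlm_session_statistics",
    ],
    "after": [
        "pgxc_wlm_session_info",
        "pgxc_respool_resource_history",
        "pgxc_wlm_user_resource_history",
    ],
}


def diagnostic_queries(stage: str, limit: int = 20) -> list[str]:
    """Build one exploratory SELECT per view for the given stage."""
    if stage not in VIEWS_BY_STAGE:
        raise ValueError(f"unknown stage: {stage!r}")
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return [f"SELECT * FROM {view} LIMIT {limit};" for view in VIEWS_BY_STAGE[stage]]


def explain_performance(statement: str) -> str:
    """Prefix a statement so its execution plan can be inspected before a real run."""
    return f"EXPLAIN PERFORMANCE {statement.rstrip(';')};"


running = diagnostic_queries("during", limit=5)
print(running[0])   # SELECT * FROM pgxc_stat_activity LIMIT 5;
print(explain_performance("SELECT count(*) FROM sales;"))
# EXPLAIN PERFORMANCE SELECT count(*) FROM sales;
```

The table name `sales` is a placeholder. In practice you would narrow each query to the columns and filters relevant to the problem being investigated.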

These operation and maintenance views provide an access interface to the database's internal data, exposing information such as execution plans, session information and performance statistics. You do not need to understand the complex internal storage structure; you can query a view directly to get the information you need. The query cost is low because there is no need to scan an entire table every time. The views also encapsulate the metadata inside the database and so protect it. Results can be filtered as needed, which makes the views easier for operation and maintenance personnel to use, and joining views gives a combined, cross-domain picture for analysis. Some views include additional statistics that provide a basis for optimization and diagnosis. Overall, the views make daily monitoring, performance tuning and fault location considerably easier and are an important tool for database operation and maintenance.

In general, resource management and control can improve resource utilization, ensure service quality, smooth peak pressure, isolate tenants from one another, optimize resource allocation, reduce operation and maintenance workload, and improve user experience. In particular:

  • Monitoring and limiting the resources used by non-core business improves overall resource utilization;
  • Key business and important queries get guaranteed resources, so that a resource shortage does not affect service quality;
  • For periodic peaks in data loading and user queries, resources are allocated to smooth peak pressure;
  • Resource usage is isolated between different user groups or tenants, which prevents them from interfering with one another;
  • Resource monitoring reveals potential problems early, so they can be handled quickly;
  • Resource allocation can be adjusted to the actual situation, which makes resource planning more reasonable;
  • Less manual effort is needed for database tuning and problem location, which reduces the operation and maintenance workload;
  • Users directly notice the improvement in system stability that resource management and control brings.

To sum up, a good resource management and control mechanism can greatly reduce the labor cost of managing data warehouses, and also improve users' trust in data warehouse services.

Interested developers are welcome to watch the live replay for details. For more technical analysis of GaussDB (DWS) and introductions to new data warehouse features, follow the GaussDB (DWS) forum, where technical blog posts and live broadcast schedules are posted as soon as they are available.

Forum Link: HUAWEI CLOUD Forum_Cloud Computing Forum_Developer Forum_Technology Forum-HUAWEI CLOUD

Live playback link: Data Warehouse experts teach you resource management and operation and maintenance in practice_DTT_Live_Cloud Community_HUAWEI CLOUD

Origin blog.csdn.net/devcloud/article/details/132696872