Greenplum resource management (based on Linux cgroups)

   

Table of Contents

CPU and memory resources

Isolation of system memory resources by resource groups


      All systems that share resources face the same fundamental challenge: resource management. When statements from multiple users run in the database at the same time, the database must allocate resources such as CPU and memory among them so that every statement can run smoothly. Resource management has to strike a delicate balance between fairness, determinism, and maximum resource utilization, while giving users configuration options to adapt to different usage scenarios. Fairness and determinism mean that a user or a statement is guaranteed a share of resources that cannot be taken away at any time; the statement effectively reserves a minimum set of available resources, so no matter whether it starts early or late, runs fast or slow, it can obtain at least that share. Maximum resource utilization means that when the system is relatively idle, a statement can make full use of the idle resources to finish execution quickly.

CPU and memory resources

CPU resources can be scheduled according to time slices.

Relatively speaking, adjusting the CPU share of a process can be done quickly. Memory is different: once allocated and in use, releasing it either takes a long time or is too complicated to implement, so memory is hard to adjust dynamically. Greenplum 5 therefore implements CPU resource management on top of Linux cgroups, and memory resource management that is fully tracked, accounted, and allocated by Greenplum itself. This new resource management mechanism is called Resource Group.

Resource groups weigh the different implementation options carefully, managing CPU and memory fairly and efficiently according to the distinct characteristics of each resource.

Cgroups are a resource-control feature of the Linux kernel. Users can create different cgroups and limit their CPU usage in configuration files; when processes are attached to a cgroup, the CPU resources those processes can use are capped at (or near) the configured value, ensuring CPU isolation between groups.
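The mechanism can be sketched with cgroup v1 commands (paths and values here are illustrative, not from the original text; writing under /sys/fs/cgroup requires root):

```shell
# cgroup v1 sketch (illustrative): create a group with a relative CPU weight
mkdir /sys/fs/cgroup/cpu/demo
echo 512 > /sys/fs/cgroup/cpu/demo/cpu.shares   # weight relative to sibling groups
echo $$  > /sys/fs/cgroup/cpu/demo/cgroup.procs # attach the current shell
```

Note that cpu.shares only constrains a group when CPUs are contended; an otherwise idle system lets the group use spare CPU, which is exactly the elastic behavior described next.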

Cgroups also allow a group of processes to exceed their CPU quota, as long as other groups have low CPU usage or the system as a whole has spare CPU. This elasticity is ideal for database workloads, and resource groups take full advantage of it. For example, users can create two resource groups through the following SQL statements:
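The SQL did not survive extraction; a plausible reconstruction, consistent with the limits quoted later in the text (rg1: 50% CPU, 30% memory; rg2: 20% CPU, 20% memory; the role names tom1/tom2 come from the text, but the exact attribute values are assumptions), might be:

```sql
CREATE RESOURCE GROUP rg1 WITH (cpu_rate_limit=50, memory_limit=30);
CREATE RESOURCE GROUP rg2 WITH (cpu_rate_limit=20, memory_limit=20);

-- Bind each user to a group; all of that user's queries then run under it
ALTER ROLE tom1 RESOURCE GROUP rg1;
ALTER ROLE tom2 RESOURCE GROUP rg2;
```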

With this configuration, user tom1 is associated with resource group rg1, and all of tom1's queries run in rg1; likewise for tom2 and rg2. When the system is relatively idle, for example when no statements are running in rg2 or other groups, tom1 can actually use more CPU than the 50% limit defined for rg1, perhaps 60%~70%. If the whole system is busy and there are enough statements executing in rg2 and the other groups, the statements in rg1 immediately shrink their CPU usage back down to the configured 50% limit. This neatly resolves the tension between maximum resource utilization and fairness. The effect can be observed more intuitively in the figure.

This CPU behavior of resource groups handles all kinds of workloads well, especially short queries. "Short" is relative to "long": a long query is a complex query that may run for minutes or even hours and consume a large share, or even all, of the system's resources, while a short query completes in milliseconds or seconds, touches little data, and needs only a small amount of CPU and memory. In a data product that supports mixed workloads, long and short queries appear in the system unpredictably. A common pain point is that a running long query occupies so much CPU or memory that short queries cannot obtain even a small share of resources, which shows up as slow execution and longer response times. Resource groups in Greenplum avoid this problem.

For the resource groups rg1 and rg2 above, suppose short queries run in rg2 and a long query runs in rg1. When no short query is running, the long query can use extra CPU resources;

Once a short query arrives in rg2, the long query immediately gives up the excess CPU, and the short query immediately receives the 20% of CPU it is entitled to. In our tests, this adjustment completes within milliseconds.

The figure below summarizes the test data. The execution time of the short query is almost the same whether or not a long query is running. If the user sets the CPU share of rg2, the short-query group, to a very large value (for example, 60% or higher), other groups can still share that CPU while no short query is running; when a short query arrives, it gets most of the CPU and finishes quickly, yielding behavior that is nicely biased toward short queries.

The cpuset feature added in Greenplum 5.9 can guarantee resources for short queries even better. When a user creates a resource group rg3 like the following, all statements running in rg3 are scheduled onto CPU core 1, rg3 owns core 1 exclusively, and statements in other groups can only be dispatched to other cores. When a short query arrives in rg3, it runs on core 1 immediately, without waiting for the scheduler to preempt the processes of a big query. One customer who tested cpuset found that it significantly reduced the response time of short queries.
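The CREATE statement for rg3 is missing from the extracted text; based on the description (exclusive use of CPU core 1), it would look roughly like the following, where the memory_limit value is an assumption:

```sql
-- cpuset pins the group's statements to the listed cores (Greenplum 5.9+)
CREATE RESOURCE GROUP rg3 WITH (cpuset='1', memory_limit=10);
```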

Isolation of system memory resources by resource groups

For rg1 and rg2, they use 30% and 20% of system memory, respectively. If the total memory used by the statements of a group exceeds its quota, the group starts drawing on the system's global shared memory. If the global shared memory is also exhausted, the statement fails, which protects the safety of the whole cluster. The global shared memory is whatever remains unallocated after all resource groups are created, and it provides a degree of flexibility for mixed workloads. If the user can predict the memory requirements of each group well, all of the system memory can be divided among the groups, so that each group is strictly limited to its own quota. If the user's business and the load of each group are less certain, or huge queries that need an especially large amount of memory appear from time to time, part of the memory can be left unallocated and shared among all groups: a statement with unpredictable but potentially large memory needs that has used up its group's quota can draw on this shared pool to continue executing. For memory allocation among statements within a group, resource groups likewise weigh several factors. When creating a resource group, you can use

  • memory_shared_quota
  • memory_spill_ratio

to fine-tune memory usage within the group, for example:
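The example was lost in extraction; a reconstruction of the rg_sample group consistent with the values discussed below (concurrency 8, memory_shared_quota 50, memory_spill_ratio 30; cpu_rate_limit and memory_limit are assumptions) might be:

```sql
CREATE RESOURCE GROUP rg_sample WITH (
    concurrency=8,          -- 8 fixed-memory transaction slots
    cpu_rate_limit=20,
    memory_limit=20,
    memory_shared_quota=50, -- 50% of the group's memory is shared in-group
    memory_spill_ratio=30   -- initial working memory before spilling
);
```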

memory_shared_quota divides the memory in a resource group into two parts:

  1. A portion shared by all statements in the group
  2. A fixed portion reserved for each statement

  • The left half depicts the proportion of memory available to the 4 segments on one node. The blank portion, equal to 1 - gp_resource_group_memory_limit, is reserved for the operating system and Greenplum background processes. The gray portions sum to gp_resource_group_memory_limit and are divided evenly among the segments.
  • The right half depicts the memory allocation of the different resource groups on one segment. Every segment has the built-in admin_group and default_group. For the user-created rg_sample group, the relationship between its shared memory and the memory reserved for each concurrent transaction is shown in gray: because the group's concurrency is 8, there are 8 reserved fixed-memory slots, and because its memory_shared_quota is 50, 50% of its total memory is shared memory while the other 50% is reserved fixed memory divided equally among the 8 transactions.

Each running statement exclusively occupies one slot of fixed memory, as shown by Transaction slot #1. While the statement executes, it uses this fixed memory first, then the shared portion within the group, and finally the global shared portion. This multi-level resource configuration strikes a good balance between resource isolation and sharing, meeting the memory requirements of different usage scenarios.

Another resource group attribute, memory_spill_ratio, affects the memory usage of a single statement. For the rg_sample group this attribute is set to 30, meaning that when a statement starts executing, 30% of the memory available to it is allocated as its initial working memory. When the statement's execution plan is generated, this memory is distributed among the plan's operators (such as table scans or sorts): ordinary operators get only 100KB by default, while memory-hungry operators such as hash joins and sorts split the remainder evenly. If a hash join or sort exceeds the memory allocated to it, the operator starts writing intermediate results to external files to relieve memory pressure; this process is called spilling. Memory usage of such operators typically keeps rising during execution, which is why memory_spill_ratio is set to a value like 30 and made adjustable rather than fixed at 100%. memory_spill_ratio is also a GUC (Grand Unified Configuration, PostgreSQL's term for a server configuration parameter, settable on the master, on segments, or on all nodes) that can be set within a session; once set, it applies to subsequent statements in the same session, letting users allow particular statements to use more or less memory.
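For example, a session that is about to run a memory-hungry sort might raise the ratio before executing (a sketch; 60 is an arbitrary illustrative value):

```sql
-- Applies to subsequent statements in this session only
SET memory_spill_ratio = 60;
```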

As can be seen from Figure 7-4, resource groups use a variety of mechanisms to achieve fine-grained memory management across nodes, segments, user groups, users, transactions, and the operators within a statement, and fine-grained CPU management across nodes, segments, user groups, and transactions. These designs weigh efficiency of resource use against isolation of resource use, account for both deterministic and unpredictable resource demands, support transactional short queries and analytical long queries equally well, and satisfy resource usage and constraint requirements under different workloads.

Disk and network I/O are also shared resources of general concern. In Greenplum, disk and network access are driven synchronously by the execution engine, so when CPU is apportioned among groups, disk and network I/O are in theory constrained in roughly the same proportions. Of course, modern machines have many cores, and when processes from different groups run in parallel on them, the I/O they consume may not track their CPU allocation exactly. The Greenplum community continues to track user feedback on this, and future versions may support managing these two resources directly.


Origin: blog.csdn.net/MyySophia/article/details/113796280