Flexible, available, and highly scalable, EasyMR brings new Yarn queue management functions and visual configuration

YARN (Yet Another Resource Negotiator) is the resource scheduler of the Hadoop ecosystem, used mainly for resource management and job scheduling. YARN has a built-in queue management function: by configuring and managing YARN resource queues, administrators allocate cluster resources to meet the needs of different applications and users. The introduction of YARN has brought huge benefits to the cluster in terms of utilization, unified resource management, and data sharing.

In a big data environment, enterprises often have multiple applications running simultaneously, and these applications may have different resource requirements and priorities. In order to reasonably allocate and manage resources and avoid resource competition and conflicts, resources need to be divided and scheduled.

This article introduces common resource division and queue management methods, as well as EasyMR's newly launched YARN queue management function, and shows how managing queues through a visual interface gives users a more efficient and convenient experience.

Resource division method

In the field of big data, common resource division methods usually include the following:

Categorize by application type or characteristics

For example, you can place CPU-intensive applications in one queue and memory-intensive applications in another queue. In this way, you can ensure that different types of applications get the resources they need and avoid resource waste and imbalance.

Classify applications by priority

Important or urgent tasks can be assigned higher resource quotas and priorities so that they receive a timely response and are processed first. Minor or low-priority tasks can be given lower resource quotas so that they do not slow down the more important ones.

Categorize by department or team needs

Different departments may have different requirements for resources. By assigning independent resource queues to different departments, you can ensure that each department can manage and allocate its own resources independently without interfering or affecting each other.

Although YARN itself has a queue management function, in actual use, YARN can only manage resource queues through configuration files, which is relatively cumbersome and requires certain technical knowledge.
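To give a sense of what file-based configuration involves, here is a minimal sketch that prints the kind of key/value properties an administrator would otherwise write by hand. It assumes the Capacity Scheduler and its `yarn.scheduler.capacity.*` property naming; the queue names, the percentages, and the helper `capacity_scheduler_properties` are hypothetical and only serve as illustration.

```python
def capacity_scheduler_properties(parent, queues):
    """Flatten {queue_name: (capacity, max_capacity)} into Capacity Scheduler properties.

    Assumption: the target is the Capacity Scheduler, whose settings live in
    capacity-scheduler.xml as name/value pairs.
    """
    prefix = f"yarn.scheduler.capacity.{parent}"
    # The parent queue lists its direct children as a comma-separated string.
    props = {f"{prefix}.queues": ",".join(queues)}
    for name, (capacity, max_capacity) in queues.items():
        props[f"{prefix}.{name}.capacity"] = str(capacity)
        props[f"{prefix}.{name}.maximum-capacity"] = str(max_capacity)
    return props


# Hypothetical queue names and percentages, for illustration only.
props = capacity_scheduler_properties("root", {"etl": (60, 80), "adhoc": (40, 100)})
for key, value in props.items():
    print(f"{key} = {value}")
```

Even two queues already require five properties, and every change means editing the file and refreshing the scheduler, which is the overhead a visual interface is meant to remove.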


CDH & HDP

The industry's preferred basic open source data platforms are CDH and HDP based on Hadoop distributed technology.

CDH (Cloudera Manager)

● Fair Share Strategy

CDH's Cloudera Manager adopts the Fair Share strategy. The weight and priority of each user or organization must be determined in advance, which requires administrators to have a good understanding of how the system is used. If these settings are unreasonable, some users or organizations may go without enough resources to run their tasks for a long time.

● Impact on scheduling efficiency

When multiple tasks or jobs are submitted at the same time, the Fair Share algorithm requires complex calculations, resulting in a decrease in scheduling efficiency.
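The sensitivity to weights can be illustrated with a deliberately simplified calculation. The sketch below splits a cluster in proportion to configured weights; it is not the scheduler's actual algorithm (a real scheduler takes more factors into account), and the function name, queue names, and weights are hypothetical.

```python
def weighted_fair_shares(total, weights):
    """Split `total` resources in proportion to each queue's weight (simplified model)."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return {name: total * weight / total_weight for name, weight in weights.items()}


# Hypothetical weights: a balanced setup and a skewed one.
balanced = weighted_fair_shares(100, {"analytics": 3, "reporting": 1, "adhoc": 1})
skewed = weighted_fair_shares(100, {"analytics": 8, "reporting": 1, "adhoc": 1})

print(balanced)
print(skewed)
```

In the skewed case the two small queues are left with a tenth of the cluster each, which is how a badly chosen weight can starve some users for a long time.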

HDP (Ambari)

● Managing complexity

Ambari uses visual drag and drop to adjust resources, which is easy to operate. However, the Yarn resource queue must ensure that the sum of queue resources at the same level is equal to 100%. Therefore, when adjusting a single queue resource, other queue resources must be adjusted to ensure that the sum of queue resources is equal to 100%, which results in high management complexity.

● Resource balancing

Because the sum of queue resources at the same level must equal 100%, creating or deleting a queue also requires adjusting the other queues so that the total remains 100%.
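The bookkeeping behind the 100% rule can be sketched as follows. The example adds a queue and shrinks its siblings proportionally so that the level still sums to 100%. Proportional shrinking is only one possible policy and is an assumption here, not a description of what Ambari does; the function and queue names are hypothetical.

```python
def add_queue_proportionally(capacities, new_name, new_capacity):
    """Add a queue and scale down its siblings so the level still sums to 100%.

    Assumption: siblings shrink in proportion to their current capacity.
    Values are rounded to two decimals, so awkward inputs may need a manual fix-up.
    """
    if new_name in capacities:
        raise ValueError(f"queue {new_name!r} already exists")
    if not 0 < new_capacity < 100:
        raise ValueError("new capacity must be between 0 and 100 (exclusive)")
    scale = (100 - new_capacity) / 100
    updated = {name: round(cap * scale, 2) for name, cap in capacities.items()}
    updated[new_name] = new_capacity
    return updated


# Hypothetical queues at one level of the hierarchy.
level = {"etl": 50, "adhoc": 30, "ml": 20}
updated = add_queue_proportionally(level, "reporting", 20)
print(updated)
print(sum(updated.values()))
```

Adding one queue changes the capacity of all three existing queues, which is the management overhead described above.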

EasyMR’s Yarn resource queue management function

Given these advantages and disadvantages, EasyMR has launched a visual Yarn queue management function. It aims to improve the queue management experience with a more intuitive and detailed display of information and a simple, clear interface for managing queue resources, improving flexibility, availability, and scalability.


EasyMR’s Yarn resource queue management features

● Capacity strategy

Based on the maximum and minimum resource capacity policies, the resource usage of the queue is limited. Users or departments can create their own exclusive resource queues according to their own business needs.

● Queue independence

When adjusting the size of a queue, or creating or deleting a queue, there is no need to adjust the size of other queues. You only need to ensure that the resources of all sub-queues under the same parent queue add up to no more than 100%.

● User integration

Supports integration with LDAP and OAuth user systems. By binding users and user groups to the leaf queues of the Yarn resource queue, access control and resource allocation based on users and user groups are implemented to ensure resource security.
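The three features above can be modeled in a few lines. This is a minimal sketch, not EasyMR's implementation: the `LeafQueue` class, its field names, and the helper functions are hypothetical, and it assumes that the "at most 100%" rule applies to the minimum (guaranteed) capacity of sibling queues.

```python
from dataclasses import dataclass, field


@dataclass
class LeafQueue:
    name: str
    min_capacity: float  # guaranteed share of the parent queue, in percent
    max_capacity: float  # upper limit the queue may grow to, in percent
    users: set = field(default_factory=set)   # users bound to this queue
    groups: set = field(default_factory=set)  # user groups bound to this queue


def validate_siblings(queues):
    """Check min <= max per queue and that guaranteed shares total at most 100%."""
    for q in queues:
        if not 0 <= q.min_capacity <= q.max_capacity <= 100:
            raise ValueError(f"invalid capacity range for queue {q.name!r}")
    total = sum(q.min_capacity for q in queues)
    if total > 100:
        raise ValueError(f"sibling queues request {total}% of the parent queue")
    return total


def can_submit(queue, user, user_groups):
    """A user may submit if bound directly or through one of their groups."""
    return user in queue.users or bool(queue.groups & set(user_groups))


# Hypothetical queues, users, and groups.
queues = [
    LeafQueue("etl", 40, 60, users={"alice"}),
    LeafQueue("adhoc", 30, 100, groups={"analysts"}),
]
total_before = validate_siblings(queues)

# Adding a queue does not require touching the existing ones.
queues.append(LeafQueue("ml", 20, 50))
total_after = validate_siblings(queues)

print(total_before, total_after)
print(can_submit(queues[1], "bob", ["analysts"]))
```

Because the check is "at most 100%" rather than "exactly 100%", the new `ml` queue fits into the unallocated share and the capacities of `etl` and `adhoc` stay as they were.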

Leaf queue: a queue that cannot be divided into sub-queues. It is used directly to allocate resources to applications. Applications can be submitted to a leaf queue directly or placed in the default queue for scheduling.

Non-leaf queue: a queue that can be divided into sub-queues to further partition and manage resources. It does not accept application or task submissions. For example, you can place CPU-intensive applications and memory-intensive applications in separate sub-queues and assign them different resource quotas and priorities.

Parent queue: usually a non-leaf queue. It contains multiple sub-queues and controls their resource allocation and priority. For example, a parent queue can contain sub-queues such as "memory" and "cpu". By setting different resource quotas and priorities for different sub-queues, the resources in the cluster can be better managed.

Sub-queues: parts of a parent queue. They inherit all properties of the parent queue and have their own resource quotas, priorities, and other properties. A sub-queue that runs applications is a leaf queue and cannot be divided into further sub-queues.
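The relationship between these four terms can be sketched as a small tree. The `Queue` class below is hypothetical and only models the rule that leaf queues accept applications while non-leaf queues do not; the queue names follow the "cpu" and "memory" example above.

```python
class Queue:
    """A node in a hypothetical queue hierarchy."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    @property
    def is_leaf(self):
        # A leaf queue is one that has no sub-queues.
        return not self.children

    @property
    def path(self):
        # Dotted path from the top-level queue down to this queue.
        return self.name if self.parent is None else f"{self.parent.path}.{self.name}"

    def submit(self, app):
        if not self.is_leaf:
            raise ValueError(f"{self.path} is a non-leaf queue and cannot accept applications")
        return f"{app} -> {self.path}"


root = Queue("root")
compute = Queue("compute", parent=root)   # parent (non-leaf) queue
cpu = Queue("cpu", parent=compute)        # leaf sub-queue
memory = Queue("memory", parent=compute)  # leaf sub-queue

print(cpu.submit("wordcount"))
```

Here `compute` is both a sub-queue of `root` and the parent of `cpu` and `memory`; submitting to `compute` directly raises an error because it is not a leaf.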

How EasyMR creates a Yarn resource queue is described in detail in the previous article, "How the big data computing engine EasyMR manages the Yarn resource queue simply and efficiently".

In the future, EasyMR will continue to optimize Yarn resource queue management, improve security auditing and monitoring of resource queues, and formulate better resource allocation strategies by matching resource queues to business requirements, to better meet the resource management and scheduling needs of enterprises in big data environments.

"Dtstack Product White Paper": https://www.dtstack.com/resources/1004?src=szsm

"Data Governance Industry Practice White Paper" download address: https://www.dtstack.com/resources/1001?src=szsm

To learn more about Kangaroo Cloud big data products, industry solutions, and customer cases, visit the Kangaroo Cloud official website: https://www.dtstack.com/?src=szkyzg

Readers who are interested in big data open source projects are also welcome to join the "Kangaroo Cloud Open Source Framework DingTalk Technology Group" to exchange the latest open source technology information. Group number: 30537511, project address: https://github.com/DTStack


Origin my.oschina.net/u/3869098/blog/10123127