Apache Mesos (4): Resource scheduling, allocation, and isolation mechanisms of Mesos

Source:  https://andyyoung01.github.io/  or  http://andyyoung01.16mb.com/

In the previous article, we gained a basic understanding of the Mesos architecture and installed and configured a high-availability cluster with three master nodes. We have not yet deployed any frameworks on the cluster to run actual tasks; frameworks such as Marathon, Chronos, and Aurora will be explored in detail in later articles. For now, let's take a closer look at how Mesos schedules, allocates, and isolates resources.

Mesos resource scheduling mechanism

Mesos implements a two-level scheduling system. Each Mesos slave reports its available resources to the master, and the master then sends resource offers to a framework's scheduler through a pluggable allocation module. The scheduler can accept the whole offer, accept part of it, or decline it. The following example walks through the process:

  1. The Mesos slave advertises its resources to the master: 8 CPUs, 16 GB of RAM, and 64 GB of free disk space. An asterisk (*) indicates that these resources belong to the default role. We'll get to roles later.
  2. The Mesos allocation module decides that the master should send a resource offer to Framework A's scheduler.
  3. Framework A's scheduler accepts half of the offered resources, leaving 4 CPUs, 8 GB of memory, and 32 GB of free disk space for other frameworks.
  4. The Mesos allocation module decides that the master should offer all remaining unallocated resources to Framework B's scheduler.

This process repeats continuously, because slaves free up resources each time a task completes.
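The offer cycle described above can be sketched as a small simulation. This is a toy model, not the Mesos API: the `Resources` class, `run_offer_cycle`, and the two accept functions are hypothetical names, the offer order is fixed here rather than chosen by an allocation module, and Framework B is assumed to accept everything it is offered.

```python
# Toy model of one Mesos offer cycle; none of these names come from Mesos itself.
from dataclasses import dataclass


@dataclass
class Resources:
    cpus: float
    mem_gb: float
    disk_gb: float

    def minus(self, other):
        """Return what is left after `other` has been taken out."""
        return Resources(self.cpus - other.cpus,
                         self.mem_gb - other.mem_gb,
                         self.disk_gb - other.disk_gb)


def run_offer_cycle(advertised, schedulers):
    """Offer the remaining resources to each scheduler in turn.

    `schedulers` is a list of (name, accept_fn) pairs. Each accept_fn takes
    an offer and returns the part of it the framework wants to use.
    """
    remaining = advertised
    accepted = {}
    for name, accept_fn in schedulers:
        taken = accept_fn(remaining)
        accepted[name] = taken
        remaining = remaining.minus(taken)
    return accepted, remaining


def take_half(offer):
    # Framework A accepts half of whatever it is offered.
    return Resources(offer.cpus / 2, offer.mem_gb / 2, offer.disk_gb / 2)


def take_all(offer):
    # Assumption: Framework B accepts the entire remaining offer.
    return offer


slave = Resources(cpus=8, mem_gb=16, disk_gb=64)
accepted, remaining = run_offer_cycle(
    slave, [("framework-a", take_half), ("framework-b", take_all)])
print(accepted["framework-a"])
print(accepted["framework-b"])
print(remaining)
```

In a real cluster the scheduler could also decline the offer, in which case the resources would go back to the allocation module and be offered elsewhere.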

Since most programs need a certain amount of CPU, memory, disk, and network ports, Mesos predefines these resources.

  • Default Resources - By default, resource advertisements from Mesos slaves include:
    • cpus
    • mem
    • disk
    • ports

When mesos-slave starts, it derives these values from the resources available on the system.

For more information about resource advertisements and the Mesos architecture, refer to the official documentation: http://mesos.apache.org/documentation/latest/architecture/

While the slave continuously advertises its available resources to the master, another part of Mesos, the pluggable allocation module, is responsible for deciding which framework should get a given resource offer.

Mesos resource allocation mechanism

The allocation module of the Mesos master decides which framework is offered which resources. Because the module is pluggable, system engineers can implement their own allocation policies and algorithms to suit their organization's needs. The built-in allocation module uses the Dominant Resource Fairness (DRF) algorithm, which meets the needs of most Mesos users.
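To give a feel for how DRF behaves, here is a minimal sketch of the core idea: a framework's dominant share is the largest fraction it holds of any single resource, and the next task goes to the framework with the smallest dominant share. The function names and the example numbers are assumptions made for illustration; the allocator built into Mesos is considerably more involved (roles, weights, and offers rather than direct task placement).

```python
# Simplified DRF sketch; not the allocator shipped with Mesos.

def dominant_share(allocated, total):
    """Largest fraction of any single resource that a framework holds."""
    return max(allocated[name] / total[name] for name in total)


def drf_allocate(total, demands):
    """Launch tasks one at a time, always for the framework with the
    lowest dominant share whose next task still fits.

    `demands` maps a framework name to its per-task resource needs, which
    must be positive for at least one resource so the loop terminates.
    """
    allocated = {fw: {name: 0.0 for name in total} for fw in demands}
    used = {name: 0.0 for name in total}
    launched = {fw: 0 for fw in demands}
    while True:
        fitting = [fw for fw, need in demands.items()
                   if all(used[name] + need[name] <= total[name]
                          for name in total)]
        if not fitting:
            return launched
        chosen = min(fitting,
                     key=lambda fw: dominant_share(allocated[fw], total))
        for name in total:
            allocated[chosen][name] += demands[chosen][name]
            used[name] += demands[chosen][name]
        launched[chosen] += 1


# Hypothetical cluster: 9 CPUs and 18 GB of memory in total.
total = {"cpus": 9, "mem_gb": 18}
demands = {
    "A": {"cpus": 1, "mem_gb": 4},  # memory-heavy tasks
    "B": {"cpus": 3, "mem_gb": 1},  # CPU-heavy tasks
}
tasks = drf_allocate(total, demands)
print(tasks)  # {'A': 3, 'B': 2}
```

With these numbers both frameworks end up with a dominant share of two thirds: A holds 12 of 18 GB of memory and B holds 6 of 9 CPUs.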

For more information on the default allocation module and algorithm, refer to http://mesos.apache.org/documentation/latest/allocation-module/

Mesos also provides several built-in ways to adjust resource allocation without replacing or rewriting the entire allocation module: roles, weights, and resource reservations.

Roles

In a Mesos cluster, roles let you organize frameworks and resources into arbitrary groups.

To use roles, you need to add the --roles configuration option when starting the master. For example, the following configuration allows Frameworks to register with three roles commonly found in data centers:

--roles="dev,stage,prod"

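Afterwards, when a framework registers with the master, it can specify any one of these roles. This allows many teams or many environments to share one large Mesos cluster instead of creating several smaller ones. You can also use roles to ensure that a particular type of workload runs only on a particular set of machines, for example load balancers or reverse proxies that run only on dedicated edge nodes.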

Weights

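A cluster can also assign a weight to each role, so that different roles get different priority when resources are allocated. When Mesos decides which framework to offer resources to first, it picks the framework that is furthest below its weighted fair share. For example, consider the following configuration: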

--weights="dev=10,stage=20,prod=30"

Frameworks with the prod role will receive three times as many resources as frameworks with the dev role.
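The arithmetic behind weights can be sketched as follows. This is an illustration under simple assumptions, not Mesos code: `parse_weights`, `target_fractions`, and `next_role` are hypothetical helpers, and "furthest below its weighted fair share" is modeled as the smallest ratio of current share to weight.

```python
# Illustrative weight arithmetic; the helper names are not part of Mesos.

def parse_weights(flag_value):
    """Turn 'dev=10,stage=20,prod=30' into {'dev': 10.0, ...}."""
    weights = {}
    for item in flag_value.split(","):
        role, value = item.split("=")
        weights[role.strip()] = float(value)
    return weights


def target_fractions(weights):
    """Fraction of the cluster each role would get if all were saturated."""
    total = sum(weights.values())
    return {role: weight / total for role, weight in weights.items()}


def next_role(current_share, weights):
    """Pick the role whose share is smallest relative to its weight."""
    return min(weights,
               key=lambda role: current_share.get(role, 0.0) / weights[role])


weights = parse_weights("dev=10,stage=20,prod=30")
fractions = target_fractions(weights)
print(fractions)

# Hypothetical current shares of the cluster held by each role.
chosen = next_role({"dev": 0.2, "stage": 0.3, "prod": 0.3}, weights)
print(chosen)
```

Here prod is chosen next: it holds 30% of the cluster, but its weight entitles it to half, so it is the furthest below its share.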

Resource reservations

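While weights ensure that one role receives more resources than others, Mesos also provides resource reservations. A reservation guarantees that a role can always obtain a certain amount of a slave's resources. The trade-off is that it may lower overall cluster utilization, since reserved resources sit idle whenever that role is not using them.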

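Suppose a machine has 16 CPUs, 32 GB of memory, and 128 GB of disk, and you want to ensure that half of its resources are always available to frameworks registered with the prod role. You can configure the Mesos slave on that machine like this: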

--resources="cpus(prod):8; mem(prod):16384; disk(prod):65536"

Manually configuring Mesos slave resources and attributes

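In the previous sections we saw that resources can be assigned to particular roles on a slave and that reservations can be configured. But what if you want to create custom resources or override the default values of a slave's resources? Like most things in Mesos, the contents of the resource advertisement are configurable. This is useful when you want to introduce new resource types, or when you want to hard-code values such as CPUs and memory in the advertisement.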

Manually configuring Mesos slave resources

Mesos provides three data types for specifying resources: scalars, ranges, and sets.

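  • Scalar: the cpus resource is given the value 8; the mem resource is given the value 16384.
  • Range: the ports resource is given the values from 10000 to 20000.
  • Set: the disk resource is given the values ssd1, ssd2, and ssd3.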

Slave resources can be specified in much the same way as resource reservations:

--resources="cpus(*):4; mem(*):8192; disk(*):32768; ports(*):[40000-50000]; cpus(prod):8; mem(prod):16384; disk(prod):65536"
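To make the three data types concrete, the sketch below parses a resource string of this shape into scalars, ranges, and sets. It is a simplified reader written for illustration, not the parser Mesos uses: `parse_resources` and `parse_value` are hypothetical names, `ssds` is a made-up custom resource, and the bracket and brace notation for ranges and sets is assumed to be the only input form.

```python
# Simplified reader for a --resources style string; not the Mesos parser.
import re

# name(role):value, where the "(role)" part is optional.
ITEM = re.compile(r"^(?P<name>[\w/]+)(?:\((?P<role>[^)]*)\))?:(?P<value>.+)$")


def parse_value(text):
    """Classify a value as a range, a set, or a scalar."""
    text = text.strip()
    if text.startswith("[") and text.endswith("]"):
        ranges = []
        for part in text[1:-1].split(","):
            low, high = part.split("-")
            ranges.append((int(low), int(high)))
        return "range", ranges
    if text.startswith("{") and text.endswith("}"):
        return "set", {value.strip() for value in text[1:-1].split(",")}
    return "scalar", float(text)


def parse_resources(flag_value):
    """Return a list of (name, role, kind, value) tuples."""
    parsed = []
    for item in flag_value.split(";"):
        item = item.strip()
        if not item:
            continue
        match = ITEM.match(item)
        if match is None:
            raise ValueError(f"cannot parse resource: {item!r}")
        role = match.group("role") or "*"  # no role means the default role
        kind, value = parse_value(match.group("value"))
        parsed.append((match.group("name"), role, kind, value))
    return parsed


# "ssds" is a hypothetical custom resource used only for this example.
resources = parse_resources(
    "cpus(*):4; mem(*):8192; ports(*):[40000-50000]; "
    "cpus(prod):8; ssds(*):{ssd1,ssd2,ssd3}")
for entry in resources:
    print(entry)
```

Keeping the role next to each entry makes it easy to see which part of a machine is reserved (here, 8 CPUs for prod) and which part remains in the default role.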

Manually configuring Mesos slave attributes

Attributes are arbitrary key/value pairs that give the master or frameworks additional information about a machine. They can be set like this:

--attributes="datacenter:pdx1; rack:1-1; os:rhel7; pythons:python2,python3"
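One way a framework might use such attributes is to decide whether an offer comes from a suitable machine. The sketch below is a hedged illustration: `parse_attributes` and `matches` are hypothetical helpers, and every value is treated as plain text, which is a simplification of how Mesos types attribute values.

```python
# Illustrative attribute matching; the helper names are not part of Mesos.

def parse_attributes(flag_value):
    """Turn 'key:value; key:value' into a dict of text values."""
    attributes = {}
    for item in flag_value.split(";"):
        item = item.strip()
        if not item:
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = item.partition(":")
        attributes[key.strip()] = value.strip()
    return attributes


def matches(attributes, required):
    """True if every required key is present with the required value."""
    return all(attributes.get(key) == value
               for key, value in required.items())


attrs = parse_attributes(
    "datacenter:pdx1; rack:1-1; os:rhel7; pythons:python2,python3")
print(attrs)
print(matches(attrs, {"datacenter": "pdx1", "os": "rhel7"}))
print(matches(attrs, {"os": "ubuntu"}))
```

A scheduler could run a check like `matches` against each incoming offer and decline the ones from machines that do not meet its requirements.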

For more information on configuring Mesos slave resources and attributes, refer to http://mesos.apache.org/documentation/latest/attributes-resources

Mesos resource isolation mechanism

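Containers are an ideal way to squeeze more efficiency out of your infrastructure. They are more lightweight than virtual machines and let you run programs and code in an environment isolated from other workloads. A fundamental idea in Mesos is that isolating processes in containers is the most efficient way to use computing resources.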

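Out of the box, Mesos supports Linux cgroups and Docker, two of the most popular container technologies today. By running executors and tasks in containers, a Mesos slave lets executors from multiple frameworks run side by side without affecting other workloads. This is somewhat like running several virtual machines on each physical machine, except that containers do not boot an entire operating system and are therefore much lighter than virtual machines.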

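A fundamental Mesos component is the containerizer. It is configured on the Mesos slave with the --containerizers option, and two containerizers are currently available: mesos and docker. The mesos containerizer uses cgroups to isolate and monitor workloads; the docker containerizer invokes the Docker container runtime, letting you launch pre-built images on a Mesos cluster.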

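Besides containerizers, Mesos offers other ways to isolate resources, including posix/cpu and posix/mem (the defaults) as well as cgroups/cpu and cgroups/mem. To enable resource isolation and monitoring, pass a list of isolators to the --isolation option on the slave.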
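As a small illustration of how these two options fit together, the helper below assembles them into command-line arguments. It is only a sketch: `slave_isolation_flags` is a hypothetical function, the lists of known values contain just the names mentioned in this article, and a real mesos-slave invocation needs further options that are not shown here.

```python
# Hypothetical helper that builds the two options discussed above.

# Only the values named in this article; real deployments may offer more.
KNOWN_CONTAINERIZERS = {"mesos", "docker"}
KNOWN_ISOLATORS = {"posix/cpu", "posix/mem", "cgroups/cpu", "cgroups/mem"}


def slave_isolation_flags(containerizers, isolators):
    """Return the --containerizers and --isolation arguments as a list."""
    unknown = ((set(containerizers) - KNOWN_CONTAINERIZERS)
               | (set(isolators) - KNOWN_ISOLATORS))
    if unknown:
        raise ValueError(f"unknown values: {sorted(unknown)}")
    return [
        "--containerizers=" + ",".join(containerizers),
        "--isolation=" + ",".join(isolators),
    ]


flags = slave_isolation_flags(["docker", "mesos"],
                              ["cgroups/cpu", "cgroups/mem"])
print(" ".join(flags))
```

Validating the names before building the arguments catches typos such as cgroup/cpu before they reach the slave's command line.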

In this article we took a closer look at how Mesos schedules, allocates, and isolates resources. Next, we will use Marathon to actually run a task.

 
