Differences between Docker Swarm and Apache Mesos

Retrieved from http://www.infoq.com/cn/articles/difference-between-swarm-Docker-and-mesos-apache

 

Docker Swarm is the cluster tool natively supported by the Docker community. By extending the Docker API, it tries to let users drive an entire cluster as if they were using the stand-alone Docker API. Mesos is a cluster resource management tool under the Apache Foundation; it builds an efficient, fault-tolerant and elastic distributed system by abstracting the hosts' CPU, memory, storage and other computing resources.

Clearly, the two overlap in functionality, and there is plenty of discussion online about the differences between Docker Swarm, Mesos and Kubernetes. As a heavy Mesos user, I recently took some time to play with Docker Swarm. My impression along the way was that Docker Swarm is particularly simple and flexible. Compared with Mesos, Docker Swarm is less intrusive to the cluster, so its resource consumption is also lower. I also want to stress that, as things currently stand, although Docker Swarm and Mesos overlap functionally, they focus on different things, so comparing the two is not very meaningful. Of course, this may change in the future, depending on each community's roadmap. Below I compare Docker Swarm and Mesos from several perspectives.

Installation and configuration

In terms of installation and configuration, Docker Swarm is much simpler than Mesos. In the simplest case, building a cluster with Docker Swarm takes only 3 steps:

  1. Generate a cluster token with the shell command swarm create;
  2. Using this token, register each host you want to add with the public cluster discovery service on Docker Hub (Hosted Discovery with Docker Hub); this adds the host to the cluster;
  3. Then, with the swarm manage command and the same token, we can manage the cluster from any host connected to the Internet.
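
The three steps above can be sketched as command lines. The Python snippet below only assembles the argument lists and prints them; it does not run anything. The `docker run ... swarm create/join/manage` forms follow my recollection of how the standalone Swarm image was used and may differ between Swarm versions; the node address, the port 3375 and the `<cluster_token>` placeholder are hypothetical values chosen for illustration.

```python
def create_cmd():
    """Step 1: request a new cluster token from the hosted discovery service."""
    return ["docker", "run", "--rm", "swarm", "create"]


def join_cmd(token, node_addr):
    """Step 2: register one host's Docker daemon address under the token."""
    return ["docker", "run", "-d", "swarm", "join",
            "--addr=" + node_addr, "token://" + token]


def manage_cmd(token, host_port):
    """Step 3: start a manager that serves the Docker API on host_port."""
    return ["docker", "run", "-d", "-p", str(host_port) + ":2375",
            "swarm", "manage", "token://" + token]


if __name__ == "__main__":
    token = "<cluster_token>"  # placeholder for the value printed by step 1
    for cmd in (create_cmd(),
                join_cmd(token, "192.0.2.11:2375"),  # hypothetical node address
                manage_cmd(token, 3375)):            # hypothetical manager port
        print(" ".join(cmd))
```

Step 2 would be repeated once per host; only the node address changes.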

Further, if for security or performance reasons we want to avoid the public Docker Hub discovery service, all we need to do in step 1 is use any one of a static file, consul, etcd or ZooKeeper, or even a static IP list, for cluster discovery.

Unlike Docker Swarm, Mesos requires that the management node (the master, equivalent to the manager machine in step 3 above) is already running normally before agent/slave nodes can be added to the cluster. In addition, when adding nodes to the cluster we need to configure basic parameters such as resources and containers. Finally, building a Mesos cluster is not enough by itself to use the cluster's resources conveniently: we need schedulers such as Marathon, Chronos or Spark to schedule resources before we can really use the whole setup. Clearly, configuring Mesos is much more complicated than configuring Docker Swarm. Of course, this is mainly because Mesos needs to support multiple kinds of resource scheduling.

Ease of use

Because Docker Swarm exposes the standard Docker API, users who know the docker commands can start using a Swarm cluster right away. With Mesos, we additionally need to learn the API of a scheduler such as Marathon before we can actually deploy Docker tasks to the cluster. Of course, the Marathon scheduler also brings benefits, such as health checks for Docker containers and a restart-on-failure mechanism.
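
To give a feel for the extra API surface, here is a minimal sketch of the kind of app definition Marathon accepts on its /v2/apps endpoint, built as a Python dictionary. The field names follow Marathon's app definition as I understand it and should be checked against the Marathon version in use; the app id `/web`, the `nginx` image, the resource sizes and the `/health` path are illustrative assumptions, not values from this article.

```python
import json


def marathon_app(app_id, image, instances=2, cpus=0.5, mem=128):
    """Build an app definition of the kind Marathon accepts on POST /v2/apps."""
    return {
        "id": app_id,
        "instances": instances,
        "cpus": cpus,
        "mem": mem,  # MiB
        "container": {
            "type": "DOCKER",
            "docker": {
                "image": image,
                "network": "BRIDGE",
                # hostPort 0 asks for a dynamically assigned host port
                "portMappings": [{"containerPort": 80, "hostPort": 0}],
            },
        },
        "healthChecks": [{
            "protocol": "HTTP",
            "path": "/health",          # assumed endpoint exposed by the app
            "portIndex": 0,
            "intervalSeconds": 10,
            "maxConsecutiveFailures": 3,
        }],
    }


if __name__ == "__main__":
    # With Swarm, the same deployment is just the familiar CLI,
    # for example: docker run -d -P nginx
    print(json.dumps(marathon_app("/web", "nginx"), indent=2))
```

The JSON document would be sent to the Marathon server over HTTP, whereas with Swarm the ordinary docker client is simply pointed at the manager.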

Architecture

We first refer to the architecture diagram in the blog post "Weave Discovery and Docker Swarm" to analyze the architecture of Docker Swarm.

In the figure, vm1 and vm2 represent computing nodes in the cluster. Each node runs a swarm-agent Docker container, which is responsible for broadcasting the node's IP to the cluster discovery backend. As mentioned in the configuration section above, the backend can be the public Docker Hub discovery service or a consistency middleware such as etcd, consul or ZooKeeper.

vmmaster represents the management node in the cluster, which runs the swarm-manager container. vmmaster can be any machine that can reach the backend, such as a laptop. swarm-manager obtains the cluster information by watching the backend, and then calls the REST API of the Docker daemons on vm1 and vm2 to deploy containers and so on.

Likewise, we use the Mesos architecture diagram from http://www.ericsson.com/research-blog/data-knowledge/mesos-meetup-hosted-ericsson-research/ to explain the architecture of Mesos. The Mesos Slave in the diagram represents a computing node in the cluster and corresponds to vm1/vm2 in the figure above; the Mesos Master and MarathonScheduler together correspond to vmmaster in the figure above. The Mesos Master actively offers resources to its schedulers, and each scheduler then decides whether to accept them.
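
The offer model described above can be illustrated with a toy simulation. This is a sketch of the idea only, not the Mesos API: `Offer`, `Task`, `ToyScheduler` and `master_round` are hypothetical names, and real offers carry more resource types than CPU and memory.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    agent: str
    cpus: float
    mem: int  # MB


@dataclass
class Task:
    name: str
    cpus: float
    mem: int  # MB


class ToyScheduler:
    """Hypothetical framework scheduler holding a queue of pending tasks."""

    def __init__(self, pending):
        self.pending = list(pending)

    def resource_offer(self, offer):
        """Return the tasks to launch on this offer; an empty list declines it."""
        launched = []
        cpus, mem = offer.cpus, offer.mem
        for task in list(self.pending):
            if task.cpus <= cpus and task.mem <= mem:
                cpus -= task.cpus
                mem -= task.mem
                launched.append(task)
                self.pending.remove(task)
        return launched


def master_round(offers, scheduler):
    """The master pushes each offer to the scheduler and records its decision."""
    return {o.agent: [t.name for t in scheduler.resource_offer(o)] for o in offers}


if __name__ == "__main__":
    sched = ToyScheduler([Task("web", 2.0, 1024), Task("cache", 0.5, 256)])
    offers = [Offer("slave1", 1.0, 512), Offer("slave2", 4.0, 4096)]
    print(master_round(offers, sched))
    # {'slave1': ['cache'], 'slave2': ['web']}
```

The point of the design is that the master only proposes resources; placement decisions stay with each scheduler, which is what lets several different schedulers share one cluster.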

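At this point we can already see the difference between the two. Mesos supports multiple schedulers: Docker container tasks as well as Hadoop and Spark computing jobs can all run on the Mesos framework, and what Mesos emphasizes is the ability to mix workloads on shared resources. Docker Swarm, by contrast, focuses only on Docker container tasks. Consequently, the Mesos executor is configurable depending on the scheduler, whereas Docker Swarm needs only one kind of executor, the Docker daemon.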

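Cluster high availability / fault tolerance

Both Docker Swarm and Mesos can build a highly available cluster on top of a consistency middleware. Mesos master nodes usually rely on ZooKeeper for high availability, while Docker Swarm manager nodes can use any one of consul, etcd or ZooKeeper. However, judging from Docker Swarm's current architecture, high availability of the Swarm manager node is not strictly necessary, because even if the manager node goes down, the services already running in the Swarm are not affected. I have an even more extreme idea: a Swarm cluster does not need a manager node in normal operation, and the manager service only needs to be started when we need metrics, want to deploy a new application, or want to run health checks. This is because the manager node's functionality is currently very limited: features such as container health checks and restart on failure have not been implemented yet, and the resource management and service interruption mechanisms mentioned in the documentation are not described in detail either, so I assume they are still under development.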

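Basic health checks

As of this writing, Docker Swarm does not provide health checks for the containers it deploys, so whoever deploys the containers has to take care of health checking and of restarting failed containers. Marathon, the Mesos scheduler, does support health checks: it can probe the port an application is bound to at regular intervals and restart the application after tolerating 3 failures (or some other number). HTTP and TCP are currently supported; of course, this requires the application to expose a health endpoint.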
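
The "tolerate a few failures, then restart" rule can be sketched as a small counter. This is an illustration of the rule as described in this section, not Marathon's implementation; `HealthTracker` and `tcp_probe` are hypothetical names, and the default threshold of 3 is taken from the example above.

```python
import socket


def tcp_probe(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class HealthTracker:
    """Counts consecutive failed probes for one application instance."""

    def __init__(self, max_consecutive_failures=3):
        self.max_consecutive_failures = max_consecutive_failures
        self.failures = 0

    def record(self, healthy):
        """Record one probe result; return True if the app should be restarted."""
        if healthy:
            self.failures = 0  # any success resets the streak
            return False
        self.failures += 1
        if self.failures >= self.max_consecutive_failures:
            self.failures = 0  # the restarted instance starts with a clean slate
            return True
        return False


if __name__ == "__main__":
    tracker = HealthTracker()
    probes = [True, False, False, True, False, False, False, True]
    print([tracker.record(ok) for ok in probes])
    # [False, False, False, False, False, False, True, False]
```

In a real loop, `record` would be fed the result of `tcp_probe` (or an HTTP request to the health endpoint) once per check interval.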

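Extensibility / pluggability

Because Docker Swarm uses the standard Docker API, any tool that talks to Docker through the Docker API can work seamlessly with Docker Swarm. For example, it can be combined with docker-compose to scale containers across multiple hosts, which is very similar to a Kubernetes Pod; it can also be integrated with Shipyard, and so on. But this is also a drawback of Docker Swarm: you can only do what the Docker API provides. If the Docker API does not support a feature you need, you cannot implement it directly with Docker Swarm; you may need some special tricks to achieve it (or it may not be achievable at all).

The extensibility of Mesos lies first in the fact that it can host all kinds of schedulers: Spark, Hadoop, Kafka, Cassandra, Marathon, Chronos and others can all use Mesos as their resource pool. Second, Mesos can be combined with Mesos-DNS to provide internal service discovery / load balancing.

In addition, Docker Swarm can also be combined with Mesos: the Docker Swarm repo contains a docker-swarm-on-mesos submodule, https://github.com/docker/swarm/tree/master/cluster/mesos . With it, Docker Swarm can act as a Mesos scheduler and use the resources in the Mesos resource pool. So far, however, I personally have not found much value in this combination. The only benefit I can think of is that it lets us bypass Mesos and schedule docker containers directly while the cluster still supports mixed workloads; after all, manipulating individual containers directly through Mesos is not that convenient.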

Elasticity

Both Mesos and Docker Swarm support adding new nodes to the cluster.

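Scheduling

Docker Swarm's container scheduling is already quite rich:

  • Use the constraint parameter to deploy a container to machines with a given label. For example, deploy MySQL to machines with storage==ssd to guarantee the database's I/O performance;
  • Use the affinity parameter to deploy a container to a machine that is already running a given container, or that has already pulled a given image;
  • Use parameters such as volumes-from, link and net to automatically deploy a container to the machine that hosts the containers it depends on;
  • Use the strategy parameter to choose among 3 node ranking strategies: spread, binpack and random. The spread strategy distributes containers across as many machines as possible to reduce the loss when a machine goes down; conversely, the binpack strategy packs containers onto as few machines as possible to avoid resource fragmentation; the random strategy places containers at random.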
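
The constraint filter and the three ranking strategies can be sketched in a few lines of Python. This is a deliberately simplified model, not Swarm's code: `Node`, `filter_by_constraint` and `pick_node` are hypothetical names, and the ranking looks only at the number of containers per node, whereas a real scheduler would also weigh CPU and memory.

```python
import random
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    labels: dict
    containers: int = 0  # simplified stand-in for how loaded the node is


def filter_by_constraint(nodes, key, value):
    """Keep only nodes whose label `key` equals `value` (e.g. storage==ssd)."""
    return [n for n in nodes if n.labels.get(key) == value]


def pick_node(nodes, strategy="spread", rng=random):
    """Rank the candidate nodes with one of the three strategies and pick one."""
    if not nodes:
        raise ValueError("no node satisfies the filters")
    if strategy == "spread":   # least loaded node first
        return min(nodes, key=lambda n: n.containers)
    if strategy == "binpack":  # most loaded node first
        return max(nodes, key=lambda n: n.containers)
    if strategy == "random":
        return rng.choice(nodes)
    raise ValueError("unknown strategy: " + strategy)


if __name__ == "__main__":
    cluster = [
        Node("n1", {"storage": "ssd"}, containers=3),
        Node("n2", {"storage": "ssd"}, containers=1),
        Node("n3", {"storage": "disk"}, containers=0),
    ]
    ssd_nodes = filter_by_constraint(cluster, "storage", "ssd")
    print(pick_node(ssd_nodes, "spread").name)   # n2
    print(pick_node(ssd_nodes, "binpack").name)  # n1
```

Filters narrow the candidate set first and the strategy then ranks what is left, which is why n3 is never chosen here even though it is the emptiest node.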

Because Mesos is more generic, it is somewhat lacking in container scheduling; currently we can schedule containers onto specific machines by setting host attributes.

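Choosing

When you are trying to choose between Docker Swarm and Mesos, consider the following points.

  • Will the cluster you deploy only run Docker containers? If so, you can consider Docker Swarm; otherwise, if your cluster resources need to be shared by mixed workloads, you had better try Mesos.
  • Is the cluster you deploy a large production environment? If so, Mesos is the recommended first choice: Docker Swarm is still under development, whereas Mesos is already used in production by many companies in China and abroad. If you just want to experiment with Docker-related things, consider Docker Swarm.
  • Do you or your team have enough Linux and distributed systems experience? If not, consider Docker Swarm, since it is simpler to configure and use and easier to troubleshoot. With a Mesos cluster you have to solve many problems beyond docker, and Mesos becomes an extra burden.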

Finally, I want to stress again that Mesos is mainly about improving the resource utilization of the whole cluster from an economic point of view, so it is better compared with YARN and Google Borg; Docker Swarm focuses on cluster management for Docker, so comparing it with Kubernetes is probably more appropriate. Of course, as far as the complexity of container cluster management is concerned, commercial Mesos-based DCOS products, such as Mesosphere's abroad and Shurenyun in China, are very simple and easy to use.

So in general, if you want to build a production cluster, Mesos is the recommended choice from the perspective of stability and scalability; if you only want to run Docker containers, Docker Swarm is the recommended choice from the perspective of ease of use.

 
