From 0.25 to 1.0: an SME's journey through the pitfalls of Mesos networking and storage

After reading about the Mesos container practices of large overseas enterprises, let's come back to China. Today's post is the first installment of the speaker transcripts from the "Three Kingdoms of the Container Cloud" Meetup. How do small and medium-sized enterprises solve the various problems they run into when using Mesos? He Weiwei, Technical Director of Acttao, has the answers.

What I want to share with you today are the specific networking and storage problems we encountered in our Mesos practice as a small and medium-sized enterprise.

Overview

First, let me introduce Acttao's setup. Acttao currently runs two main Mesos clusters, one for testing and one for production. The test environment is deployed on KVM virtual machines, and the production environment runs on Alibaba Cloud. There used to be an OpenStack-based test environment as well, but it was later decommissioned. With the Mesos clusters in place, we introduced CI/CD into our development process. CI/CD requires that developers can easily manage stateless web services as well as stateful services such as MySQL and Redis, and that the web services can easily discover the database services. Our work was about solving these three problems.

To solve them, we need to implement one IP per container for Mesos containers, cross-host volume services for stateful containers, and service discovery.

Mesos container network

Let's talk about the networking options for Mesos. Chronologically there are three stages: the first is before Mesos 0.25, when Docker itself had no network extension mechanism; the second is between Mesos 0.25 and 1.0; the third is after the current 1.0 release. Before Mesos 0.25 there was essentially no ready-made solution: most people had to run scripts by hand, start a Docker container with an empty network, and then create the network devices and configure the IP themselves. At the time there was a prototype tool called Powerstrip, whose principle was to stand in for the Docker API and add extensions to it. With that tool, adding an IP to a Mesos-launched container required essentially no changes, and one IP per container could be achieved through the existing Mesos API.

Between Mesos 0.25 and 1.0, Docker introduced its network extension mechanism, so Docker containers gained native support for network plugins; typical third-party plugins are Weave, Calico, and so on. With these, a container created through the Marathon API can directly get its own IP.

After 1.0, Mesos natively supports CNI networking. Through the Unified Containerizer, whether it is a Mesos container, a Docker container, or an appc container, one IP per container is easy to achieve.

Before Mesos supported CNI, Acttao ultimately chose Weave to implement one IP per container. We did not use Weave's Docker network plugin; instead we chose Weave Proxy, which is similar to the pre-0.25 approach, because it is easy to integrate: the proxy mode comes with Weave's DNS-based service discovery, and the integration is relatively simple. Weave starts a router and then the proxy; once they are up, you point the Docker socket configured on the Mesos slave at the Weave proxy's socket.
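To make that wiring concrete, here is a minimal sketch assuming Weave 1.x defaults; the proxy socket path, ZooKeeper address, and flag values are illustrative assumptions rather than Acttao's actual configuration, and the availability of the --docker_socket flag depends on the Mesos version.

```sh
# Start the Weave router and the Docker API proxy on each Mesos slave.
weave launch-router
weave launch-proxy     # exposes a Docker-compatible socket, by default
                       # at /var/run/weave/weave.sock

# Point the Mesos slave at the Weave proxy socket instead of the real
# Docker daemon, so every Docker container it launches goes through Weave.
mesos-slave \
  --master=zk://zk1:2181/mesos \
  --containerizers=docker,mesos \
  --docker_socket=/var/run/weave/weave.sock
```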

We did not choose Docker's libnetwork extension at the time because every Docker engine then had to be configured with an external key-value store, which is exactly why Mr. Xu Chunming said SwarmKit does not rely on an external store. Because the Docker setup we tested depended on an external store, the performance problems we saw in testing were fairly serious. The recommendation at the time was to use etcd, but Acttao was using ZooKeeper. While testing the network, running docker network ls would sometimes cause Docker to get stuck, essentially leaving it in a hung state.

Two problems kept bothering us while using Weave Proxy. One is upgrading the Weave network: when a new version is released we may want to adopt it, but the upgrade is troublesome. The other is that the network isolation is not good. The Weave proxy has to be restarted during an upgrade; after the restart, Mesos thinks the Docker engine has gone down and reschedules the tasks, even though the containers on the host have not actually died. This has little effect on stateless services, where it just means a few extra instances, but for a database service it means the original database container keeps running while a new task is scheduled as well. DNS-based discovery then returns two IPs for one domain name, one pointing at a usable service and the other at an unusable one.

For this reason, before an upgrade we would stop the Mesos slave and all the containers it managed so that the master would reschedule the tasks, then stop Docker, install the new version of Weave, and finally start Docker and the slave again. During the upgrade, any database service on that host would be unavailable for a period of time.

Weave's subnet-based network isolation is also not very flexible. If you consider multi-tenancy on Mesos, a tenant whose subnet is allocated too small may run out of addresses almost immediately, and expanding the subnet afterwards is very troublesome; but allocating a very large subnet up front wastes addresses.

The problems with upgrading network components can be solved with CNI, and the network isolation problem can be solved with Calico on top of CNI: Calico implements its firewall rules with iptables.

Configuring CNI in Mesos 1.0 is simple: point the agent at the directory holding the CNI network configuration files and the directory holding the CNI plugins, and CNI support is enabled. The configuration for third-party plugins is also very simple. The Weave configuration, for example, contains only a name and a type, weave-net; Calico's is not much more complicated either, basically a name plus whatever settings the CNI network plugin itself requires.
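As a rough illustration of those two flags (the directories, file name, and master address below are assumptions, not Acttao's actual layout):

```sh
# Drop a minimal Weave CNI configuration into the config directory.
mkdir -p /etc/mesos/cni /opt/cni/bin
cat > /etc/mesos/cni/weave.conf <<'EOF'
{
  "name": "weave",
  "type": "weave-net"
}
EOF

# Point the agent at the CNI config and plugin directories; depending on
# the Mesos version, network/cni may also need to be listed in --isolation.
mesos-agent \
  --master=zk://zk1:2181/mesos \
  --work_dir=/var/lib/mesos \
  --network_cni_config_dir=/etc/mesos/cni \
  --network_cni_plugins_dir=/opt/cni/bin
```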

Enabling this from Marathon is also relatively simple: in the app's JSON you specify an IP address section with a network name, which is the name defined in the CNI configuration above. You can then attach some labels, which are used for the firewall rules, and a discovery port, which is mainly used for service discovery. With this, the two problems mentioned above are basically solved.
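A sketch of such an app definition might look like the following, assuming Marathon's IP-per-task API; the app id, command, network name, label, and port number are made-up values for illustration.

```json
{
  "id": "/frontend/web",
  "cmd": "python3 -m http.server 8080",
  "cpus": 0.25,
  "mem": 128,
  "instances": 2,
  "ipAddress": {
    "networkName": "weave",
    "labels": {
      "tenant": "frontend"
    },
    "discovery": {
      "ports": [
        { "number": 8080, "name": "http", "protocol": "tcp" }
      ]
    }
  }
}
```

The networkName must match the name field in the CNI configuration shown earlier, and the declared discovery port is what the service-discovery tooling picks up.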

However, Acttao is still using the Weave CNI plugin at the moment. The reason for not choosing the Calico solution is that its security policies currently have to be configured by hand and cannot be integrated with Mesos automatically. For purely internal use, configuring them yourself is workable, but we are also considering writing our own marathon-calico component that automatically creates network security policies from the apps in Marathon.

Mesos container storage

Storage, like networking, falls into three stages: the stage before containers had any volume extension, the stage after Docker gained native volume plugin support, and the stage after Mesos 1.0, which supports Docker volume plugins directly. In the earliest stage, Acttao built a GlusterFS cluster, mounted it on every slave node, and implemented container volumes by having Docker bind-mount the host directory directly.
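As a rough sketch of that early approach (the volume name, mount point, image, and app id are hypothetical): each slave mounts the shared GlusterFS volume, for example with mount -t glusterfs gluster1:/mesos-volumes /mnt/gluster, and the Marathon app then simply bind-mounts a host directory that lives on that shared mount.

```json
{
  "id": "/db/mysql-demo",
  "instances": 1,
  "cpus": 1,
  "mem": 1024,
  "env": { "MYSQL_ROOT_PASSWORD": "change-me" },
  "container": {
    "type": "DOCKER",
    "docker": { "image": "mysql:5.6" },
    "volumes": [
      {
        "hostPath": "/mnt/gluster/mysql-demo",
        "containerPath": "/var/lib/mysql",
        "mode": "RW"
      }
    ]
  }
}
```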

When Docker gained native support for volume plugins, Acttao adopted EMC's REX-Ray. It works like other volume plugins but has the most complete feature set among the plugins we have looked at so far: it supports third-party storage backends such as OpenStack as well as commercial storage hardware, including EMC's.

After Mesos 1.0, Docker volume support is provided natively, implemented through the dvdcli tool from EMC. EMC originally wanted to provide external storage for Mesos through this mechanism, but at that time it was based on Mesos modules and could only support Mesos containers, not Docker containers, so after Mesos 1.0 the functionality was integrated directly into the Mesos core, and we adopted this native support as well. Its configuration is relatively simple: install dvdcli on the slave, set a volume checkpoint directory for recovery, add filesystem/linux and docker/volume to the isolation flag, and the feature is basically enabled.
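A minimal sketch of the agent flags described above, assuming dvdcli and a volume driver such as REX-Ray are already installed on the slave; the master address and work directory are placeholders, and the checkpoint path shown is the usual default.

```sh
# Enable the docker/volume isolator plus a checkpoint directory so that
# mounted external volumes can be recovered after an agent restart.
mesos-agent \
  --master=zk://zk1:2181/mesos \
  --work_dir=/var/lib/mesos \
  --isolation=filesystem/linux,docker/volume \
  --docker_volume_checkpoint_dir=/var/run/mesos/isolators/docker/volume
```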

To use third-party external storage from Marathon, you need to turn on the external_volumes feature. When defining a volume in the app's JSON, mark the volume as external and set its provider to dvdi, since that is currently the only supported provider, then give the name of the Docker volume plugin to use, and it is basically ready to go.
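Concretely, Marathon has to be started with --enable_features external_volumes, and an app using an external volume might look roughly like this (ids, names, and resources are made up; the Docker volume plugin is assumed to be rexray, passed through the dvdi/driver option):

```json
{
  "id": "/db/mysql",
  "instances": 1,
  "cpus": 1,
  "mem": 1024,
  "cmd": "/usr/bin/tail -f /dev/null",
  "container": {
    "type": "MESOS",
    "volumes": [
      {
        "containerPath": "mysql-data",
        "mode": "RW",
        "external": {
          "name": "mysql-data-vol",
          "provider": "dvdi",
          "options": { "dvdi/driver": "rexray" }
        }
      }
    ]
  },
  "upgradeStrategy": { "minimumHealthCapacity": 0, "maximumOverCapacity": 0 }
}
```

The containerPath here is relative, which touches on the path validation issue discussed next.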

The current latest version of Marathon cannot use absolute paths for external volumes, and this bug (BUG416) is not expected to be resolved in the short term. We changed the validation rules for the dvdi provider in Marathon ourselves, and after that it is basically usable. The front-end validation rules are aimed at Mesos containers; the paths we need cannot be used with the original settings, but you can change that as well.


https://www.v2ex.com/t/315881
