Kubernetes container cloud platform practice

Kubernetes is an open-source container orchestration engine from Google that supports automated deployment and management of containerized applications at scale. With the rapid rise of cloud-native technology, Kubernetes has become the de facto standard container application platform, and more and more enterprises are adopting it in production.
We started building our container platform at the beginning of 2016, and it has gone through three stages: pre-research and exploration, system construction, and production rollout.
Here I will share the road we travelled in building our container cloud platform, covering networking, storage, cluster management, and monitoring and operations, in the hope of giving you some ideas and inspiration.

1. Kubernetes networking

Container networking has by now settled into a two-camp landscape. The two camps are Docker's CNM and CNI, led by Google, CoreOS, and Kubernetes. To be clear first, CNM and CNI are not network implementations; they are network specifications and network frameworks. From a development point of view they are essentially a set of interfaces: whether the underlying network is Flannel or Calico, they do not care. What CNM and CNI care about is network management.
A survey of network requirements found that business units focus on the following points: 1. the container network must interconnect with the physical network; 2. the faster, the better; 3. as few changes as possible; 4. as few points of risk as possible.
Container network solutions can be classified along three dimensions: protocol stack layer, traversal form, and isolation method.
Protocol stack layer: Layer 2 is easy to understand; it was common in traditional data centers or virtualization scenarios and is based on bridging with ARP and MAC learning. Its biggest flaw is broadcast, because Layer 2 broadcast limits the number of nodes. Pure Layer 3 (routing and forwarding) stacks are generally based on BGP, which autonomously learns the routing state of the whole data center. Its biggest advantage is IP reachability: as long as the network is reachable over IP, traffic can get through, so it has a clear advantage in scale and good orders-of-magnitude scalability. In real deployments, however, most enterprise networks are tightly controlled; for example, BGP may be disallowed for developers for security reasons, or the enterprise network may simply not run BGP, and in such cases you are constrained. Layer 2 plus Layer 3 combines the strengths of both: it solves the scaling problem of pure Layer 2 and the restrictions of pure Layer 3, and it is especially useful in cloud VPC scenarios, where the VPC's Layer 3 forwarding capability can be used to cross nodes.
Traversal form:
This depends on the actual deployment environment. There are two traversal forms: Underlay and Overlay.
Underlay: in a scenario where the network is well controlled, we generally use Underlay. A simple way to think about it: whether the machines underneath are bare metal or virtual machines, as long as the whole network is under our control and the container network can be connected straight through it, that is Underlay.
Overlay: Overlay is more common in cloud scenarios. Here the network is controlled by the VPC: when an IP or MAC is outside the range the VPC manages, the VPC will not let that IP/MAC through. When this happens, we can use Overlay mode.
An Overlay network virtualizes and pools the physical network and is the key to converging cloud and network. Overlay networks are often used together with SDN, with the SDN controller acting as the control plane of the Overlay network; this makes it easier to integrate the network with compute components and is well suited to transitioning to a cloud-platform service network.
Isolation method:
Isolation usually comes in two forms, VLAN and VXLAN:
VLAN: VLAN is widely used in data centers, but it has a real problem: the total number of tenants is limited. As is well known, the VLAN ID space allows only about 4,000 usable VLANs.
VXLAN: VXLAN is now the more mainstream isolation method. It scales much further (its 24-bit VNI allows roughly 16 million segments), and because it runs over IP it also traverses networks better.
We analyzed several common Kubernetes network components (Calico, Contiv, Flannel, OpenShift SDN, custom routing) along the dimensions of protocol stack layer, traversal form, and isolation method, for both traditional data center networks and cloud VPC networks, and used connecting lines in the figure below to express the relationships between them.
(Figure: common Kubernetes network components mapped by protocol stack layer, traversal form, and isolation method in traditional data center and cloud VPC networks)
First, whether for a traditional data center network or a cloud VPC network, we can see that Overlay solutions are general purpose; they are used somewhat more in cloud scenarios because of their good traversal properties.
In the figure, the red solid lines point to the traditional data center network, which is the focus here. Underlay + Layer 3 is a very popular solution for traditional data center networks; its performance is impressive, and it suits scenarios with relatively high performance requirements.
The green dashed lines point to the cloud VPC network: Underlay + Layer 3 can be used in a cloud VPC scenario, but its use is restricted. "Restricted" means just that: it can be used, but not every vendor will let you use it, because each cloud vendor defines its network to protect itself. For example, with a solution like Calico, BGP is easy to do on AWS but is not allowed on Azure, because the Azure VPC does not allow IPs outside the ranges it controls.
The yellow solid lines also point to the cloud VPC network: Overlay + Layer 2 or Layer 3 is common in cloud scenarios. Here the Overlay runs on top of the VPC-controlled network, which makes management more convenient.
Of course, there are also some problems in cloud VPC scenarios, as shown below.
(Figure: problems in cloud VPC scenarios)
Next, let's talk about network isolation between multiple tenants.
Kubernetes introduced the NetworkPolicy mechanism in version 1.3; with network policies, inbound and outbound access policies between Pods can be enforced.
Network policies can be applied to Pods selected by the usual label mechanism, and labels can then be used to simulate traditional network segments, for example marking the front-end and back-end Pods with a particular "segment" label. Policies control the traffic between these segments and can even control traffic from external sources. However, not every network backend supports policies, Flannel for example. Many vendors are now strengthening their work in this area, and there are many new solutions, which I will not list here.
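As a concrete illustration, here is a minimal sketch of such a "segment"-style policy created with the official Kubernetes Python client; the namespace, label values, and port are hypothetical, and the cluster's network backend (e.g. Calico) must support NetworkPolicy for it to take effect.

```python
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a Pod

# Allow only Pods labeled segment=frontend to reach segment=backend Pods on TCP 8080.
policy = client.V1NetworkPolicy(
    metadata=client.V1ObjectMeta(name="backend-allow-frontend", namespace="demo"),
    spec=client.V1NetworkPolicySpec(
        pod_selector=client.V1LabelSelector(match_labels={"segment": "backend"}),
        policy_types=["Ingress"],
        ingress=[client.V1NetworkPolicyIngressRule(
            _from=[client.V1NetworkPolicyPeer(
                pod_selector=client.V1LabelSelector(match_labels={"segment": "frontend"})
            )],
            ports=[client.V1NetworkPolicyPort(port=8080, protocol="TCP")],
        )],
    ),
)

client.NetworkingV1Api().create_namespaced_network_policy(namespace="demo", body=policy)
```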
Then there is Ingress, which manages the boundary of the cluster.
Ingress only appeared in Kubernetes 1.2. Containerized applications provide services by default in the form of a Service, but a Service only works inside the cluster; by exposing a Service through an Ingress, it can serve clients outside the cluster.
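As a sketch of how a Service is exposed this way, the snippet below creates an Ingress with the Kubernetes Python client using the networking.k8s.io/v1 API (which needs a much newer cluster than the 1.2 release mentioned above); the host, Service name, namespace, and port are hypothetical, and an Ingress controller must be running for the rule to take effect.

```python
from kubernetes import client, config

config.load_kube_config()

# Route external traffic for web.example.com to the in-cluster Service "web-svc".
ingress = client.V1Ingress(
    metadata=client.V1ObjectMeta(name="web-ingress", namespace="demo"),
    spec=client.V1IngressSpec(rules=[
        client.V1IngressRule(
            host="web.example.com",
            http=client.V1HTTPIngressRuleValue(paths=[
                client.V1HTTPIngressPath(
                    path="/",
                    path_type="Prefix",
                    backend=client.V1IngressBackend(
                        service=client.V1IngressServiceBackend(
                            name="web-svc",
                            port=client.V1ServiceBackendPort(number=80),
                        )
                    ),
                )
            ]),
        )
    ]),
)

client.NetworkingV1Api().create_namespaced_ingress(namespace="demo", body=ingress)
```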
The table below compares the common Ingress controllers.
(Table: comparison of common Ingress controllers)
We can see that Nginx does better in performance, feature coverage, and community activity, and is also the more practical choice.
2. Kubernetes storage

Kubernetes was initially used to manage stateless services, but as more and more applications migrate to the platform, managing storage resources has become a very important capability.
Storage in Kubernetes is mainly used for the following purposes:
basic reading of service configuration files and management of keys and secrets; storing service state and accessing data; and sharing data between different services or applications. Broadly, there are a few scenarios, as shown below:
(Figure: common storage usage scenarios)
Storage in Kubernetes follows the platform's consistent design philosophy: a declarative architecture. At the same time, to be compatible with as many storage platforms as possible, Kubernetes connects to different storage systems through in-tree plugins, so that users can pick plugins according to their business needs to provide container storage services. It is also compatible with user-defined plugins through FlexVolume and CSI. Compared with Docker Volumes, the storage it supports is richer and more diverse.
A breakdown of the Kubernetes storage plugin mechanisms:
1. In-tree plugins: the storage code is tightly integrated with Kubernetes itself and too tightly coupled.
2. FlexVolume: the storage plugin is mounted on the host and requires root privileges on the host.
3. CSI specification: the storage code is completely decoupled from Kubernetes (requires version 1.10 and above, using the 0.2.0 version of the CSI attacher).
The CSI specification greatly simplifies plugin development, maintenance, and integration, and has good prospects.
Kubernetes manages storage with two kinds of resources:
PersistentVolume (PV): a description of storage added by an administrator. It is a cluster-wide resource containing the storage type, capacity, and access mode. Its lifecycle is independent of any Pod; for example, destroying a Pod that uses it has no effect on the PV.
PersistentVolumeClaim (PVC): a namespaced resource that describes a request for a PV. The request includes the storage size and the access mode.
A PV can be thought of as an available storage resource and a PVC as a demand for storage; a PVC is automatically bound to a suitable PV according to the Pod's claim, for the Pod to use. The relationship between PV and PVC follows the lifecycle shown in the figure below.
(Figure: PV and PVC lifecycle)
PVs come in static and dynamic modes. Static PVs are typically used for NFS, FC, and iSCSI, while dynamic PVs cover GlusterFS, Cinder, Ceph RBD, vSphere, ScaleIO, AWS, Azure, and so on. In static mode the administrator creates and manages PVs by hand; in dynamic mode the system provisions PVs automatically and binds them to PVCs.
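For the dynamic mode, declaring a PVC against a StorageClass is all a user needs to do; below is a minimal sketch using the Kubernetes Python client, where the namespace, claim name, and the "ceph-rbd" StorageClass name are hypothetical and depend on what the cluster actually offers.

```python
from kubernetes import client, config

config.load_kube_config()

# Request 10Gi of dynamically provisioned storage from a (hypothetical) "ceph-rbd"
# StorageClass; the matching PV is created and bound automatically.
pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="app-data", namespace="demo"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        storage_class_name="ceph-rbd",
        resources=client.V1ResourceRequirements(requests={"storage": "10Gi"}),
    ),
)

client.CoreV1Api().create_namespaced_persistent_volume_claim(namespace="demo", body=pvc)
```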
Next, a brief word about image management under Kubernetes. In production there will be many images for different applications and different versions, so image management is an important part of the platform.
Multi-tenant permission management for images:
1. Images belonging to different tenants should be isolated from each other.
2. Different tenants have different permissions on images, such as read/write, read-only, upload, and download.
3. The image registry provides operations such as query, update, and delete.

For image management across regions and multiple data centers, remote replication between registries needs attention:
1. In a multi-data-center or cross-region, multi-site environment, at least two tiers of registries are needed to make image downloads efficient across regions: a primary registry and subordinate registries.
2. Near-real-time incremental synchronization between registries.
3. Kubernetes cluster management

In production systems, managing multiple Kubernetes clusters mainly involves:
1. Service operations
2. Centralized configuration
3. Scaling and upgrades
4. Resource quotas
First, let's talk about scheduling management across multiple clusters.
1. Scheduling policies in Kubernetes fall roughly into two kinds: global scheduling policies and runtime scheduling policies.
2. Node isolation and recovery; node expansion; dynamic scaling of Pods (see the sketch after this list).
3. Affinity enables co-located deployment and, with enhanced networking, nearby routing for communication, reducing network overhead. Anti-affinity is mainly about high reliability, spreading instances out as much as possible.
4. Microservice dependencies, with a defined startup order.
5. Applications of different departments are not co-scheduled.
6. The API gateway and GPU-node applications get dedicated nodes.
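As an illustration of point 2, node isolation and recovery come down to toggling a node's unschedulable flag, which is what kubectl cordon/uncordon does; here is a minimal sketch with the Kubernetes Python client, where the node name is hypothetical.

```python
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

def set_node_schedulable(node_name: str, schedulable: bool) -> None:
    """Cordon (isolate) or uncordon (recover) a node by patching spec.unschedulable."""
    core.patch_node(node_name, {"spec": {"unschedulable": not schedulable}})

# Isolate a node before maintenance, then bring it back afterwards.
set_node_schedulable("node-01", schedulable=False)  # like `kubectl cordon node-01`
set_node_schedulable("node-01", schedulable=True)   # like `kubectl uncordon node-01`
```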
Elastic scaling of applications in multi-cluster management:
1. Manual scaling: used when changes in business volume are known in advance.
2. Automatic scaling based on CPU utilization: the HPA controller was introduced in v1.1; Pods must declare a CPU resource request.
3. Automatic scaling based on custom business metrics: HPA was redesigned in v1.7 and new components were added, known as HPA v2.
In practice HPA still has many rough edges, and many vendors use their own monitoring systems to track business metrics and drive automatic scaling.
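As a baseline, here is a minimal sketch of the CPU-based HPA from point 2 (the autoscaling/v1 API) created with the Kubernetes Python client; the Deployment name, namespace, and thresholds are hypothetical, and the target Pods must declare a CPU request for the utilization ratio to be computed.

```python
from kubernetes import client, config

config.load_kube_config()

# Keep the "web" Deployment between 2 and 10 replicas, targeting 60% average CPU usage.
hpa = client.V1HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="web-hpa", namespace="demo"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="web"
        ),
        min_replicas=2,
        max_replicas=10,
        target_cpu_utilization_percentage=60,
    ),
)

client.AutoscalingV1Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="demo", body=hpa
)
```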
Tuning multiple Kubernetes clusters:
There are three main difficulties.
First, how to allocate resources. When a user chooses multi-cluster deployment, the system decides how many containers each cluster gets based on each cluster's resource usage, while guaranteeing that every cluster has at least one container. When clusters auto-scale, containers are also created and reclaimed according to this ratio.
Second, failure migration. The cluster controller mainly handles auto-scaling across clusters and container migration when a cluster fails. The controller periodically probes the nodes of each cluster; after repeated failures it triggers migration of that cluster's containers so that services keep running reliably.
Third, interconnecting networks and storage. Because the networks of different data centers must be interconnected, we adopted a VXLAN network solution, and storage is interconnected over dedicated lines. Harbor is used as the container image registry, with synchronization policies configured between clusters, and each cluster has its own DNS record resolving to its own registry.
Next, high availability of the Kubernetes master node. We know the core of a Kubernetes cluster is its master node, but by default there is only one; once the master node fails, the cluster is "paralyzed" and cluster management, Pod scheduling, and so on can no longer be carried out. Hence the architecture with one active master and multiple replicas appeared, in which the master node, etcd, and other components can all be designed for high availability.
It is also worth understanding the Federation (cluster federation) architecture.
In a cloud computing environment, the operating distance of a service, from near to far, can be: same host (Host/Node), cross-host within the same availability zone, cross-AZ within the same region, cross-region within the same cloud service provider, and cross-cloud. Kubernetes is positioned as a single cluster within one region, because only within one region can the network performance meet Kubernetes' requirements for scheduling and for connecting compute with storage. Cluster federation (Federation) was designed to provide Kubernetes cluster service across regions and across providers, achieving high availability for the business.
Federation was introduced in version 1.3. The federation/v1beta1 API extends DNS-based service discovery: using DNS, Pods can resolve services across clusters transparently.
Version 1.6 added cascading deletion of federated resources, version 1.8 claimed support for 5,000-node clusters, and then came Cluster Federation v2.
Current problems:
1. Increased network bandwidth and cost.
2. Weakened isolation between clusters.
3. Insufficient maturity; it has not seen formal production use.
4. Kubernetes monitoring and operations

For a monitoring system, the common monitoring dimensions are resource monitoring and application monitoring. Resource monitoring covers the resource usage of nodes and applications; in container scenarios this extends to node resource utilization, cluster resource utilization, Pod resource utilization, and so on. Application monitoring refers to monitoring an application's internal metrics; for example, we count online users in real time and expose the count on a port, so that monitoring and alerting can be done at the business level. So into what entities does monitoring break down in Kubernetes?
System components
The components built into a Kubernetes cluster, including the apiserver, controller-manager, etcd, and so on.
Static resource entities
Mainly node resource state, kernel events, and the like.
Dynamic resource entities
Mainly the entities behind Kubernetes workload abstractions, such as Deployment, DaemonSet, and Pod.
Custom applications
Mainly the custom monitoring data and metrics that an application needs to expose from the inside.
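For the custom-application case, the usual pattern is to expose the metrics on a port for Prometheus to scrape; here is a minimal sketch using the prometheus_client Python library, where the metric name, port, and the way online users are looked up are all hypothetical.

```python
import random
import time

from prometheus_client import Gauge, start_http_server

# Business-level metric: current number of online users (hypothetical name).
ONLINE_USERS = Gauge("app_online_users", "Number of users currently online")

def read_online_users() -> int:
    # Placeholder for the real lookup (session store, database, etc.).
    return random.randint(100, 200)

if __name__ == "__main__":
    start_http_server(9100)  # Prometheus scrapes http://<pod-ip>:9100/metrics
    while True:
        ONLINE_USERS.set(read_online_users())
        time.sleep(15)
```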
Comparison of different container cloud monitoring solutions:
(Table: comparison of container cloud monitoring solutions)
About Prometheus monitoring:
Two main things to pay attention to:
1. Wrapping the query API
2. Distributing configuration files
With Prometheus, a powerful open-source monitoring system, the work we have to put in is wrapping the query API and distributing configuration files. There is not much to say about wrapping the query API: the front end calls our own server, our server calls the Prometheus API over HTTP to fetch the raw data, assembles it, and returns it to the front end. The configuration consists of three parts: alert rule definitions, the Alertmanager configuration, and the Prometheus configuration; there is no room to expand on them here, and those interested can read the official documentation. Of course you can also build the monitoring stack with Prometheus + Grafana, which gives richer visualization and faster dashboards.
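A minimal sketch of what the query-API wrapper amounts to, calling the standard Prometheus HTTP API (/api/v1/query); the Prometheus address and the PromQL expression are hypothetical.

```python
import requests

PROMETHEUS_URL = "http://prometheus.monitoring:9090"  # hypothetical in-cluster address

def instant_query(promql: str) -> list:
    """Run an instant PromQL query and return the raw result vector."""
    resp = requests.get(
        f"{PROMETHEUS_URL}/api/v1/query",
        params={"query": promql},
        timeout=5,
    )
    resp.raise_for_status()
    body = resp.json()
    if body.get("status") != "success":
        raise RuntimeError(f"Prometheus query failed: {body}")
    return body["data"]["result"]

# Example: per-Pod CPU usage rate in the "demo" namespace over the last 5 minutes.
if __name__ == "__main__":
    for series in instant_query(
        'sum by (pod) (rate(container_cpu_usage_seconds_total{namespace="demo"}[5m]))'
    ):
        print(series["metric"].get("pod"), series["value"][1])
```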
Thoughts on operations --- integration of development and operations
Thoughts on operations --- high availability
• OCP platform:
1. Highly available load-balancing Router cluster: 2 nodes
2. Highly available EFK cluster: 3 Elasticsearch nodes + n Fluentd nodes
3. Highly available image registry cluster: 2 registries
• Microservice architecture:
1. Highly available registry cluster (Eureka): 3 instances
2. Highly available configuration center cluster: 3 instances
3. Highly available gateway cluster: 2 instances
4. All key microservices run as highly available clusters
Thoughts on operations --- high concurrency
• OCP platform:
1. Configure elastic scaling for back-end microservices (Pods); Kubernetes' elastic scaling plus the second-level startup of Docker containers can support continuous growth in user volume.
2. Reserve 20% of resources in advance, so that resources can be expanded urgently when high concurrency hits.
• Microservice architecture:

1. Increase the circuit-breaker thread pool size of microservices on critical paths to improve the concurrent response capacity of the main business.
2. Degrade non-critical-path microservices through circuit breaking and rate limiting, or even shut them down.
3. Circuit-breaker mechanism: improves the fault tolerance of the container cloud under high concurrency, prevents cascading failures and the microservice avalanche effect, and improves system availability.
• Middleware:
1. In addition to the cluster in use, add a cold-standby cluster in advance.
2. When a high-concurrency scenario is imminent, it can be expanded urgently.
There are also load testing and performance optimization, but time is limited, so I will not go into them here.
Finally, a summary of the road to a container cloud:
1. Business level: because large enterprises have high demands on business stability and continuity, containerization must evolve from edge businesses to core businesses and from simple applications to complex applications. Concretely, consider migrating Web front ends into containers first and move back-end businesses last.
2. Technical level: native Docker still has many shortcomings in service discovery, load balancing, container lifecycle management, inter-container networking, storage, and other areas. Many third-party vendors offer open-source solutions and commercial versions, each with its own strengths, and none clearly dominates. Whatever product is selected, the two key factors of reliability and flexibility need to be weighed carefully.
3. Weigh cost-effectiveness: consider the balance between the cost paid for containerization and the benefits it brings in the future.
4. Consider the load capacity of existing hardware: containers are not a panacea. Some businesses with high throughput and concurrency requirements run directly on bare metal and improve performance through system tuning; containers may not be the best choice for them.
5. Keep updating: keep reminding yourself to learn continuously and embrace change, so you can see the platform's shortcomings and keep iterating toward a better product.
In production practice, only by reinforcing the foundations and continuously improving the product ecosystem built on the container cloud platform can we stay in control far into the future!

Origin: blog.51cto.com/xjsunjie/2441526