Kubernetes Monitoring in Practice

One, Introduction to Kubernetes

Kubernetes (K8s) is an open source platform that automates many of the manual processes involved in managing, deploying, and scaling applications, giving users more flexibility in deploying and managing cloud applications.

As a scalable, fault-tolerant platform, K8s can be deployed on almost any infrastructure. It is compatible with public clouds such as Google Cloud, MS Azure, and AWS, as well as private clouds, hybrid clouds, server clusters, and data centers. Its biggest highlight is support for automatic deployment and automatic replication of containers. This is why a large number of cloud-based microservice infrastructures are deployed on K8s.

Two, K8s Origin

K8s was originally developed by Google engineers and was released as open source in 2014. It is currently maintained and upgraded by community contributors from Microsoft, Red Hat, IBM, Docker, and other software giants.

Google not only runs its entire infrastructure in containers, it has also actively contributed to Linux container technology, which underpins all of Google's cloud services. K8s was designed on the basis of 15 years of experience running production workloads and can handle thousands of containers. Google deploys more than 2 billion containers a week. Before K8s was released, Google deployed containers mainly through Borg, an internally developed platform. Borg is a large-scale internal cluster management system that runs numerous applications and cluster jobs, and the years of experience gained in developing it laid the technical foundation for K8s.

Three, How K8s Works

At its core, K8s is a system that coordinates containerized applications across different machines. It is intended to help developers manage containerized applications and services throughout their life cycle with predictability, scalability, and high availability. Through a higher level of abstraction, it unifies multiple machines into what appears to be a single machine, which is essential for operating large-scale environments.

K8s is optimized for running and managing Docker images and containers, and is also compatible with other container engines such as CoreOS rkt.

[Figure: K8s architecture diagram]

The architecture diagram above shows how K8s works. In the diagram, the Master comprises a set of components and manages many Pods. A Pod models an application-specific "logical host". Each Pod contains one or more application containers, storage resources, a unique network IP, and the details of how its containers should run. The Pod is the smallest atomic unit in which containers are deployed. In theory, a Pod contains one or more tightly coupled applications; ideally, each Pod contains a single container.
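
To make the Pod structure concrete, here is a minimal sketch of a Pod manifest built as a Python dict and serialized to JSON. The field names (`apiVersion`, `kind`, `metadata`, `spec.containers`, `spec.volumes`) are standard Pod fields; the Pod name `web`, the `nginx:1.25` image, and the `scratch` volume are illustrative assumptions that do not come from the original article.

```python
import json

# Minimal Pod manifest expressed as a Python dict. The name "web", the image
# tag, and the volume name "scratch" are assumptions chosen for illustration.
pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web", "labels": {"app": "web"}},
    "spec": {
        # Ideally a Pod holds a single container.
        "containers": [
            {
                "name": "web",
                "image": "nginx:1.25",
                "ports": [{"containerPort": 80}],
                "volumeMounts": [
                    {"name": "scratch", "mountPath": "/tmp/scratch"}
                ],
            }
        ],
        # Storage is declared at Pod level and can be shared by its containers.
        "volumes": [{"name": "scratch", "emptyDir": {}}],
    },
}


def container_names(manifest):
    """Return the names of the containers declared in a Pod manifest."""
    return [c["name"] for c in manifest["spec"]["containers"]]


print(json.dumps(pod_manifest, indent=2))
```

The Pod's IP address does not appear in the manifest because the cluster assigns it at runtime. The printed JSON could be saved to a file and submitted with `kubectl apply -f`, which accepts JSON as well as YAML.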

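The Master process comprises an API server, a scheduler, and multiple controllers.

The API server exposes the K8s API and handles REST operations and the updates that follow from them. The scheduler assigns undeployed Pods to suitable virtual or physical machines. If no suitable machine is available, the Pod remains unscheduled until a suitable node appears. The Master also runs other cluster-level functions, using embedded controllers to perform operations such as creating endpoints, discovering nodes, and controlling replication. Because the controller design is flexible and extensible, Kube administrators can create their own controllers. Kube monitors the shared state of the K8s cluster through the API server and adjusts the cluster so that the current state matches the desired state.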
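
The scheduling behaviour described above can be sketched in a few lines of Python. This is a toy first-fit scheduler and not the actual K8s algorithm, which filters and scores nodes using many more criteria; the `Node` and `Pod` classes, the node names, and the resource figures are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

GIB = 1024 ** 3


@dataclass
class Node:
    name: str
    free_cpu: float          # cores not yet allocated
    free_mem: int            # bytes not yet allocated
    pods: list = field(default_factory=list)


@dataclass
class Pod:
    name: str
    cpu: float               # requested cores
    mem: int                 # requested bytes
    node: Optional[str] = None   # None means the Pod is still unscheduled


def schedule(pods, nodes):
    """Bind each unscheduled Pod to the first node with enough free resources.

    Pods that fit nowhere keep node=None and are retried on the next call.
    Returns the names of the Pods that are still unscheduled.
    """
    for pod in pods:
        if pod.node is not None:
            continue
        for node in nodes:
            if node.free_cpu >= pod.cpu and node.free_mem >= pod.mem:
                node.free_cpu -= pod.cpu
                node.free_mem -= pod.mem
                node.pods.append(pod.name)
                pod.node = node.name
                break
    return [p.name for p in pods if p.node is None]


nodes = [Node("node-a", free_cpu=2.0, free_mem=4 * GIB)]
pods = [Pod("api", cpu=1.5, mem=2 * GIB), Pod("batch", cpu=1.0, mem=1 * GIB)]

pending = schedule(pods, nodes)            # "batch" does not fit on node-a
nodes.append(Node("node-b", free_cpu=4.0, free_mem=8 * GIB))
pending_after = schedule(pods, nodes)      # a suitable node has appeared
```

After the first pass `batch` stays unscheduled because `node-a` has only half a core left. Once `node-b` joins, the second pass places it there, which mirrors the "wait until a suitable node appears" behaviour.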

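K8s provides features that support unified automation, control, and upgrading of containerized applications, including enterprise-grade container deployment, built-in service discovery, autoscaling, persistent storage, high availability, cluster interconnection, and resource bin packing.

With these features, K8s supports a range of application architectures, including monolithic applications, batch applications, and highly distributed microservice applications.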

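Four, Challenges in K8s Monitoring Practice

Since its release in 2014, K8s has been transforming container technology and has become a key tool for launching applications quickly and at scale. Challenges have followed, however, because container orchestration is extremely complex.

K8s has greatly reduced the difficulty of a whole series of tasks in implementing and managing containers, from scheduling and configuration to automatic state maintenance. Even so, challenges remain on the monitoring side: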

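  • Applications that communicate with each other are spread across different cloud platforms. K8s is essentially a general-purpose platform on which users are free to deploy applications. Enterprises commonly adopt multi-cloud solutions, which reduce dependence on any single cloud provider, shorten downtime, and help avoid data loss. This style of deployment, however, makes it harder to capture data in real time and to monitor application state.
  • Applications are constantly moving across dynamic infrastructure. Because applications migrate frequently, it is difficult to achieve full visibility across all platforms and protocols, which can hide bottlenecks in the system. Many companies run multiple applications on their infrastructure, so this problem is unavoidable. Without a robust monitoring system, users cannot discover latent problems in their applications.
  • The objects to be monitored are numerous and highly complex. K8s is made up of many components, so monitoring it means monitoring all of the following:

    • Cluster capacity and resource utilization: (a) Node: verify the state of every K8s node and monitor CPU, memory, and disk usage; (b) Pod: verify that all deployed Pods are healthy; (c) Container: monitor CPU and memory consumption against the configured limits.
    • Applications: monitor the performance and availability of applications in the cluster by request rate, throughput, and error rate.
    • End-user experience: monitor mobile application and browser performance, and optimize load times and availability to improve customer satisfaction.
    • Supporting infrastructure: as noted earlier, the platform that K8s runs on is also very important.
  • Operational details: all core K8s components (namely kubelet, Kube controller manager, and Kube scheduler) have many flags. These flags determine how the cluster operates and runs. Their initial defaults are generally small and suited to smaller clusters. As the cluster grows, users need to adjust these settings in good time and also monitor details such as K8s labels and annotations.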
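
As an illustration of the container-level check in the list above (consumption measured against configured limits), the following sketch converts K8s-style resource quantities such as `500m` and `128Mi` into plain numbers and flags resources that exceed a share of their limit. The function names, the 80% threshold, and the sample usage figures are assumptions made for this example, and only a subset of quantity suffixes is handled.

```python
from decimal import Decimal

# Subset of the suffixes used in K8s resource quantities.
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
_DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}


def parse_cpu(quantity):
    """Convert a CPU quantity such as "500m" or "2" into cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory(quantity):
    """Convert a memory quantity such as "128Mi" or "1G" into bytes."""
    for suffix, factor in {**_BINARY, **_DECIMAL}.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * factor)
    return int(quantity)


def over_threshold(usage, limits, threshold=0.8):
    """Return each resource whose usage exceeds threshold * limit."""
    ratios = {
        "cpu": usage["cpu_cores"] / parse_cpu(limits["cpu"]),
        "memory": usage["memory_bytes"] / parse_memory(limits["memory"]),
    }
    return {name: round(r, 2) for name, r in ratios.items() if r > threshold}


# Limits in the form used under resources.limits in a container spec.
limits = {"cpu": "500m", "memory": "128Mi"}
# Hypothetical usage sample for one container.
usage = {"cpu_cores": 0.2, "memory_bytes": 120 * 2**20}

alerts = over_threshold(usage, limits)
print(alerts)
```

Here the container uses 40% of its CPU limit and about 94% of its memory limit, so only memory is reported.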

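However, when a monitoring tool pulls large volumes of data from K8s, it can degrade cluster performance or even cause the cluster to fail, so a monitoring baseline needs to be established. When a fault has to be diagnosed, the baseline can be raised as appropriate.

When raising the baseline, deploy more masters and nodes as well to improve availability. For large-scale deployments, a separate cluster dedicated to storing K8s data can be deployed, so that the performance of the main instance is not affected when monitoring events are created or monitoring data is retrieved.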

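Five, Monitoring K8s at the Source

Like many container orchestration platforms, K8s comes with basic server monitoring tools. Users can tune these tools to monitor K8s more effectively. The main tools are: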

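  • K8s dashboard: an add-on that shows resource utilization for each K8s cluster. It is also the main tool for managing and interacting with resources and environments.
  • Container probes: diagnostic tools for checking container health.
  • Kubelet: a Kubelet runs on every Node and monitors the containers running there. The Kubelet is also the channel through which the Master communicates with each Node. It directly exposes the per-container usage metrics collected by cAdvisor.
  • cAdvisor: an open source single-node agent that monitors container resource usage and performance, collecting memory, network usage, filesystem, and CPU data for all containers on a machine. cAdvisor is easy to use but has two drawbacks: first, it monitors only basic resource utilization and cannot analyze the actual performance of applications; second, it offers no long-term storage or trend analysis.
  • Kube-state-metrics: polls the Kubernetes API and converts structured information about Kubernetes objects into metrics.
  • Metrics server: periodically collects metric data from the Kubelet Summary API and exposes it through the Metrics API.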
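
To show what "structured information converted into metrics" can look like in practice, the sketch below parses a hand-written sample in the Prometheus text exposition format and counts Pods by phase. `kube_pod_status_phase` is a metric exposed by kube-state-metrics; the Pod names and values are made up, and the regular expressions are a simplification that ignores escaped quotes and optional timestamps.

```python
import re
from collections import Counter

# Hand-written sample in the Prometheus text exposition format.
SAMPLE = """\
# HELP kube_pod_status_phase The pods current phase.
# TYPE kube_pod_status_phase gauge
kube_pod_status_phase{namespace="default",pod="web-1",phase="Running"} 1
kube_pod_status_phase{namespace="default",pod="web-1",phase="Pending"} 0
kube_pod_status_phase{namespace="default",pod="job-7",phase="Running"} 0
kube_pod_status_phase{namespace="default",pod="job-7",phase="Pending"} 1
"""

# metric_name{label="value",...} <number>
LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def pods_by_phase(text):
    """Count Pods per phase from kube_pod_status_phase samples."""
    counts = Counter()
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE.match(line)
        if match is None or match.group("name") != "kube_pod_status_phase":
            continue
        labels = dict(LABEL.findall(match.group("labels")))
        # The metric is 1 for the phase a Pod is in and 0 for the others.
        if float(match.group("value")) == 1:
            counts[labels["phase"]] += 1
    return dict(counts)


phase_counts = pods_by_phase(SAMPLE)
print(phase_counts)
```

In a real setup a monitoring system would scrape this text from the kube-state-metrics endpoint rather than from a string constant.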

The overall monitoring process is as follows:

  • cAdvisor is installed by default on all cluster nodes and collects container and node metrics.
  • Kubelet exposes these metrics through the kubelet API.
  • Metrics server discovers all available nodes, requests container and node usage data from the kubelet API on each of them, and exposes the aggregated metrics through the Kubernetes API.
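
The aggregation step at the end of this pipeline can be sketched as follows. The input dicts are hand-written stand-ins for what each node's kubelet might report. Their field names (`usageNanoCores`, `workingSetBytes`, `podRef`) are modeled on the kubelet Summary API, but the exact shape should be treated as an assumption, and the `aggregate` function is hypothetical rather than how Metrics server is actually implemented.

```python
# Hand-written per-node summaries; shape modeled on the kubelet Summary API.
summaries = [
    {
        "node": {
            "nodeName": "node-a",
            "cpu": {"usageNanoCores": 250_000_000},
            "memory": {"workingSetBytes": 1_500_000_000},
        },
        "pods": [
            {
                "podRef": {"name": "web-1", "namespace": "default"},
                "containers": [
                    {
                        "name": "web",
                        "cpu": {"usageNanoCores": 120_000_000},
                        "memory": {"workingSetBytes": 200_000_000},
                    },
                    {
                        "name": "sidecar",
                        "cpu": {"usageNanoCores": 30_000_000},
                        "memory": {"workingSetBytes": 50_000_000},
                    },
                ],
            }
        ],
    },
    {
        "node": {
            "nodeName": "node-b",
            "cpu": {"usageNanoCores": 500_000_000},
            "memory": {"workingSetBytes": 3_000_000_000},
        },
        "pods": [],
    },
]


def node_metrics(summary):
    """Extract node-level CPU (cores) and memory (bytes) usage."""
    node = summary["node"]
    return {
        "name": node["nodeName"],
        "cpu_cores": node["cpu"]["usageNanoCores"] / 1e9,
        "memory_bytes": node["memory"]["workingSetBytes"],
    }


def pod_metrics(summary):
    """Sum container usage into one row per Pod."""
    rows = []
    for pod in summary["pods"]:
        containers = pod["containers"]
        rows.append({
            "namespace": pod["podRef"]["namespace"],
            "name": pod["podRef"]["name"],
            "cpu_nanocores": sum(c["cpu"]["usageNanoCores"] for c in containers),
            "memory_bytes": sum(c["memory"]["workingSetBytes"] for c in containers),
        })
    return rows


def aggregate(all_summaries):
    """Combine per-node summaries into one cluster-wide view."""
    return {
        "nodes": [node_metrics(s) for s in all_summaries],
        "pods": [row for s in all_summaries for row in pod_metrics(s)],
    }


cluster = aggregate(summaries)
print(cluster)
```

The result is a single view listing every node and Pod with its current usage, which is roughly the kind of data that commands such as `kubectl top` present.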

Although these basic tools do not provide detailed application monitoring data, they help users understand the state of the underlying hosts and K8s nodes.

In general, K8s cluster administrators focus on global monitoring, while application developers focus on application-level monitoring. What both want, however, is a comprehensive system for collecting monitoring data while keeping costs as low as possible. In next week's article we will introduce two viable monitoring solutions: Prometheus and Sensu. Both provide comprehensive system-level monitoring data that helps developers track the performance of key K8s components, locate faults, and receive alerts.

Author: STEFAN THORPE

Translated from: Monitoring Kubernetes

First published on UAVStack Intelligent Operations and Maintenance

Origin blog.51cto.com/14159827/2439188