Kubernetes design

API design principles

For a cloud computing system, API design takes the leading position in the overall system design. As mentioned earlier in this article, whenever Kubernetes adds support for a new feature or introduces a new technology, it introduces corresponding API objects to support managing that feature. Understanding and mastering the API is therefore like seizing the pulse of the Kubernetes system. The design of the Kubernetes API follows these principles:

  1. All APIs should be declarative. As mentioned earlier, a declarative operation, unlike an imperative one, has a stable effect when repeated, which matters greatly in a distributed environment where data is easily lost or duplicated. Declarative operations are also easier for users to work with: they let the system hide implementation details from the user, and hiding those details preserves the freedom to keep optimizing the system in the future. A declarative API also implies that all API objects are nouns, such as Service and Volume; these nouns describe the target distributed object that the user expects to obtain.
  2. API objects should be complementary and composable. This encourages API objects to meet, as far as possible, the goal of object-oriented design, namely "high cohesion, loose coupling": decompose business concepts appropriately and make the decomposed objects as reusable as possible. Kubernetes, as a distributed system management platform, is itself a business system; its business is scheduling and managing containerized services.
  3. High-level APIs should be designed around the intent of the operation. Designing a good API has much in common with designing an application well with object-oriented methods: the high-level design must start from the business rather than prematurely from the technical implementation. The high-level API of Kubernetes must therefore start from the business of Kubernetes, that is, from the operational intent of scheduling and managing containers.
  4. Low-level APIs should be designed according to the control needs of the high-level APIs. Low-level APIs exist to be used by high-level APIs. To reduce redundancy and improve reusability, low-level API design should also start from requirements and resist being driven by the implementation technology.
  5. Avoid thin wrappers, and avoid internal hidden mechanisms that cannot be seen explicitly through the external API. A thin wrapper adds no new functionality but adds a dependency on the wrapped API. Hidden internal mechanisms make a system hard to maintain. For example, StatefulSet and ReplicaSet are two different kinds of Pod collections, so Kubernetes defines them as two different API objects rather than using a single ReplicaSet and distinguishing stateful from stateless through a special internal algorithm.
  6. The complexity of an API operation should be at most proportional to the number of objects. This is mainly a performance consideration: to keep the system from becoming too slow to use as it grows, the minimum requirement is that the complexity of an API operation must not exceed O(N), where N is the number of objects. Otherwise the system will not scale horizontally.
  7. The state of an API object must not depend on the state of a network connection. In a distributed environment, network disconnections happen frequently, so for API object state to cope with an unstable network, it cannot depend on the connection state.
  8. Avoid making operating mechanisms depend on global state, because keeping global state synchronized in a distributed system is very difficult.

Design principles of control mechanisms

  • Control logic should depend only on the current state. This keeps the distributed system stable and reliable. Local errors are common in distributed systems; if control logic depends only on the current state, it is easy to bring a temporarily failed system back to normal, because once the system is reset to a stable state, all the control logic can be trusted to run normally from there.
  • Assume that any error can occur, and design for fault tolerance. Local and temporary errors are high-probability events in a distributed system. Errors may come from physical failures, from failures of external systems, or from bugs in the system's own code. Relying on your own code being free of errors to guarantee stability is unrealistic, so fault tolerance must be designed in for every possible error.
  • Avoid complex state machines, and do not let control logic depend on internal state that cannot be monitored. The subsystems of a distributed system cannot be strictly synchronized through in-process mechanisms, so if the control logic of two subsystems affects each other, each subsystem must be able to access the state that affects its control logic. Otherwise the system effectively contains nondeterministic control logic.
  • Assume that any operation may be rejected by the object it targets, or even misinterpreted. Because distributed systems are complex and their subsystems are relatively independent, often built by different teams, you cannot expect every operation to be handled correctly by another subsystem. Make sure that when an error occurs, an operation-level error does not affect the stability of the system.
  • Every module should be able to recover automatically after an error. A distributed system cannot guarantee that its modules are always connected, so each module needs the ability to heal itself and must not crash just because it cannot reach other modules.
  • Every module should be able to degrade gracefully when necessary. Graceful degradation is a robustness requirement: the design and implementation must clearly separate basic functions from advanced functions and ensure that basic functions do not depend on advanced ones, so that a failure in an advanced function cannot bring down the whole module. A system built this way also makes it easier to add new advanced features quickly, without worrying that they will affect the existing basic functions.
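
The graceful degradation principle can be illustrated with a small sketch. It is not taken from Kubernetes; the function names (`basic_listing`, `serve`, `broken_rank`, `reverse_rank`) are hypothetical and only show a basic function that stays available when an advanced one fails.

```python
from typing import Callable, List


def basic_listing(items: List[str]) -> List[str]:
    """Basic function: must always work and has no advanced dependencies."""
    return sorted(items)


def serve(items: List[str],
          advanced_rank: Callable[[List[str]], List[str]]) -> List[str]:
    """Try the advanced feature; fall back to the basic one if it fails."""
    try:
        return advanced_rank(items)
    except Exception:
        # The advanced feature failed: degrade instead of crashing the module.
        return basic_listing(items)


def broken_rank(items: List[str]) -> List[str]:
    """Hypothetical advanced feature whose backend is down."""
    raise RuntimeError("ranking backend unavailable")


def reverse_rank(items: List[str]) -> List[str]:
    """Hypothetical advanced feature that works."""
    return sorted(items, reverse=True)
```

Because `basic_listing` never calls into the advanced path, a new advanced feature can be added or removed without touching it.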

Core technical concepts and API objects of Kubernetes

API objects are the units of management operation in a Kubernetes cluster. Whenever Kubernetes adds support for a new feature or introduces a new technology, it introduces corresponding API objects to support managing that feature. For example, the API object corresponding to the replica set feature is ReplicaSet (RS).

Each API object has three categories of attributes: metadata, the specification (spec), and the status. Metadata identifies the API object; every object has at least three pieces of metadata: namespace, name, and UID. In addition, labels are used to identify and match objects. For example, a user can use an env label to distinguish service deployment environments, with env=dev, env=testing, and env=production identifying the development, testing, and production services respectively. The spec describes the desired state that the user wants the Kubernetes cluster to reach; for example, through a Replication Controller the user can set the desired number of Pod replicas to 3. The status describes the state the system has actually reached; for example, the system may currently have only 2 Pod replicas. The Replication Controller's logic is then to start a new Pod automatically so that the number of replicas reaches 3.
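
The spec/status example above (3 desired replicas, 2 actual) can be sketched as a tiny reconcile step. This is a simplified model, not the real controller code; the `ReplicatedObject` class, its field names, and the Pod naming scheme are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ReplicatedObject:
    metadata: Dict[str, str]   # namespace, name, uid
    spec_replicas: int         # desired state (spec)
    status_pods: List[str] = field(default_factory=list)  # actual state (status)


def reconcile(obj: ReplicatedObject) -> List[str]:
    """Compare spec with status and return the actions taken to close the gap."""
    actions = []
    diff = obj.spec_replicas - len(obj.status_pods)
    for _ in range(max(diff, 0)):
        # Too few Pods: start new ones (illustrative naming only).
        name = f"{obj.metadata['name']}-{len(obj.status_pods)}"
        obj.status_pods.append(name)
        actions.append(f"start {name}")
    for _ in range(max(-diff, 0)):
        # Too many Pods: stop the excess.
        actions.append(f"stop {obj.status_pods.pop()}")
    return actions
```

Calling `reconcile` again once the status matches the spec returns an empty list, which is the behavior one would expect from logic that depends only on the current state.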

All configuration in Kubernetes is set through the spec of API objects; that is, users change the system by configuring its desired state. This is one of the important design ideas of Kubernetes: all operations are declarative rather than imperative. The benefit of declarative operations in a distributed system is stability: they tolerate lost requests and repeated execution. For example, an operation that sets the number of replicas to 3 gives the same result even if it runs several times, whereas an operation that adds 1 to the number of replicas is not declarative, and running it several times gives the wrong result.
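
The difference between the two kinds of operation can be shown in a few lines. The sketch below simulates a request that is delivered several times; all function names are hypothetical.

```python
def apply_declarative(state: dict, desired_replicas: int) -> dict:
    """Declarative: state the target; repeating the request changes nothing."""
    return {**state, "replicas": desired_replicas}


def apply_imperative(state: dict, delta: int) -> dict:
    """Imperative: state the change; every repeat shifts the result again."""
    return {**state, "replicas": state["replicas"] + delta}


def deliver(state: dict, operation, argument: int, times: int) -> dict:
    """Simulate a network that delivers the same request several times."""
    for _ in range(times):
        state = operation(state, argument)
    return state
```

Starting from 2 replicas, delivering "set replicas to 3" three times still ends at 3, while delivering "add 1" three times ends at 5 instead of the intended 3.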

Pod

Kubernetes has many technical concepts and many corresponding API objects. The most important and most basic of them is the Pod. A Pod is the smallest unit for deploying an application or service in a Kubernetes cluster, and it can contain multiple containers. The design idea behind the Pod is to let multiple containers share a network address and file system within one Pod, so that they can be combined into a service through simple and efficient means such as inter-process communication and file sharing. Support for multiple containers in a Pod is one of the most fundamental design ideas in K8s. For example, suppose you run a software repository for an operating system distribution: an Nginx container publishes the software, and another container synchronizes it from the upstream repository. The images for these two containers are unlikely to be developed by the same team, yet they work together to provide one microservice. In this case, each team develops and builds its own container image, and the containers are combined into one microservice at deployment time.

The Pod is the basis for all workload types in a Kubernetes cluster. You can think of a Pod as a small robot running in the cluster, with different types of workloads requiring different types of robots. Workloads in Kubernetes can currently be divided into long-running services, batch jobs, node daemons, and stateful applications. The corresponding controllers are Deployment, Job, DaemonSet, and StatefulSet, each of which is introduced later in this article.

Replication Controller (RC)

The RC is the earliest API object in Kubernetes for ensuring high availability of Pods. It monitors the running Pods to ensure that the specified number of Pod replicas is running in the cluster. The specified number can be one or more. If fewer than the specified number are running, the RC starts new Pod replicas; if more are running, the RC kills the excess replicas. Even when the specified number is 1, running a Pod through an RC is wiser than running the Pod directly, because the RC still provides its high-availability capability and ensures that one Pod is always running. The RC is one of the earlier technical concepts in Kubernetes and applies only to long-running workloads, such as controlling the small robots that provide a highly available web service.

Replica Set (RS)

The RS is the new generation of RC. It provides the same high-availability capability; the main difference is that the RS, as the later arrival, supports more kinds of matching patterns. ReplicaSet objects are generally not used on their own but serve as the desired-state parameter of a Deployment.
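
The "more kinds of matching patterns" refers to label selection. As a rough sketch, the code below combines equality matching with set-based expressions; the operator names follow the Kubernetes label selector operators (In, NotIn, Exists, DoesNotExist), while the function names and the dict layout of an expression are assumptions made for this example.

```python
from typing import Dict, List, Optional


def matches_expression(labels: Dict[str, str], key: str, operator: str,
                       values: Optional[List[str]] = None) -> bool:
    """Evaluate one set-based requirement against a Pod's labels."""
    values = values or []
    if operator == "In":
        return key in labels and labels[key] in values
    if operator == "NotIn":
        # A Pod without the key also satisfies NotIn.
        return key not in labels or labels[key] not in values
    if operator == "Exists":
        return key in labels
    if operator == "DoesNotExist":
        return key not in labels
    raise ValueError(f"unknown operator: {operator}")


def matches_selector(labels: Dict[str, str],
                     match_labels: Dict[str, str],
                     match_expressions: List[dict]) -> bool:
    """A Pod matches only if every equality and every expression holds."""
    equality_ok = all(labels.get(k) == v for k, v in match_labels.items())
    set_ok = all(matches_expression(labels, **e) for e in match_expressions)
    return equality_ok and set_ok
```

With labels `{"app": "web", "env": "dev"}`, a selector requiring `app=web` and `env In (dev, testing)` matches, which equality matching alone could only express with two separate selectors.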

Deployment

A Deployment represents an update operation that the user performs on the Kubernetes cluster. Deployment is an API object with a wider scope of application than the RS: it can create a new service, update a service, or perform a rolling upgrade of a service. A rolling upgrade is in fact a compound operation that creates a new RS, gradually increases the replica count of the new RS to the desired value, and decreases the replica count of the old RS to 0. Such a compound operation is hard to describe with a single RS, so the more general Deployment is used to describe it. Given the direction in which Kubernetes is developing, all long-running workloads will be managed through Deployments in the future.
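
The compound operation behind a rolling upgrade can be sketched as a sequence of replica counts for the old and new RS. This is a deliberately simplified model: the `step` parameter is hypothetical, and the real Deployment controller uses its own surge and availability settings rather than this fixed alternation.

```python
from typing import List, Tuple


def rolling_update(replicas: int, step: int = 1) -> List[Tuple[int, int]]:
    """Return the (old_rs, new_rs) replica counts after each scaling action."""
    if step < 1:
        raise ValueError("step must be at least 1")
    old, new = replicas, 0
    history = [(old, new)]
    while old > 0 or new < replicas:
        new = min(new + step, replicas)   # scale the new RS up first
        history.append((old, new))
        old = max(old - step, 0)          # then scale the old RS down
        history.append((old, new))
    return history
```

Scaling up before scaling down means the total number of Pods never drops below the desired count during the upgrade, at the cost of temporarily running extra Pods.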

Service

RC, RS, and Deployment only guarantee the number of Pods backing a microservice; they do not solve the problem of how to access the service. A Pod is just one running instance of a service. It may stop on one node at any time and be replaced by a new Pod with a new IP address on another node, so it cannot provide the service at a fixed IP address and port. Providing a service reliably requires service discovery and load balancing. Service discovery finds the backend instances that correspond to the service a client wants to access. In a K8s cluster, the object that clients access is the Service. Each Service corresponds to a virtual IP that is valid inside the cluster, and the service is accessed inside the cluster through this virtual IP. Load balancing for microservices in a Kubernetes cluster is implemented by kube-proxy, a load balancer internal to the cluster. It is a distributed proxy server, with one instance on every Kubernetes node. The advantage of this design is its scalability: the more nodes need to access services, the more kube-proxy instances provide load balancing, and the number of highly available nodes grows with them. By contrast, when load balancing is done with a server-side reverse proxy, the load balancing and high availability of the reverse proxy itself must also be solved.
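
The idea of a stable virtual IP in front of changing Pod addresses can be sketched as follows. The class name `ServiceProxy`, the addresses, and the round-robin policy are all assumptions for illustration; this is not how kube-proxy is implemented.

```python
import itertools
from typing import Dict, Iterator, List


class ServiceProxy:
    """Maps a stable virtual IP to a changing set of Pod endpoints."""

    def __init__(self) -> None:
        self._endpoints: Dict[str, List[str]] = {}
        self._cursors: Dict[str, Iterator[str]] = {}

    def set_endpoints(self, virtual_ip: str, pod_addresses: List[str]) -> None:
        """Replace the backends of a service, e.g. after a Pod was rescheduled."""
        self._endpoints[virtual_ip] = list(pod_addresses)
        self._cursors[virtual_ip] = itertools.cycle(self._endpoints[virtual_ip])

    def route(self, virtual_ip: str) -> str:
        """Pick the next backend for a request sent to the virtual IP."""
        if not self._endpoints.get(virtual_ip):
            raise LookupError(f"no endpoints for {virtual_ip}")
        return next(self._cursors[virtual_ip])
```

Clients only ever use the virtual IP; when a Pod moves to another node, `set_endpoints` is called with the new addresses and clients are unaffected.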

Job

Job is the Kubernetes API object that controls batch tasks. The main difference between batch workloads and long-running workloads is that a batch workload runs from start to finish, whereas a long-running workload runs forever unless the user stops it. The Pods managed by a Job exit automatically once the task completes successfully according to the user's settings. What counts as successful completion depends on the spec.completions policy: a single-Pod task is complete when one Pod succeeds; a fixed-completion-count task is complete when all N tasks succeed; and a work-queue task is marked successful based on global success as confirmed by the application.
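
The three completion policies can be summarized in a small decision function. This is only a sketch of the rules described above; the function name and the `queue_drained` flag, which stands in for the application's own confirmation of global success, are hypothetical.

```python
from typing import Optional


def job_succeeded(successful_pods: int,
                  completions: Optional[int] = 1,
                  queue_drained: bool = False) -> bool:
    """Decide whether a Job counts as successfully completed.

    completions=1    -> single-Pod task: one successful Pod is enough
    completions=N    -> fixed-completion-count task: N Pods must succeed
    completions=None -> work-queue task: the application confirms success
    """
    if completions is None:
        return queue_drained
    return successful_pods >= completions
```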

Node daemon service set (DaemonSet)

Long-running and batch services focus on the business application: some nodes may run several Pods of the same kind, while other nodes run none. Node daemon services instead focus on the nodes (physical or virtual machines) of the Kubernetes cluster and ensure that one such Pod runs on every node. The nodes may be all nodes in the cluster or a specific subset selected through nodeSelector. Typical node daemon services include storage, logging, and monitoring services that support the cluster on every node.
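
The placement rule "one Pod per selected node" can be sketched like this. The function name, the Pod naming, and the node labels are made up for the example; only the idea of filtering nodes by a nodeSelector comes from the text.

```python
from typing import Dict, List, Optional


def daemon_pods(nodes: Dict[str, Dict[str, str]],
                node_selector: Optional[Dict[str, str]] = None) -> List[str]:
    """Return one Pod name per node whose labels satisfy node_selector."""
    selector = node_selector or {}
    return [
        f"daemon-{name}"
        for name, labels in sorted(nodes.items())
        if all(labels.get(k) == v for k, v in selector.items())
    ]
```

Without a selector every node gets exactly one Pod; with `{"disk": "ssd"}` only the nodes carrying that label do.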

Stateful service set (StatefulSet)

Kubernetes released the alpha version of the PetSet feature in version 1.3, promoted it to beta in version 1.5 under the new name StatefulSet, and made it generally available (GA) in version 1.9. In cloud-native applications there are two groups of near-synonyms. The first group is stateless, cattle, nameless, and disposable; the second is stateful, pet, named, and non-disposable. RC and RS mainly control stateless services. The names of the Pods they control are set randomly; when a Pod fails it is discarded and a new Pod is started somewhere else, with a different name. The name and the location do not matter; only the total number of Pods matters. StatefulSet, by contrast, controls stateful services. The name of each Pod in a StatefulSet is determined in advance and cannot be changed. The point of the Pod name in a StatefulSet is not the humanistic one made in "Spirited Away"; rather, the name is what associates the Pod with its corresponding state.

Pods in an RC or RS generally mount no storage, or mount shared storage that holds state common to all Pods, so the Pods are as interchangeable as cattle (which does seem to imply a loss of individuality). Each Pod in a StatefulSet mounts its own dedicated storage. If a Pod fails, a Pod with the same name is started on another node and mounts the original Pod's storage so that it can continue serving with its previous state.
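
The link between a fixed Pod name and its dedicated storage can be sketched as a pair of mappings. The class `StatefulSetSketch` and the storage naming are illustrative assumptions; the sketch only shows that rescheduling a Pod changes its node but not its name or its storage.

```python
from typing import Dict


class StatefulSetSketch:
    def __init__(self, name: str, replicas: int) -> None:
        self.name = name
        # Pod name -> its dedicated storage; fixed for the lifetime of the set.
        self.storage: Dict[str, str] = {
            f"{name}-{i}": f"data-{name}-{i}" for i in range(replicas)
        }
        # Pod name -> node currently running it.
        self.placement: Dict[str, str] = {}

    def schedule(self, pod: str, node: str) -> str:
        """Start or restart a Pod on a node and return the storage it mounts."""
        if pod not in self.storage:
            raise KeyError(f"{pod} is not a member of {self.name}")
        self.placement[pod] = node
        return self.storage[pod]
```

If `db-0` fails on one node and is rescheduled on another, `schedule` returns the same storage both times, so the Pod resumes with its previous state.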

Workloads suited to StatefulSet include database services such as MySQL and PostgreSQL, cluster management services such as ZooKeeper and etcd, and other stateful services. Another typical use of StatefulSet is as a more stable and reliable mechanism for emulating virtual machines than ordinary containers. A traditional virtual machine is a stateful pet that operators must constantly maintain. When containers first became popular, we used them to emulate virtual machines and stored all state inside the container, which has proven to be very unsafe and unreliable. With StatefulSet, Pods can still achieve high availability by drifting to different nodes, and storage can achieve high reliability through external storage. What StatefulSet does is associate a specific Pod with specific storage to guarantee the continuity of state.

Cluster federation (Federation)

Kubernetes released the beta version of the Federation feature in version 1.3. In cloud computing, the scope of a service, from near to far, is typically: the same host (Host, Node), across hosts within the same availability zone (Availability Zone), across availability zones within the same region (Region), across regions within the same cloud service provider, and across cloud platforms. Kubernetes is designed around a single cluster within one region, because only within a region can network performance meet the Kubernetes requirements for scheduling and for connecting compute and storage. Cluster federation is designed to provide Kubernetes cluster services across regions and across service providers.

Each Kubernetes Federation has its own distributed storage, API Server, and Controller Manager. Users register member Kubernetes clusters with the Federation through the Federation API Server. When a user creates or changes an API object through the Federation API Server, the Federation API Server creates a corresponding API object in every registered member Kubernetes cluster. When serving requests, the Federation first load-balances among its member clusters; a request sent to a specific Kubernetes cluster is then load-balanced inside that cluster in the same way as when the cluster serves requests on its own. Load balancing between clusters is implemented through DNS-based load balancing.

Federation V1 was designed to minimize the impact on the existing Kubernetes cluster mechanisms. A member Kubernetes cluster does not need to know that a Kubernetes Federation sits above it, which means that none of the existing Kubernetes code or mechanisms need to change because of the Federation feature.

Federation V2 is currently under development. It preserves the existing Kubernetes API and adds a new API dedicated to Federation; the details can be found here.

Storage volumes (Volume)

Storage volumes in a Kubernetes cluster are somewhat similar to Docker storage volumes, but the scope of a Docker volume is a container, whereas the life cycle and scope of a Kubernetes volume is a Pod. A volume declared in a Pod is shared by all containers in that Pod. Kubernetes supports a great many volume types: storage on public cloud platforms, including AWS, Google, and Azure; distributed storage, including GlusterFS and Ceph; and the easier-to-use host-local directories emptyDir and hostPath, as well as NFS. Kubernetes also supports logical storage through the Persistent Volume Claim (PVC). With a PVC, the consumer of storage can ignore the actual backend storage technology (for example AWS, Google, GlusterFS, or Ceph), while the configuration of that technology is left to the storage administrator, who handles it through Persistent Volumes.

Persistent Volume (PV) and Persistent Volume Claim (PVC)

PV and PVC give a Kubernetes cluster a logical abstraction of storage, so that the logic of configuring a Pod can ignore the actual backend storage technology and leave that work to whoever configures the PV, namely the cluster administrator. The relationship between PV and PVC for storage is very similar to the relationship between Node and Pod for compute. PVs and Nodes are resource providers: they change as the cluster's infrastructure changes and are configured by the Kubernetes cluster administrator. PVCs and Pods are resource consumers: they change as the needs of business services change and are configured by the users of the Kubernetes cluster, that is, the service administrators.
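
The provider/consumer split can be sketched as a minimal binding step. This is an assumption-laden simplification: real binding considers more than capacity, and the function name, volume names, and "smallest volume that fits" policy are chosen only for illustration.

```python
from typing import Dict, Optional


def bind_claim(volumes: Dict[str, int], bound: Dict[str, str],
               claim: str, requested_gib: int) -> Optional[str]:
    """Bind a claim to the smallest free volume that is large enough.

    volumes: volume name -> capacity in GiB (set up by the administrator)
    bound:   claim name -> volume name (updated in place)
    """
    candidates = [
        (size, name) for name, size in volumes.items()
        if size >= requested_gib and name not in bound.values()
    ]
    if not candidates:
        return None  # no suitable volume: the claim stays pending
    _, chosen = min(candidates)
    bound[claim] = chosen
    return chosen
```

The user only states how much storage is needed; which backend actually provides the chosen volume is the administrator's concern.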

Node

The computing power of a Kubernetes cluster is provided by Nodes. The worker node was originally called a Minion and was later renamed Node. A Node in a Kubernetes cluster is the equivalent of a Slave node in a Mesos cluster: it is the worker host on which all Pods run, and it can be a physical machine or a virtual machine. Either way, the common characteristic of a worker host is that it runs kubelet to manage the containers running on the node.

Secret

A Secret is an object used to store and deliver sensitive information such as passwords, keys, and authentication credentials. The advantage of using a Secret is that it avoids writing sensitive information in plaintext in configuration files. Configuring and using services in a Kubernetes cluster inevitably involves sensitive information for login, authentication, and similar functions, such as the username and password for accessing AWS storage. To avoid writing this information in plaintext in every configuration file that needs it, you can store it in a Secret object and reference it from the configuration files through that Secret object. The benefits of this approach include clearer intent, less duplication, and less chance of exposure.
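
The "store once, reference everywhere" idea can be sketched with plain dictionaries. The helper names `make_secret` and `secret_ref` are hypothetical; the dictionary layout follows the general shape of a Secret and of a secretKeyRef reference, and the secret name and value are made up. Note that base64 is an encoding, not encryption.

```python
import base64
from typing import Dict


def make_secret(name: str, values: Dict[str, str]) -> dict:
    """Build a Secret-shaped dict; values are base64-encoded, not encrypted."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "data": {
            key: base64.b64encode(value.encode()).decode()
            for key, value in values.items()
        },
    }


def secret_ref(secret_name: str, key: str) -> dict:
    """What a Pod configuration carries instead of the plaintext value."""
    return {"secretKeyRef": {"name": secret_name, "key": key}}
```

A configuration that contains only `secret_ref("aws-creds", "password")` can be shared or duplicated without ever containing the password itself.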

User accounts (User Account) and service accounts (Service Account)

As the names suggest, a user account provides an identity for a person, while a service account provides an identity for a computer process or for a Pod running in the Kubernetes cluster. One difference between the two is their scope. A user account corresponds to a person's identity, which is independent of the namespace of any service, so user accounts span namespaces. A service account corresponds to the identity of a running program and is tied to a specific namespace.

Namespace

Namespaces provide virtual isolation within a Kubernetes cluster. A Kubernetes cluster starts with initial namespaces, among them the default namespace default and the system namespace kube-system. In addition, administrators can create new namespaces as needed.

RBAC access authorization

Kubernetes released the alpha version of the Role-based Access Control (RBAC) authorization mode in version 1.3. Compared with Attribute-based Access Control (ABAC), RBAC mainly introduces the abstractions of Role and RoleBinding. In ABAC, an access policy in a Kubernetes cluster can only be associated directly with a user. In RBAC, an access policy can be associated with a role, and a specific user is in turn associated with one or more roles. Like every other new feature, RBAC introduces new API objects and with them new conceptual abstractions, and these abstractions make cluster service management easier to extend and reuse.
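
The indirection that RBAC adds (user to role, role to policy) can be sketched as a lookup. The data layout here is a simplification chosen for the example, not the shape of the real Role and RoleBinding objects; the role, user, and resource names are made up.

```python
from typing import Dict, List, Set, Tuple

Rule = Tuple[str, str]  # (verb, resource)


def allowed(user: str, verb: str, resource: str,
            roles: Dict[str, Set[Rule]],
            bindings: Dict[str, List[str]]) -> bool:
    """A request is allowed if any role bound to the user grants it."""
    return any(
        (verb, resource) in roles.get(role, set())
        for role in bindings.get(user, [])
    )
```

Because the policy lives in the role, granting the same access to a second user only needs another binding, not a copy of the policy.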

Summary

From the system architecture, technical concepts, and design ideas of Kubernetes, we can see two core design ideas in the system: fault tolerance and extensibility. Fault tolerance is the foundation of the stability and safety of the Kubernetes system; extensibility is what keeps Kubernetes friendly to change, so that it can iterate quickly and add new features.

Leslie Lamport, the computer scientist who invented the Paxos distributed consensus algorithm, observed that a distributed system has two kinds of properties: safety and liveness. Safety keeps the system stable: the system does not crash, no business errors occur, and nothing bad happens. It is a strict constraint. Liveness lets the system provide functionality, improve performance, and become easier to use, so that within the time users can observe, the system does something good. It is best effort. The design ideas of Kubernetes coincide with Lamport's notions of safety and liveness. It is precisely because Kubernetes separates safety from liveness so well when introducing features and technologies that it can iterate so quickly and rapidly introduce new features such as RBAC, Federation, and PetSet.

Origin www.cnblogs.com/peteremperor/p/12177093.html