Services in Kubernetes

Overview

Pods in Kubernetes are ephemeral: they can be created and destroyed, and once destroyed they are not resurrected. ReplicationControllers create and destroy pods dynamically (for example, when scaling up or down, or during an update). Although each pod has its own IP address, that address is not guaranteed to be stable over time. This leads to a problem: if some pods (backends) in a Kubernetes cluster provide functionality to other pods (frontends), how do the frontends find the backends and stay connected to them?

Enter Services.

A Kubernetes service is an abstraction that defines a logical set of pods and a policy for accessing them, sometimes called a "micro-service". A service normally uses a label selector to pick out its set of pods (a later section explains when a selector is not required).

For example, imagine an image-processing backend running with three replicas, any of which can be replaced at any time. The frontend does not care which replica it talks to. Even if the pods that make up the backend change, the frontend does not need to know about it or keep track of the list of backends. Services decouple the frontend from the backend.

For Kubernetes-native applications, Kubernetes provides a simple Endpoints API. For non-native applications, Kubernetes provides a bridge based on a virtual IP (VIP) that routes traffic from the service to the backend pods.

Defining a service

In Kubernetes, a service is a REST object, just like a pod. Like other REST objects, a service is created by POSTing its definition to the apiserver. For example, suppose you have a set of pods that each expose port 9376 and carry the label "app=MyApp":

{
    "kind": "Service",
    "apiVersion": "v1",
    "metadata": {
        "name": "my-service"
    },
    "spec": {
        "selector": {
            "app": "MyApp"
        },
        "ports": [
            {
                "protocol": "TCP",
                "port": 80,
                "targetPort": 9376
            }
        ]
    }
}

The JSON above creates a service named "my-service" that targets port 9376 on every pod with the label "app=MyApp". The service is assigned an IP address (the cluster IP), which the service proxies use. The service's selector is evaluated continuously, and the matching pods are published in an Endpoints object that is also named "my-service".

Note that a service can direct traffic to any targetPort. By default, targetPort is set to the same value as the port field. More interestingly, targetPort can also be a string that refers to the name of a port in the backend pods. The actual port number behind that name can differ from pod to pod. This brings a lot of flexibility for deploying and upgrading services; for example, the next version of a backend can change the port number its pods expose without breaking clients.

Kubernetes services support the TCP and UDP protocols; the default is TCP.
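
To make the selector step concrete, here is a minimal Python sketch of how a label selector could be turned into a list of endpoints. It is an illustration only, not the actual endpoints controller: the pod records, their IPs, and the helper names `matches` and `build_endpoints` are all hypothetical.

```python
from typing import Dict, List


def matches(selector: Dict[str, str], labels: Dict[str, str]) -> bool:
    """A pod matches when it carries every key/value pair in the selector."""
    return all(labels.get(key) == value for key, value in selector.items())


def build_endpoints(selector: Dict[str, str], target_port: int,
                    pods: List[dict]) -> List[str]:
    """Return "ip:port" strings for every pod the selector matches."""
    return [
        f"{pod['ip']}:{target_port}"
        for pod in pods
        if matches(selector, pod["labels"])
    ]


# Hypothetical pods; the IPs and labels are made up for illustration.
pods = [
    {"ip": "10.244.1.5", "labels": {"app": "MyApp"}},
    {"ip": "10.244.2.7", "labels": {"app": "MyApp", "tier": "backend"}},
    {"ip": "10.244.3.9", "labels": {"app": "Other"}},
]

endpoints = build_endpoints({"app": "MyApp"}, 9376, pods)
print(endpoints)
```

Extra labels on a pod do not prevent a match; only the pairs named in the selector are compared.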
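
The named targetPort behaviour can be sketched in a few lines of Python. This is a simplified illustration under assumptions: the pod dictionaries, the port name "http-api", and the helper `resolve_target_port` are hypothetical and only mimic the idea of looking up a port by name in each pod.

```python
def resolve_target_port(target_port, pod):
    """Return the numeric port on `pod` for a numeric or named targetPort."""
    if isinstance(target_port, int):
        # A numeric targetPort is used as-is on every pod.
        return target_port
    for port in pod["ports"]:
        if port["name"] == target_port:
            return port["containerPort"]
    raise LookupError(f"pod {pod['name']} has no port named {target_port!r}")


# Two hypothetical versions of the same backend that listen on different ports
# but publish them under the same name.
old_pod = {"name": "backend-v1",
           "ports": [{"name": "http-api", "containerPort": 9376}]}
new_pod = {"name": "backend-v2",
           "ports": [{"name": "http-api", "containerPort": 8080}]}

resolved = [resolve_target_port("http-api", pod) for pod in (old_pod, new_pod)]
print(resolved)
```

Because the service refers to the name rather than the number, both versions can sit behind the same service during an upgrade.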

Services without selectors

A Kubernetes service usually abstracts access to pods, but it can also abstract other kinds of backends. For example:

  • You want to use an external database cluster in production but your own database in the test environment;
  • You want to point a service at a service in another namespace or in another cluster;
  • You are migrating a workload to Kubernetes and some of its backends still run outside Kubernetes.

In any of the above scenarios, you can use a service without specifying a selector:

{
    "kind": "Service",
    "apiVersion": "v1",
    "metadata": {
        "name": "my-service"
    },
    "spec": {
        "ports": [
            {
                "protocol": "TCP",
                "port": 80,
                "targetPort": 9376
            }
        ]
    }
}

Because this service has no selector, the corresponding Endpoints object is not created automatically. You therefore have to map the service to its endpoints by hand:

{
    "kind": "Endpoints",
    "apiVersion": "v1",
    "metadata": {
        "name": "my-service"
    },
    "subsets": [
        {
            "addresses": [
                { "ip": "1.2.3.4" }
            ],
            "ports": [
                { "port": 80 }
            ]
        }
    ]
}

Accessing a service without a selector works the same way as accessing one with a selector: traffic is routed to the endpoint you defined (1.2.3.4:80 in this example).

Virtual IPs and service proxies

Every node in a Kubernetes cluster runs a kube-proxy. For each service, kube-proxy opens a port on the local node, and any connection made to that port is proxied to one of the service's backend pods. Which backend is used depends on the service's SessionAffinity setting. Finally, kube-proxy installs iptables rules on the node so that traffic sent to the service's cluster IP and port is redirected to that local port and from there to a backend pod.

The end result is that any request to the service reaches an appropriate pod without the client needing to know anything about Kubernetes, services, or pods.

By default, the backend is chosen at random. To keep a client on the same pod, set service.spec.sessionAffinity to "ClientIP" (the default is "None"), so that the backend is selected based on the client's IP address.

In Kubernetes, a service is a "layer 3" (TCP/UDP over IP) construct; there is currently no kind of service that works specifically at layer 7 (HTTP).
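
The difference between the two sessionAffinity values can be sketched as follows. This is a simplified model, not kube-proxy's implementation: hashing the client IP with `zlib.crc32` is a stand-in chosen here to make the choice repeatable, and the function name `pick_backend` and the backend addresses are hypothetical.

```python
import random
import zlib


def pick_backend(backends, client_ip, session_affinity="None"):
    """Choose a backend for one connection."""
    if not backends:
        raise ValueError("service has no backends")
    if session_affinity == "ClientIP":
        # Assumption: a stable hash of the client IP stands in for whatever
        # bookkeeping the real proxy uses to remember a client's backend.
        index = zlib.crc32(client_ip.encode("ascii")) % len(backends)
        return backends[index]
    # Default ("None"): every connection gets a random backend.
    return random.choice(backends)


backends = ["10.244.1.5:9376", "10.244.2.7:9376", "10.244.3.9:9376"]

sticky = {pick_backend(backends, "203.0.113.7", "ClientIP") for _ in range(10)}
print(sticky)  # a set with exactly one backend in it
```

Note that with this hashing stand-in the mapping changes whenever the backend list changes; it only illustrates the "same client, same pod" idea.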

Multi-Port Services

Many services need to expose more than one port. When a service defines multiple ports, every port must be given a name, for example:

{
    "kind": "Service",
    "apiVersion": "v1",
    "metadata": {
        "name": "my-service"
    },
    "spec": {
        "selector": {
            "app": "MyApp"
        },
        "ports": [
            {
                "name": "http",
                "protocol": "TCP",
                "port": 80,
                "targetPort": 9376
            },
            {
                "name": "https",
                "protocol": "TCP",
                "port": 443,
                "targetPort": 9377
            }
        ]
    }
}

Choosing your own IP address

You can specify your own cluster IP for a service by setting the spec.clusterIP field. The address you choose must be a valid IP address and must fall within the service_cluster_ip_range configured for the apiserver. If the value does not meet these requirements, the apiserver returns HTTP status code 422.
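
The two checks described above (valid address, inside the configured range) can be sketched with Python's standard `ipaddress` module. The range "10.0.0.0/16" and the function name `is_valid_cluster_ip` are assumptions made for the example; the real range is whatever your cluster is configured with.

```python
import ipaddress


def is_valid_cluster_ip(cluster_ip: str, service_range: str) -> bool:
    """True when cluster_ip parses as an IP and lies inside service_range."""
    network = ipaddress.ip_network(service_range)
    try:
        address = ipaddress.ip_address(cluster_ip)
    except ValueError:
        # Not an IP address at all.
        return False
    return address in network


# Hypothetical service range; a False result corresponds to the case where
# the apiserver would answer with 422.
SERVICE_RANGE = "10.0.0.0/16"

print(is_valid_cluster_ip("10.0.171.239", SERVICE_RANGE))
print(is_valid_cluster_ip("192.168.1.1", SERVICE_RANGE))
print(is_valid_cluster_ip("not-an-ip", SERVICE_RANGE))
```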

Why not use round-robin DNS?

A question that comes up from time to time is why virtual IPs are used at all instead of plain round-robin DNS. There are several reasons:

  • DNS libraries have a long history of not respecting DNS TTLs and of caching the results of name lookups;
  • Many applications do a name lookup only once and cache the result;
  • Even if applications and DNS libraries re-resolved names properly, the load of every client re-resolving DNS over and over would be difficult to manage.

We try to discourage users from doing things that hurt themselves. That said, if enough users ask for it, we may offer it as an option.

Discovering services

For each running pod, the kubelet adds a set of environment variables for every existing service. It supports both Docker-links-compatible variables and the simpler {SVCNAME}_SERVICE_HOST and {SVCNAME}_SERVICE_PORT variables, where the service name is upper-cased and dashes are converted to underscores.

For example, a service called "redis-master" that exposes TCP port 6379 and has been assigned the cluster IP 10.0.0.11 produces the following environment variables:

REDIS_MASTER_SERVICE_HOST=10.0.0.11
REDIS_MASTER_SERVICE_PORT=6379
REDIS_MASTER_PORT=tcp://10.0.0.11:6379
REDIS_MASTER_PORT_6379_TCP=tcp://10.0.0.11:6379
REDIS_MASTER_PORT_6379_TCP_PROTO=tcp
REDIS_MASTER_PORT_6379_TCP_PORT=6379
REDIS_MASTER_PORT_6379_TCP_ADDR=10.0.0.11

This implies an ordering dependency: any service that a pod wants to use must be created before the pod itself, otherwise the service's environment variables will not be set in the pod. DNS does not have this limitation.
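
The naming pattern in the redis-master example can be reproduced with a short Python function. This is a sketch of the pattern visible in the listing above, not the kubelet's code; the function name `service_env` is hypothetical and it handles only a single port.

```python
def service_env(name: str, cluster_ip: str, port: int, protocol: str = "tcp"):
    """Build the environment variables for a single-port service."""
    # Upper-case the name and turn dashes into underscores.
    prefix = name.upper().replace("-", "_")
    url = f"{protocol}://{cluster_ip}:{port}"
    port_prefix = f"{prefix}_PORT_{port}_{protocol.upper()}"
    return {
        f"{prefix}_SERVICE_HOST": cluster_ip,
        f"{prefix}_SERVICE_PORT": str(port),
        f"{prefix}_PORT": url,
        port_prefix: url,
        f"{port_prefix}_PROTO": protocol,
        f"{port_prefix}_PORT": str(port),
        f"{port_prefix}_ADDR": cluster_ip,
    }


env = service_env("redis-master", "10.0.0.11", 6379)
for key, value in env.items():
    print(f"{key}={value}")
```

Inside a pod, a client would read these with `os.environ["REDIS_MASTER_SERVICE_HOST"]`, which only works if the service existed when the pod started.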

DNS

An optional (but highly recommended) cluster add-on is a DNS server. The DNS server watches the Kubernetes API for new services and creates a DNS record for each one. If DNS is enabled in the cluster, every pod can resolve service names automatically.

For example, if there is a service called "my-service" in the Kubernetes namespace "my-ns", a DNS record named "my-service.my-ns" is created for it. Pods in the "my-ns" namespace can find the service simply by looking up the name "my-service". Pods in other namespaces look up "my-service.my-ns" instead. The result of the lookup is the service's cluster IP.
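
The lookup rule for the DNS name can be written down as a tiny Python helper. This only encodes the naming rule described here; the function name `lookup_name` and the namespace "other-ns" are hypothetical, and no DNS query is made.

```python
def lookup_name(service: str, service_namespace: str, pod_namespace: str) -> str:
    """Return the name a pod should resolve to reach the service."""
    if pod_namespace == service_namespace:
        # Same namespace: the short service name is enough.
        return service
    # Different namespace: qualify the name with the service's namespace.
    return f"{service}.{service_namespace}"


print(lookup_name("my-service", "my-ns", "my-ns"))
print(lookup_name("my-service", "my-ns", "other-ns"))
```

Inside a cluster with DNS enabled, the returned name could be passed to `socket.gethostbyname` to obtain the cluster IP.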

Headless services

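Sometimes you do not need load balancing or a single cluster IP. In that case you can create a "headless" service by setting the spec.clusterIP field to "None". No IP address is allocated for such a service, and no environment variables are created for it in pods. Instead, DNS returns a set of A records for the service name that point directly at the backend pods. In addition, kube-proxy does not handle these services, so there is no load balancing and no proxying. The endpoints controller still creates the corresponding Endpoints object.

The purpose of this option is to let users reduce their coupling to the Kubernetes system, for example when they want to implement their own discovery mechanism. Applications can easily integrate with other discovery systems through the API.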
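
As a sketch, a headless service definition differs from the earlier "my-service" example only in its clusterIP field. The Python below builds such a definition and prints it as JSON; the helper name `headless_service` is hypothetical, and the name, selector, and ports are reused from the earlier example.

```python
import json


def headless_service(name, selector, port, target_port):
    """Build a service definition with clusterIP set to the string "None"."""
    return {
        "kind": "Service",
        "apiVersion": "v1",
        "metadata": {"name": name},
        "spec": {
            # The literal string "None" marks the service as headless.
            "clusterIP": "None",
            "selector": selector,
            "ports": [
                {"protocol": "TCP", "port": port, "targetPort": target_port}
            ],
        },
    }


manifest = headless_service("my-service", {"app": "MyApp"}, 80, 9376)
print(json.dumps(manifest, indent=4))
```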

External services

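For some parts of your application (frontends, for example) you may want to expose a service on a public IP address. Kubernetes offers two ways to do this: NodePort and LoadBalancer.

Every service has a type field, which can take the following values:

  • ClusterIP: use a cluster-internal private IP only. This is the default.
  • NodePort: in addition to the cluster IP, expose the service's port on a designated port of every node; the port number is the same on every node.
  • LoadBalancer: in addition to the cluster IP and the NodePort, ask the cloud provider for a load balancer that forwards to the service.

Note: NodePort supports TCP and UDP, whereas LoadBalancer supports only TCP.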

Type = NodePort

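If you set the type field to "NodePort", the Kubernetes master allocates a node port for each port the service exposes. This port is opened on every node and must fall within the port range defined in the configuration (the --service-node-port-range flag). The allocated port is reported in the service's spec.ports[*].nodePort field. If you set this field yourself, the system either uses that value as the node port or tells you that the port is not valid.

This makes it easy for developers to set up their own load-balancing solution.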
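
The allocation rule can be sketched in Python as below. This is an illustration under assumptions, not the master's allocator: the range 30000-32767 is the usual default for --service-node-port-range, picking the lowest free port is a simplification chosen for the example, and the name `allocate_node_port` is hypothetical.

```python
def allocate_node_port(requested, in_use, port_range=(30000, 32767)):
    """Return a node port, honouring an explicit request when it is valid."""
    low, high = port_range
    if requested is not None:
        if not low <= requested <= high:
            raise ValueError(f"port {requested} is outside {low}-{high}")
        if requested in in_use:
            raise ValueError(f"port {requested} is already allocated")
        return requested
    # Assumption: take the lowest free port; the real strategy may differ.
    for candidate in range(low, high + 1):
        if candidate not in in_use:
            return candidate
    raise RuntimeError("no free node ports left in the range")


print(allocate_node_port(30061, in_use=set()))
print(allocate_node_port(None, in_use={30000, 30001}))
```

Requesting a port such as 80 raises `ValueError` here, which corresponds to the system telling you that the port is not valid.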

Type = LoadBalancer

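If your cluster runs on a cloud provider, setting the type field to "LoadBalancer" makes the service use the load balancer that the provider offers. The load balancer is created asynchronously, and details about the provisioned balancer are published in the service's status.loadBalancer field. For example: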

{
    "kind": "Service",
    "apiVersion": "v1",
    "metadata": {
        "name": "my-service"
    },
    "spec": {
        "selector": {
            "app": "MyApp"
        },
        "ports": [
            {
                "protocol": "TCP",
                "port": 80,
                "targetPort": 9376,
                "nodePort": 30061
            }
        ],
        "clusterIP": "10.0.171.239",
        "type": "LoadBalancer"
    },
    "status": {
        "loadBalancer": {
            "ingress": [
                {
                    "ip": "146.148.47.155"
                }
            ]
        }
    }
}

Traffic from the external load balancer is then directed to the backend pods.
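
Because the load balancer is created asynchronously, a client that reads the service object has to cope with the status not being filled in yet. A minimal Python sketch, assuming the service has been fetched as JSON in the shape shown above (the helper name `load_balancer_ips` is hypothetical):

```python
import json


def load_balancer_ips(service: dict) -> list:
    """Return the ingress IPs, or an empty list while provisioning is pending."""
    ingress = (
        service.get("status", {})
        .get("loadBalancer", {})
        .get("ingress", [])
    )
    return [entry["ip"] for entry in ingress if "ip" in entry]


# Trimmed copies of the service object: before and after provisioning.
pending = json.loads('{"kind": "Service", "status": {"loadBalancer": {}}}')
ready = json.loads(
    '{"kind": "Service", "status": {"loadBalancer": '
    '{"ingress": [{"ip": "146.148.47.155"}]}}}'
)

print(load_balancer_ips(pending))
print(load_balancer_ips(ready))
```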

Shortcomings

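Mapping through iptables and userspace works well at small to medium scale, but it is not suitable for clusters with thousands of services. See "the original design proposal for portals" for details.

With kube-proxy in the path, the source IP of a request is generally not visible, which makes some kinds of firewalling ineffective.

LoadBalancers only support TCP.

The type field is designed as a nested structure in which each level adds to the previous one. Not every cloud provider strictly needs this (for example, GCE does not need to allocate a NodePort to make LoadBalancer work, but AWS does), yet the current API requires it.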

Future work

The gory details of virtual IPs

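The information above should be enough for most users to work with services. There is, however, a lot more that is worth understanding in depth. (The rest of this section is left untranslated here; please refer to the original document.)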

Avoiding collisions

IPs and VIPs

https://segmentfault.com/a/1190000002892825
