On Linux, the sysctl interface allows an administrator to modify kernel parameters at runtime. The parameters are exposed through the virtual file system under /proc/sys/.
They cover a number of subsystems, such as:
Kernel (common prefix kernel.)
Networking (common prefix net.)
Virtual memory (common prefix vm.)
MDADM (common prefix dev.)
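As a minimal sketch of how a dotted sysctl name relates to a file under /proc/sys/, the Python helpers below translate a name into a path and read its value. The function names are hypothetical and introduced only for illustration. The mapping assumes that every dot is a path separator, which does not hold for names that embed a network interface name containing a dot.

```python
def sysctl_to_proc_path(name):
    """Map a dotted sysctl name to its file under /proc/sys/.

    Assumption: every '.' separates path components. Names that embed
    an interface name containing a dot are not handled by this sketch.
    """
    if not name or name.startswith(".") or name.endswith(".") or ".." in name:
        raise ValueError("not a valid dotted sysctl name: %r" % name)
    return "/proc/sys/" + name.replace(".", "/")


def read_sysctl(name):
    """Return the current value of a sysctl (only works on a Linux host)."""
    with open(sysctl_to_proc_path(name)) as f:
        return f.read().strip()
```

For example, sysctl_to_proc_path("kernel.shm_rmid_forced") yields the path /proc/sys/kernel/shm_rmid_forced.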
Enabling unsafe sysctls
Sysctls are divided into safe and unsafe sysctls. In addition to being properly namespaced, a safe sysctl must be isolated between pods on the same node. This means that setting a safe sysctl for one pod:
must not affect any other pod on the same node
must not endanger the health of the node
must not allow the pod to obtain CPU or memory resources beyond its resource limits
So far, most namespaced sysctls are not considered safe. Kubernetes supports the following sysctls in the safe set:
kernel.shm_rmid_forced
net.ipv4.ip_local_port_range
net.ipv4.tcp_syncookies
If the kubelet supports better isolation mechanisms in the future, this list of safe sysctls will be extended.
All safe sysctls are enabled by default.
All unsafe sysctls are disabled by default and must be enabled manually by the cluster administrator on each node. Pods that use unsafe sysctls that have not been enabled will still be scheduled, but will fail to start.
With the warning above in mind, a cluster administrator can enable specific unsafe sysctls in exceptional cases, such as real-time tuning or high-performance applications. Unsafe sysctls are enabled at the node level with a kubelet flag.
Because the setting is per node, to enable unsafe sysctls on several nodes you must apply it on each of those nodes separately:
kubelet --allowed-unsafe-sysctls \
'kernel.msg*,net.ipv4.route.min_pmtu' ...
For minikube, this can be configured through the extra-config flag:
minikube start --extra-config="kubelet.allowed-unsafe-sysctls=kernel.msg*,net.ipv4.route.min_pmtu"...
Only namespaced sysctls can be enabled this way.
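The value passed to --allowed-unsafe-sysctls is a comma-separated list in which an entry ending in * acts as a prefix pattern. The Python sketch below approximates that matching so you can check which sysctl names a given flag value would cover. It is an illustration under that assumption, not the kubelet's own code, and the function names are hypothetical.

```python
def matches(pattern, sysctl):
    """Return True if a sysctl name matches a plain name or a trailing-* pattern."""
    if pattern.endswith("*"):
        return sysctl.startswith(pattern[:-1])
    return sysctl == pattern


def parse_allowed_unsafe(flag_value):
    """Split the comma-separated flag value into individual patterns."""
    return [p.strip() for p in flag_value.split(",") if p.strip()]


def allowed_on_node(sysctl, flag_value):
    """Check whether the flag value would allow the given unsafe sysctl."""
    return any(matches(p, sysctl) for p in parse_allowed_unsafe(flag_value))


flag = "kernel.msg*,net.ipv4.route.min_pmtu"
for name in ("kernel.msgmax", "net.ipv4.route.min_pmtu", "kernel.sem"):
    print(name, allowed_on_node(name, flag))
```

With the flag value used above, kernel.msgmax and net.ipv4.route.min_pmtu are covered, while kernel.sem is not.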
Setting sysctls for a pod
A number of sysctls are namespaced. This means that they can be set independently for each pod on a node. Only namespaced sysctls can be configured through the pod securityContext.
The following sysctls are known to be namespaced. This list may change in future versions of the Linux kernel.
kernel.shm*
kernel.msg*
kernel.sem
fs.mqueue.*
net.*
Sysctls that are not namespaced are called node-level sysctls. If you need to set them, you must configure them manually in the operating system of each node, or use a DaemonSet with privileged containers.
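To make the distinction concrete, here is a small Python sketch that classifies a sysctl name as namespaced or node-level using the patterns listed above. The pattern list is copied from this article and may not match your kernel version. The function name is hypothetical, and vm.swappiness is used only as an example of a sysctl outside the list.

```python
from fnmatch import fnmatchcase

# Patterns taken from the list in this article; they may change between kernels.
NAMESPACED_PATTERNS = [
    "kernel.shm*",
    "kernel.msg*",
    "kernel.sem",
    "fs.mqueue.*",
    "net.*",
]


def classify(sysctl):
    """Return 'namespaced' if the name matches a known pattern, else 'node-level'."""
    if any(fnmatchcase(sysctl, pattern) for pattern in NAMESPACED_PATTERNS):
        return "namespaced"
    return "node-level"


for name in ("kernel.msgmax", "net.ipv4.tcp_syncookies", "vm.swappiness"):
    print(name, classify(name))
```

A name classified as node-level here cannot be set through the pod securityContext and has to be handled on the node itself.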
Use the pod's security context (securityContext) to configure namespaced sysctls. The security context applies to all containers in the pod.
The following example uses the pod securityContext to set a safe sysctl, kernel.shm_rmid_forced, and two unsafe sysctls, net.ipv4.route.min_pmtu and kernel.msgmax. The pod spec makes no distinction between safe and unsafe sysctls.
In a production environment, set a sysctl only if you understand its effect, to avoid destabilizing the system.
apiVersion: v1
kind: Pod
metadata:
  name: sysctl-example
spec:
  securityContext:
    sysctls:
    - name: kernel.shm_rmid_forced
      value: "0"
    - name: net.ipv4.route.min_pmtu
      value: "552"
    - name: kernel.msgmax
      value: "65536"
...
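Note that each sysctl value in the manifest is quoted, because the value field is a string even when it holds a number. If you generate manifests programmatically, a sketch like the following keeps that invariant. The helper names are hypothetical, and the manifest omits the containers section just as the example above elides it.

```python
import json


def sysctl_entries(settings):
    """Turn a {name: value} mapping into securityContext.sysctls entries.

    Values are converted to strings, as the manifest expects.
    """
    return [{"name": name, "value": str(value)} for name, value in settings.items()]


def pod_manifest(pod_name, settings):
    """Build a partial pod manifest; containers are intentionally left out."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {"securityContext": {"sysctls": sysctl_entries(settings)}},
    }


manifest = pod_manifest(
    "sysctl-example",
    {
        "kernel.shm_rmid_forced": 0,
        "net.ipv4.route.min_pmtu": 552,
        "kernel.msgmax": 65536,
    },
)
print(json.dumps(manifest, indent=2))
```

The printed JSON carries the same sysctl names and values as the YAML example, with every value serialized as a string.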
Because unsafe sysctls are unsafe by nature, you use them at your own risk. Possible consequences include abnormal pod behavior, resource shortages, or a complete node failure.
Pod security policy (PodSecurityPolicy)
You can further control which sysctls can be set in pods by specifying forbiddenSysctls and/or allowedUnsafeSysctls in the pod security policy. A sysctl pattern ending in *, such as kernel.*, matches all sysctls under that prefix.
Both forbiddenSysctls and allowedUnsafeSysctls are lists of plain sysctl names or sysctl patterns (ending in *). The string * on its own matches all sysctls.
The forbiddenSysctls field excludes specific sysctls. You can forbid any combination of safe and unsafe sysctls. To forbid setting any sysctl at all, use * on its own.
If you list an unsafe sysctl in the allowedUnsafeSysctls field and it does not appear in the forbiddenSysctls field, pods governed by this policy can use that sysctl. To allow all unsafe sysctls, use * on its own.
Warning: if you allow an unsafe sysctl through the allowedUnsafeSysctls field of a pod security policy, any pod that uses it will still fail to start unless the sysctl is also allowed at the node level through the kubelet flag --allowed-unsafe-sysctls.
The following example allows unsafe sysctls that begin with kernel.msg to be set, but forbids setting kernel.shm_rmid_forced:
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: sysctl-psp
spec:
  allowedUnsafeSysctls:
  - kernel.msg*
  forbiddenSysctls:
  - kernel.shm_rmid_forced
...
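The interaction between the two fields can be sketched as a small decision function in Python. It assumes, based on the description above, that forbiddenSysctls is checked first, that the safe sysctls listed in this article are allowed unless forbidden, and that unsafe sysctls need a matching allowedUnsafeSysctls entry. This is an illustration, not the admission controller's actual code, and the names are hypothetical.

```python
# Safe set as listed in this article.
SAFE_SYSCTLS = {
    "kernel.shm_rmid_forced",
    "net.ipv4.ip_local_port_range",
    "net.ipv4.tcp_syncookies",
}


def _matches(pattern, name):
    """Plain names match exactly; a trailing * makes the entry a prefix pattern."""
    if pattern.endswith("*"):
        return name.startswith(pattern[:-1])
    return name == pattern


def psp_allows(name, allowed_unsafe, forbidden):
    """Decide whether a policy with the given two lists would permit a sysctl."""
    if any(_matches(p, name) for p in forbidden):
        return False  # forbidden entries win over everything else
    if name in SAFE_SYSCTLS:
        return True  # safe sysctls are allowed unless forbidden
    return any(_matches(p, name) for p in allowed_unsafe)


allowed = ["kernel.msg*"]
forbidden = ["kernel.shm_rmid_forced"]
for sysctl in ("kernel.msgmax", "kernel.shm_rmid_forced", "net.ipv4.route.min_pmtu"):
    print(sysctl, psp_allows(sysctl, allowed, forbidden))
```

Under the example policy, kernel.msgmax is permitted, kernel.shm_rmid_forced is rejected even though it is a safe sysctl, and net.ipv4.route.min_pmtu is rejected because no allowedUnsafeSysctls entry covers it.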