Kubernetes Road 1 - The Myth of Java Application Resource Limits

Abstract: As container technology matures, more and more enterprise customers choose Docker and Kubernetes as the foundation of their application platforms. In practice, however, many concrete problems remain. This article analyzes and solves a common problem with Heap size settings encountered when running Java applications in containers.



As container technology matures, more and more enterprise customers choose Docker and Kubernetes as the foundation of their application platforms. In practice, however, many concrete problems remain. This series of articles records some of the experience and best practices the Alibaba Cloud Container Service team has gathered while supporting customers. We also welcome you to contact us via email or DingTalk to share your thoughts and problems.

Question

Some users reported that although they had set resource limits on their containers, their Java application containers were still inexplicably killed by the OOM Killer at runtime.

A very common cause is that the container's resource limits and the JVM's heap size are not set consistently.
Let's take a Tomcat application as an example. The sample code and Kubernetes deployment files can be obtained from GitHub.

git clone https://github.com/denverdino/system-info
cd system-info

The following describes the definition of the Kubernetes Pod:
  1. The "app" container in the Pod is an init container, responsible for copying a JSP application into the "webapps" directory of the Tomcat container. Note: the JSP application index.jsp in the image displays JVM and system resource information.
  2. The "tomcat" container keeps running, and we limit its maximum memory usage to 256MB.
apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  initContainers:
  - image: registry.cn-hangzhou.aliyuncs.com/denverdino/system-info
    name: app
    imagePullPolicy: IfNotPresent
    command:
      - "cp"
      - "-r"
      - "/system-info"
      - "/app"
    volumeMounts:
    - mountPath: /app
      name: app-volume
  containers:
  - image: tomcat:9-jre8
    name: tomcat
    imagePullPolicy: IfNotPresent
    volumeMounts:
    - mountPath: /usr/local/tomcat/webapps
      name: app-volume
    ports:
    - containerPort: 8080
    resources:
      requests:
        memory: "256Mi"
        cpu: "500m"
      limits:
        memory: "256Mi"
        cpu: "500m"
  volumes:
  - name: app-volume
    emptyDir: {}


We execute the following commands to deploy and test the application

$ kubectl create -f test.yaml
pod "test" created
$ kubectl get pods test
NAME READY STATUS RESTARTS AGE
test 1/1 Running 0 28s
$ kubectl exec test curl http://localhost:8080/system-info/
...


We can see the system's CPU, memory, and other information in HTML format; we can also use the html2text command to convert it into plain text.
Note: this article's tests were run on a node with 2 CPU cores and 4GB of memory; the output will differ in other environments.

$ kubectl exec test curl http://localhost:8080/system-info/ | html2text

Java version     Oracle Corporation 1.8.0_162
Operating system Linux 4.9.64
Server           Apache Tomcat/9.0.6
Memory           Used 29 of 57 MB, Max 878 MB
Physica Memory   3951 MB
CPU Cores        2
                                          **** Memory MXBean ****
Heap Memory Usage     init = 65011712(63488K) used = 19873704(19407K) committed
                      = 65536000(64000K) max = 921174016(899584K)
Non-Heap Memory Usage init = 2555904(2496K) used = 32944912(32172K) committed =
                      33882112(33088K) max = -1(-1K)


We can see that the system memory visible in the container is 3951MB, and the maximum JVM Heap Size is 878MB. Wait, what?! Didn't we limit the container's resources to 256MB? If so, once the application's memory usage exceeds 256MB, the JVM will not yet have performed GC, and the JVM process will simply be killed by the system's OOM Killer.
The root of the problem is:
  ● If the Heap Size is not set explicitly, the JVM sets its default maximum heap size based on the memory size of the host environment.
  ● The Docker container uses cgroups to limit the resources available to a process, but the JVM inside the container still derives its defaults from the host's memory size and CPU core count, which leads to an incorrect JVM Heap calculation.
Similarly, the JVM's default number of GC and JIT compiler threads depends on the host's CPU core count. If we run multiple Java applications on one node, even with CPU limits set, application performance may still suffer from preemptive switching among GC threads.
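As a rough sanity check of the first point (the 1/4 fraction comes from the JVM's default heap ergonomics; the exact value varies by JVM build, which is why the observed 878MB is somewhat lower):

```shell
# Sketch of the JVM's default heap ergonomics: with no -Xmx set, the
# maximum heap defaults to roughly 1/4 of the visible physical memory.
# 3951 is the "Physica Memory" value reported inside the container above.
physical_mb=3951
default_max_heap_mb=$(( physical_mb / 4 ))
echo "$default_max_heap_mb"   # ~987 MB, the same order as the 878 MB observed
```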

Once we understand the root cause, the problem is simple to solve.

Solution Ideas

Enable cgroup Resource Awareness

The Java community has also paid attention to this problem, and automatic awareness of container resource limits is supported in Java SE 8u131+ and JDK 9: https://blogs.oracle.com/java-platform-group/java-se-support-for-docker-cpu-and-memory-limits

To use it, add the following parameters:

java -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap …


We add these parameters via the "JAVA_OPTS" environment variable in the tomcat container of the example above:
apiVersion: v1
kind: Pod
metadata:
  name: cgrouptest
spec:
  initContainers:
  - image: registry.cn-hangzhou.aliyuncs.com/denverdino/system-info
    name: app
    imagePullPolicy: IfNotPresent
    command:
      - "cp"
      - "-r"
      - "/system-info"
      - "/app"
    volumeMounts:
    - mountPath: /app
      name: app-volume
  containers:
  - image: tomcat:9-jre8
    name: tomcat
    imagePullPolicy: IfNotPresent
    env:
    - name: JAVA_OPTS
      value: "-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap"
    volumeMounts:
    - mountPath: /usr/local/tomcat/webapps
      name: app-volume
    ports:
    - containerPort: 8080
    resources:
      requests:
        memory: "256Mi"
        cpu: "500m"
      limits:
        memory: "256Mi"
        cpu: "500m"
  volumes:
  - name: app-volume
    emptyDir: {}


We deploy a new Pod and repeat the corresponding test:

$ kubectl create -f cgroup_test.yaml
pod "cgrouptest" created

$ kubectl exec cgrouptest curl http://localhost:8080/system-info/ | html2text

Java version     Oracle Corporation 1.8.0_162
Operating system Linux 4.9.64
Server           Apache Tomcat/9.0.6
Memory           Used 23 of 44 MB, Max 112 MB
Physica Memory   3951 MB
CPU Cores        2
                                          **** Memory MXBean ****
Heap Memory Usage     init = 8388608(8192K) used = 25280928(24688K) committed =
                      46661632(45568K) max = 117440512(114688K)
Non-Heap Memory Usage init = 2555904(2496K) used = 31970840(31221K) committed =
                      ... max = -1(-1K)


We see that the JVM's maximum heap size has become 112MB, which is very good: our application will no longer be OOM-killed so easily. But then another question arises: if we set the container's maximum memory limit to 256MB, why does the JVM set its maximum Heap to only 112MB?

This involves the details of JVM memory management. Memory consumption in the JVM falls into two categories: Heap and Non-Heap. Class metadata, JIT-compiled code, thread stacks, and the memory space required by GC all belong to Non-Heap memory, so the JVM reserves part of the cgroup resource limit for Non-Heap usage to ensure system stability. (In the example above, we can see that Non-Heap occupies nearly 32MB of memory after Tomcat starts.)
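A quick back-of-the-envelope calculation with the numbers from the test output above shows how much of the limit the JVM kept outside the heap:

```shell
# Of the 256MB cgroup limit, only 112MB was given to the heap; the rest
# is headroom for Non-Heap memory (Metaspace, code cache, thread stacks,
# GC data structures) and other process overhead.
container_limit_mb=256
max_heap_mb=112
echo $(( container_limit_mb - max_heap_mb ))   # 144 MB reserved outside the heap
```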
In the latest JDK 10, further optimizations and enhancements have been made for the JVM running in containers.

If the new JDK 8/9 features cannot be used, for example in old applications still running on JDK 6, we can instead use a script inside the container to read the container's cgroup resource limits and set the JVM Heap size accordingly. Starting with Docker 1.7, the container's cgroup information is mounted inside the container, so an application can read the memory and CPU settings from files such as /sys/fs/cgroup/memory/memory.limit_in_bytes and correctly set -Xmx, -XX:ParallelGCThreads, and other parameters in the container's startup command according to the cgroup configuration. There are corresponding examples and code in https://yq.aliyun.com/articles/18037, so this article will not repeat them.

Summary

This article analyzed a common Heap sizing problem when running Java applications in containers. Containers differ from virtual machines in that their resource limits are implemented with cgroups. If the processes inside a container are not aware of the cgroup limits, memory and CPU allocation can lead to resource conflicts and problems. We can take advantage of the JVM's new features and custom scripts to set resource limits correctly; this solves most resource-limit problems.
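To make the script-based approach mentioned in this article concrete, here is a minimal sketch of a startup wrapper. The compute_heap_mb helper name and the 25% Non-Heap reservation are illustrative assumptions, not taken from the referenced article; the path assumes cgroup v1, as named above:

```shell
#!/bin/sh
# Sketch of a container entrypoint that derives -Xmx from the cgroup v1
# memory limit before launching the JVM.
compute_heap_mb() {
  # Give the heap ~75% of the limit in MB, keeping ~25% for Non-Heap memory.
  echo $(( $1 / 1024 / 1024 * 3 / 4 ))
}

limit_file=/sys/fs/cgroup/memory/memory.limit_in_bytes
if [ -r "$limit_file" ]; then
  heap_mb=$(compute_heap_mb "$(cat "$limit_file")")
  JAVA_OPTS="$JAVA_OPTS -Xmx${heap_mb}m"
fi
# exec java $JAVA_OPTS ... would launch the real application here
```

For a 256MB limit (268435456 bytes) this yields -Xmx192m, leaving 64MB for Non-Heap usage.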














Another resource-limit problem in containerized applications is that some older monitoring tools and system commands such as free and top still report the host's CPU and memory when run inside a container, so monitoring tools running in containers cannot compute resource consumption correctly. A common community practice is to use lxcfs to make a container's resource visibility behave consistently with that of a virtual machine. A subsequent article will introduce its usage on Kubernetes.

Alibaba Cloud Kubernetes Service was among the first in the world to pass the Kubernetes conformance certification. It simplifies Kubernetes cluster lifecycle management, has built-in integration with Alibaba Cloud products, and further streamlines the Kubernetes developer experience, helping users focus on the value and innovation of cloud applications.

Original link: https://yq.aliyun.com/articles/562440?spm=a2c41.11181499.0.0
