Kubernetes—Pod and container logs

Logging architecture

Application logs allow you to understand the internal health of the application. Logs are very useful for debugging problems and monitoring cluster activity. Most modern applications have some kind of logging mechanism. Likewise, container engines are designed to support logging. The simplest and most widely adopted form of logging for containerized applications is to write to the standard output and standard error streams.

However, native functionality provided by container engines or runtimes is often not sufficient for a complete logging solution.

For example, you may want to access your application's logs when a container crashes, a Pod is evicted, or a node dies.

In a cluster, logs should have independent storage and have a lifecycle independent of that of a node, pod, or container. This concept is called cluster-level logging.

The cluster-level log architecture requires an independent backend for storing, analyzing, and querying logs. Kubernetes does not provide a native storage solution for log data. Instead, there are many off-the-shelf logging solutions that can be integrated into Kubernetes. The following sections describe how logs are processed and stored on nodes.

Pod and container logs

Kubernetes captures per-container logs from running pods.

This example uses a manifest for a Pod with one container that writes text to standard output once per second.


 debug/counter-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox:1.28
    args: [/bin/sh, -c,
            'i=0; while true; do echo "$i: $(date)"; i=$((i+1)); sleep 1; done']

To run this pod, execute the following command:

kubectl apply -f https://k8s.io/examples/debug/counter-pod.yaml

The output is:

pod/counter created

To fetch these logs, use the kubectl logs command, as follows:

kubectl logs counter

The output is similar to:

0: Fri Apr  1 11:42:23 UTC 2022
1: Fri Apr  1 11:42:24 UTC 2022
2: Fri Apr  1 11:42:25 UTC 2022

You can use kubectl logs --previous to retrieve logs from a previous instantiation of a container. If your Pod has multiple containers, specify which container's logs you want to access by appending the container name to the command with the -c flag, as follows:

kubectl logs counter -c count

How nodes handle container logs

Everything a containerized application writes to its stdout and stderr streams is handled and redirected by the container runtime. Different container runtimes do this in different ways; however, their integration with the kubelet is standardized as the CRI log format.

By default, if a container restarts, the kubelet keeps one terminated container along with its logs. If a Pod is evicted from the node, all corresponding containers along with their logs are also evicted.

The kubelet makes logs available to clients via a special feature of the Kubernetes API. The usual way to access these logs is to run kubectl logs.

Log rotation

Feature Status: Kubernetes v1.21 [stable]

You can configure the kubelet to automatically rotate logs.

If rotation is configured, the kubelet is responsible for rotating container logs and managing the log directory structure. The kubelet (using CRI) sends this information to the container runtime, which in turn writes the container logs to the given location.

You can configure two kubelet configuration options, containerLogMaxSize and containerLogMaxFiles, using the kubelet configuration file. These settings allow you to individually configure the maximum size of each log file and the maximum number of files allowed per container.
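For example, both settings can be placed in the kubelet configuration file; the values shown here are illustrative, not defaults:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Rotate each container log file once it reaches 10 MiB.
containerLogMaxSize: "10Mi"
# Keep at most 5 log files per container (current file plus rotated ones).
containerLogMaxFiles: 5
```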

When you run kubectl logs, as in the basic logging example, the kubelet on the node handles the request and reads directly from the log file. The kubelet returns the contents of that log file.

Note:

Only the contents of the latest log file are available through kubectl logs.

For example, if a Pod writes 40 MiB of logs and the kubelet rotates logs after 10 MiB, running kubectl logs returns at most 10 MiB of data.

System Component Logs

There are two types of system components: those that typically run in containers and those that directly participate in the running of containers. For example:

  • The kubelet and container runtime do not run in containers. The kubelet runs your containers, grouped together in Pods.
  • The Kubernetes scheduler, controller manager, and API server run within Pods (usually as static Pods). The etcd component runs in the control plane, most commonly also as a static Pod. If your cluster uses kube-proxy, it typically runs as a DaemonSet.

Log locations

The way the kubelet and container runtimes write logs depends on the OS used by the node:

  • Linux
  • Windows

On Linux nodes using systemd, the kubelet and container runtimes write to journald by default. You will use journalctl to read systemd logs; for example: journalctl -u kubelet.

If systemd is not present, the kubelet and container runtime will write to .log files in the /var/log directory. If you want to write logs elsewhere, you can run the kubelet indirectly via the helper tool kube-log-runner, and use that to redirect the kubelet logs to a directory of your choice.

The kubelet always directs your container runtime to write logs into a directory within /var/log/pods.
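For reference, the CRI log path for a container follows a predictable layout; the angle-bracket placeholders below stand in for the Pod's namespace, name, and UID, the container name, and the restart count:

```
/var/log/pods/<namespace>_<pod-name>_<pod-uid>/<container-name>/0.log
```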

For more information on kube-log-runner, read System Logs.

For Kubernetes cluster components that run in Pods, logs are written to files in the /var/log directory, bypassing the default logging mechanism (these components do not write to the systemd journal). You can use Kubernetes' storage mechanisms to map persistent storage into the container running the component.

Note:

If you deploy Kubernetes cluster components (such as the scheduler) that log to a volume shared from the parent node, you need to make sure those logs are rotated. Kubernetes does not manage this log rotation.

Your operating system may automatically implement some log rotation. For example, if you share the /var/log directory into a static Pod for a component, node-level log rotation treats files in that directory the same as files written by any component outside of Kubernetes.

Some deployment tools will take log rotation into account and automate it; others will leave this up to you.

Cluster-level log architecture

While Kubernetes does not provide a native solution for cluster-level logging, there are several common approaches you can consider. Here are some options:

  • Use a node-level logging agent running on each node.
  • Include a sidecar container dedicated to logging in the application's Pod.
  • Push logs directly from the application to the logging backend.

Use node-level logging agent

You can implement cluster-level logging by using a node-level logging agent on each node . A logging agent is a specialized tool for exposing logs or pushing logs to a backend. Typically, a logging agent is a container that has access to a directory containing log files for all application containers on that node.

Since the logging agent must run on every node, it is recommended to run the agent as a DaemonSet.
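As a rough sketch of this pattern, the DaemonSet below mounts the node's Pod log directory read-only into an agent container; the image name is an illustrative placeholder, and a real agent would also need its own configuration and a backend to ship logs to:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-logging-agent
spec:
  selector:
    matchLabels:
      name: node-logging-agent
  template:
    metadata:
      labels:
        name: node-logging-agent
    spec:
      containers:
      - name: agent
        image: example.com/logging-agent:latest  # placeholder image
        volumeMounts:
        - name: varlogpods
          mountPath: /var/log/pods
          readOnly: true
      volumes:
      # Expose the node's container log directory to the agent.
      - name: varlogpods
        hostPath:
          path: /var/log/pods
```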

Node-level logging creates only one agent on each node, and does not require modification of the application on the node.

Containers write data to standard output and standard error output, but the format is not uniform. Node-level agents collect these logs and forward them for aggregation.

Running a logging agent with a sidecar container

You can use sidecar containers in one of the following ways:

  • The sidecar container pipes application logs to its own standard output.
  • The sidecar container runs a logging agent that is configured to collect logs from the application container.

A sidecar container that streams data

With sidecar containers writing to their own stdout and stderr streams, you can take advantage of the kubelet and the logging agent already running on each node. The sidecar containers read logs from a file, a socket, or journald. Each sidecar container then prints the logs to its own stdout and stderr streams.

This approach allows you to separate the log streams from different parts of your application, some of which may lack support for writing to stdout or stderr. The logic behind redirecting logs is minimal, so it adds little overhead. Also, because stdout and stderr are handled by the kubelet, you can use built-in tooling such as kubectl logs.

For example, a container is running in a Pod, and the container writes to two different log files using two different formats. Here is the manifest for this Pod:


 admin/logging/two-files-counter-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox:1.28
    args:
    - /bin/sh
    - -c
    - >
      i=0;
      while true;
      do
        echo "$i: $(date)" >> /var/log/1.log;
        echo "$(date) INFO $i" >> /var/log/2.log;
        i=$((i+1));
        sleep 1;
      done      
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  volumes:
  - name: varlog
    emptyDir: {}

It is not recommended to write log entries with different formats to the same log stream, even if you manage to redirect both to the container's stdout stream. Instead, you can create two sidecar containers. Each sidecar container can tail a particular log file from a shared volume and redirect that file's contents to its own stdout stream.

Here is the manifest for the Pod running the two sidecar containers:


 admin/logging/two-files-counter-pod-streaming-sidecar.yaml

apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox:1.28
    args:
    - /bin/sh
    - -c
    - >
      i=0;
      while true;
      do
        echo "$i: $(date)" >> /var/log/1.log;
        echo "$(date) INFO $i" >> /var/log/2.log;
        i=$((i+1));
        sleep 1;
      done      
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  - name: count-log-1
    image: busybox:1.28
    args: [/bin/sh, -c, 'tail -n+1 -F /var/log/1.log']
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  - name: count-log-2
    image: busybox:1.28
    args: [/bin/sh, -c, 'tail -n+1 -F /var/log/2.log']
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  volumes:
  - name: varlog
    emptyDir: {}

Now when you run the pod, you can access each log stream individually by running:

kubectl logs counter count-log-1

The output is similar to:

0: Fri Apr  1 11:42:26 UTC 2022
1: Fri Apr  1 11:42:27 UTC 2022
2: Fri Apr  1 11:42:28 UTC 2022
...

kubectl logs counter count-log-2

The output is similar to:

Fri Apr  1 11:42:29 UTC 2022 INFO 0
Fri Apr  1 11:42:30 UTC 2022 INFO 1
Fri Apr  1 11:42:31 UTC 2022 INFO 2
...

If you have a node-level agent installed in your cluster, the above log streams are picked up automatically by the agent without any further configuration. If you wish, you can configure the agent to parse log lines based on the source container.

Note that despite low CPU and memory usage (on the order of a few millicores for CPU and a few megabytes for memory), writing logs to a file and then streaming them to stdout can double the storage required on the node. If you have an application that writes to a single file, it is recommended to set /dev/stdout as the destination rather than implement the streaming sidecar container approach.

Sidecar containers can also be used to rotate log files that the application itself cannot rotate. An example of this approach is a small container that runs logrotate periodically. However, it is more straightforward to use stdout and stderr directly, leaving rotation and retention policies to the kubelet.
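As a naive, illustrative stand-in for a real logrotate setup, the sidecar below (which could be added to the counter Pod's containers list) keeps one rotated copy of each shared log file once it grows past roughly 10 MiB:

```yaml
  # Illustrative rotation sidecar; a production setup would run
  # logrotate with a proper configuration instead.
  - name: log-rotator
    image: busybox:1.28
    args: [/bin/sh, -c,
      'while true; do
         for f in /var/log/*.log; do
           if [ "$(wc -c < "$f")" -gt 10485760 ]; then
             mv "$f" "$f.1" && : > "$f";
           fi;
         done;
         sleep 60;
       done']
    volumeMounts:
    - name: varlog
      mountPath: /var/log
```

This works with the counter example because the application reopens its log files on each append; an application that holds a file open would need copy-truncate semantics instead.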


A sidecar container with a logging agent

If a node-level logging agent is not flexible enough for your scenario, you can create a sidecar container with a separate logging agent, configuring the agent specifically to run with your application.

Note:

Using a logging agent in a sidecar container can lead to significant resource consumption. Moreover, you won't be able to access those logs using kubectl logs because they are not managed by the kubelet.

Below are two configuration files that can be used to implement a sidecar container with a logging agent. The first file contains the ConfigMap used to configure fluentd.

admin/logging/fluentd-sidecar-config.yaml 

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
data:
  fluentd.conf: |
    <source>
      type tail
      format none
      path /var/log/1.log
      pos_file /var/log/1.log.pos
      tag count.format1
    </source>

    <source>
      type tail
      format none
      path /var/log/2.log
      pos_file /var/log/2.log.pos
      tag count.format2
    </source>

    <match **>
      type google_cloud
    </match>    

illustrate:

You can replace fluentd in this example configuration with other logging agents to read data from other sources inside the application container.

The second manifest describes a Pod that runs the fluentd sidecar container. The Pod mounts a volume from which fluentd can pick up its configuration data.


 admin/logging/two-files-counter-pod-agent-sidecar.yaml

apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox:1.28
    args:
    - /bin/sh
    - -c
    - >
      i=0;
      while true;
      do
        echo "$i: $(date)" >> /var/log/1.log;
        echo "$(date) INFO $i" >> /var/log/2.log;
        i=$((i+1));
        sleep 1;
      done      
    volumeMounts:
    - name: varlog
      mountPath: /var/log
  - name: count-agent
    image: registry.k8s.io/fluentd-gcp:1.30
    env:
    - name: FLUENTD_ARGS
      value: -c /etc/fluentd-config/fluentd.conf
    volumeMounts:
    - name: varlog
      mountPath: /var/log
    - name: config-volume
      mountPath: /etc/fluentd-config
  volumes:
  - name: varlog
    emptyDir: {}
  - name: config-volume
    configMap:
      name: fluentd-config

Expose the log directory directly from the application

A cluster logging mechanism that directly exposes and pushes log data from individual applications is beyond the scope of Kubernetes.


Origin blog.csdn.net/leesinbad/article/details/131615059