Deploying ELK + Filebeat on k8s; integrating Spring Cloud with ELK + Filebeat + Kafka + Zipkin to trace and aggregate logs from multiple services into Elasticsearch

1. Purpose

Now in 2023, most Java web architectures are Spring Boot microservices, and a single front-end request may be completed by several different back-end services. Take the place-order function: the browser's JS call is forwarded to the Gateway service, then to the Spring Security authentication service, then to the order (business) service, then to the payment service, with delivery, customer tagging and other services involved later on.
Each service also runs multiple instances for load balancing. So if we want to follow the log trail of one request, we first have to find the right server IPs, locate the log files, and work out which instances the load balancer actually forwarded to. When it is a production problem and we need to locate the cause quickly, that is far too slow; we need a better solution.

2. Technology stack involved

  1. Basic architecture: Spring Cloud (Spring Boot + service discovery + gateway + load balancing, circuit breaking and the other Netflix components). I am currently using Spring Boot + Eureka + Gateway + Spring Security + OpenFeign + Spring Cloud Config, with Redis, Quartz, Kafka, MySQL and Elasticsearch as the business middleware.
  2. Log collection, processing and display: ELK
    • elasticsearch: stores massive amounts of JSON data and supports ad hoc queries
    • logstash: collects source data (TCP, file, Redis, MQ), formats it, and pushes it to Elasticsearch for storage
    • kibana: the official visual, interactive query (CRUD) tool for Elasticsearch
  3. Efficient, lightweight data collector: Filebeat. It watches log files, picks up new lines in real time, and can push them to Kafka.
  4. Kafka: receives the data from Filebeat and buffers it for Logstash to consume.
  5. Multi-service link tracing: Sleuth + Zipkin. No code intrusion: simply adding the dependency makes every printed log line carry a traceId and a spanId (a dependency snippet follows this list).
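
For reference, a minimal Maven snippet for pulling Sleuth/Zipkin into each service; the artifactId is the one used in the process description below, and the version is assumed to be managed by the project's Spring Cloud BOM:

<!-- Sketch: spring-cloud-starter-zipkin brings in Sleuth plus the Zipkin reporter;
     the version is assumed to come from the spring-cloud-dependencies BOM -->
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-zipkin</artifactId>
</dependency>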

3. Process

  1. The front-end JS sends an Ajax request to the back-end gateway service.

  2. The gateway service includes the Maven dependency <artifactId>spring-cloud-starter-zipkin</artifactId>, which automatically adds a traceId field and a spanId field to the current request headers. Both values are randomly generated, and when neither field is present in the incoming headers, traceId equals spanId: for example traceId=123a, spanId=123a are added to the headers, and both values are printed with every log line.

  3. The gateway then forwards the request to business service A according to the request path. The tracing in service A sees that a traceId is already present in the headers, so it only generates a new spanId, e.g. traceId=123a, spanId=231b, adds it to the headers, and again prints both values in its logs.

  4. Service A makes an RPC call to service B. The tracing in service B likewise finds the traceId in the headers and only generates a new spanId, e.g. traceId=123a, spanId=342h, adds it to the headers, and prints both values in its logs.

  5. After the call chain completes, the response is returned to the front end.

  6. All of these log lines are appended to the log file on each server. Filebeat watches the log files of every service, reads new lines, and pushes them to Kafka.

  7. New messages are produced under the Kafka topic, and Logstash, configured in advance, consumes them, processes the data and saves it to Elasticsearch. You may wonder why Filebeat does not push to Elasticsearch directly, or why the Spring Boot logging framework does not push to Elasticsearch directly through an appender:

    • Filebeat decouples log shipping from the application, is lightweight, and does not affect Spring Boot performance
    • Kafka absorbs large bursts of concurrent data and reduces the pressure on Logstash
    • Pushing through Logstash lets us process and format the source data before saving it to Elasticsearch, which makes the logs easier to query
  8. Once the logs are persisted in Elasticsearch, we can query them through Kibana; searching with the condition traceId=123a returns the complete log of the request (see the example entries after this list).
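
To make steps 2 to 4 concrete, here is roughly what the JSON log entries of one request could look like across the services. These are illustrative values only, trimmed to the main fields; the field names come from the logback encoder pattern shown in the next section, and the traceId/spanId values are the example ones above:

{"dateTime":"2023-01-10 10:00:00.101","message":"gateway forwarding /order/create","level":"INFO","traceId":"123a","spanId":"123a","service":"gateway"}
{"dateTime":"2023-01-10 10:00:00.120","message":"creating order","level":"INFO","traceId":"123a","spanId":"231b","service":"service-a"}
{"dateTime":"2023-01-10 10:00:00.150","message":"calling payment","level":"INFO","traceId":"123a","spanId":"342h","service":"service-b"}

All three lines share traceId=123a, so one Kibana search on that field reconstructs the whole call chain, while the spanId distinguishes the individual hops.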

4. Example configuration for Filebeat, Kafka and Logstash

I split this into two parts: some services are jars deployed on servers, and their logs are collected by Filebeat; the services running on my local laptop instead have an appender configured directly in logback.xml that outputs to Kafka, without Filebeat (a sketch of such a Kafka appender follows the logback file-appender configuration below).

  1. Add the dependency to the Java project's pom
<dependency>
    <groupId>net.logstash.logback</groupId>
    <artifactId>logstash-logback-encoder</artifactId>
    <version>6.6</version>
</dependency>
  2. logback-spring.xml output format configuration (partial content)
        <appender name="fileUserLog" class="ch.qos.logback.core.rolling.RollingFileAppender">
            <!-- Logs from our own packages only, so that the log of our application classes is easy to read on its own -->
            <File>${logdir}/user.${appname}.${serverport}.${KPHOSTNAME}.log</File>
            <!-- Rolling policy: roll by time (TimeBasedRollingPolicy) -->
            <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
                <!-- File path pattern: archive each day's log into its own file so the logs cannot fill the whole disk -->
                <FileNamePattern>${logdir}/history/user.${appname}.${serverport}.%d{yyyy-MM-dd}.${KPHOSTNAME}.log</FileNamePattern>
                <!-- Keep only the last 90 days of logs -->
                <maxHistory>90</maxHistory>
                <!-- Upper size limit for the archived logs; once it is reached, the oldest logs are deleted -->
                <totalSizeCap>1GB</totalSizeCap>
            </rollingPolicy>
            <!-- Encode each log event as JSON -->
            <encoder class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
                <providers>
                    <pattern>
                        <pattern>
                            {
                            "dateTime": "%d{yyyy-MM-dd HH:mm:ss.SSS}",
                            "message": "%message",
                            "stackTrace": "%exception",
                            "level": "%level",
                            "traceId": "%X{X-B3-TraceId:-}",
                            "spanId": "%X{X-B3-SpanId:-}",
                            "service": "${appname}",
                            "thread": "%thread",
                            "class": "%logger.%method[%line]"
                            }
                        </pattern>
                    </pattern>
                    <timestamp>
                        <timeZone>GMT+8</timeZone>
                    </timestamp>
                </providers>
            </encoder>
        </appender>
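
For the locally running services that skip Filebeat, the log events can go to Kafka straight from logback, as mentioned at the start of this section. The appender below is only a sketch: it assumes the third-party logback-kafka-appender library (com.github.danielwegener:logback-kafka-appender) is on the classpath, reuses the same JSON encoder idea, and points at the node101:30701 broker used elsewhere in this post.

        <!-- Sketch only: assumes the logback-kafka-appender library is available -->
        <appender name="kafkaAppender" class="com.github.danielwegener.logback.kafka.KafkaAppender">
            <!-- Reuse the same JSON pattern as fileUserLog so the message shape is identical -->
            <encoder class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
                <providers>
                    <pattern>
                        <pattern>
                            {
                            "dateTime": "%d{yyyy-MM-dd HH:mm:ss.SSS}",
                            "message": "%message",
                            "level": "%level",
                            "traceId": "%X{X-B3-TraceId:-}",
                            "spanId": "%X{X-B3-SpanId:-}",
                            "service": "${appname}"
                            }
                        </pattern>
                    </pattern>
                </providers>
            </encoder>
            <!-- Same topic that Logstash consumes -->
            <topic>logstash</topic>
            <keyingStrategy class="com.github.danielwegener.logback.kafka.keying.NoKeyKeyingStrategy"/>
            <deliveryStrategy class="com.github.danielwegener.logback.kafka.delivery.AsynchronousDeliveryStrategy"/>
            <producerConfig>bootstrap.servers=node101:30701</producerConfig>
        </appender>

Like fileUserLog, this appender still has to be referenced from a <root> or <logger> element to take effect.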

logstash.conf

input {
  # Consume the log topic that Filebeat (or the logback Kafka appender) writes to
  kafka {
    bootstrap_servers => "node101:30701"
    client_id => "logstash_kafka_consumer_id"
    group_id => "logstash_kafka_consumer_group"
    auto_offset_reset => "latest"
    consumer_threads => 1
    decorate_events => true
    topics => ["logstash"]
  }
}

filter {
  # Each Kafka record is one JSON log line; parse it into fields
  json {
    source => "message"
  }
}

output {
  # One daily index, e.g. logstash-2023.01.10
  elasticsearch {
    hosts => ["node101:30600"]
    index => "logstash-%{+YYYY.MM.dd}"
  }
}

The log files on each server then contain JSON entries like the ones above; what I want is to pull up the complete processing flow of a single request from them.
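
Once the entries are in Elasticsearch, the whole chain can be pulled up by traceId. A minimal sketch of such a query against the daily indices created by the logstash.conf above (the traceId value is the example one used earlier):

GET logstash-*/_search
{
  "query": { "match": { "traceId": "123a" } },
  "sort": [ { "@timestamp": "asc" } ]
}

In Kibana itself the equivalent is simply searching traceId:123a in Discover over a logstash-* index pattern.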

filebeat.yml

filebeat.modules:
filebeat.prospectors:
- type: log
  enabled: true
  paths:
    - /spring-boot-logs/*/user.*.log
  #include_lines: ["^ERR", "^WARN"]
  # multiline handles log records that span several lines, e.g. exception stack traces
  multiline:
    pattern: '^[[:space:]]'
    negate: false
    match: after

processors:
  - drop_fields:
      fields: ["metadata", "prospector", "offset", "beat", "source", "type"]

output.kafka:
  hosts: ["node101:30701"]
  topic: logstash
  codec.format:
    string: '%{[message]}' # send only the message field, i.e. the Kafka record is exactly the line as it appears in the Java log file

k8s yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: elk
  name: ssx-elk-dm
  namespace: ssx
spec:
  replicas: 1 
  selector: # label selector, works together with the labels below
    matchLabels: # select resources carrying the label app: elk
       app: elk
  template: # template for the Pods this Deployment creates
    metadata: # Pod metadata
      labels: # Pod labels; the selector above matches Pods carrying app: elk
        app: elk
    spec: # what the Pod should run
      hostAliases: # extra hosts entries for the Pod
      - ip: "192.168.0.101"
        hostnames:
        - "node101"
      - ip: "192.168.0.102"
        hostnames:
        - "node102"
      - ip: "192.168.0.103"
        hostnames:
        - "node103"
      containers: # containers in the Pod, same concept as Docker containers
      - name: ssx-elasticsearch6-c
        image: 9d77v/elasticsearch:6.2.4 # just pull this image directly
        ports:
        - containerPort: 9200  # expose the Elasticsearch HTTP port of this container
        - containerPort: 9300  # expose the Elasticsearch transport port of this container
        env:   # environment variables set before the container starts
        - name: discovery.type  # environment variable name
          value: "single-node" # run Elasticsearch as a single node; the value must be quoted
        volumeMounts:
        - mountPath: /usr/share/elasticsearch/data   # default data path inside the Elasticsearch container
          name: c-v-path-elasticsearch-data
        - mountPath: /usr/share/elasticsearch/logs   # default log path inside the Elasticsearch container
          name: c-v-path-elasticsearch-logs
        - mountPath: /usr/share/elasticsearch/.cache   # cache path inside the Elasticsearch container
          name: c-v-path-elasticsearch-cache
        - mountPath: /etc/localtime   # keep container time in sync with the host
          name: c-v-path-lt

      - name: ssx-kibana-c
        image: wangxiaopeng65/kibana:6.2.4 # just pull this image directly
        ports:
        - containerPort: 5601  # expose the Kibana port of this container
        env:   # environment variables set before the container starts
        - name: ELASTICSEARCH_URL  # environment variable name
          value: "http://localhost:9200" # Elasticsearch runs in the same Pod, so localhost works
        volumeMounts:
        - mountPath: /usr/share/kibana/data2   # not really used; kept while figuring out which mounts are needed
          name: c-v-path-kibana
        - mountPath: /etc/localtime   # keep container time in sync with the host
          name: c-v-path-lt

      - name: ssx-logstash-c
        image: docker.elastic.co/logstash/logstash:6.2.4 # just pull this image directly
        env:   # environment variables set before the container starts
        - name: "xpack.monitoring.enabled"  # disable X-Pack monitoring/authentication
          value: "false" # must be quoted so it is passed as a string
        args: ["-f","/myconf/logstash.conf"]
        volumeMounts:
        - mountPath: /myconf   # configuration
          name: c-v-path-logstash-conf
        - mountPath: /usr/share/logstash/data   # data
          name: c-v-path-logstash-data
        - mountPath: /etc/localtime   # keep container time in sync with the host
          name: c-v-path-lt

      - name: ssx-filebeat-c
        image: elastic/filebeat:6.2.4  # just pull this image directly
        env:   # environment variables set before the container starts
        volumeMounts:
        - mountPath: /usr/share/filebeat/filebeat.yml   # configuration
          name: c-v-path-filebeat-conf
        - mountPath: /usr/share/filebeat/data  # Filebeat registry data
          name: c-v-path-filebeat-data
        - mountPath: /spring-boot-logs  # the Spring Boot log files to be shipped
          name: c-v-path-filebeat-spring-logs
        - mountPath: /etc/localtime   # keep container time in sync with the host
          name: c-v-path-lt

      volumes:
      - name: c-v-path-elasticsearch-data # must match the volumeMounts name above; this is the host path, the path above is inside the container
        hostPath:
          path: /home/app/apps/k8s/for_docker_volume/elk/elasticsearch6/data  # create this path in advance and give it 777 permissions, otherwise the Pod cannot access it
      - name: c-v-path-elasticsearch-logs # must match the volumeMounts name above
        hostPath:
          path: /home/app/apps/k8s/for_docker_volume/elk/elasticsearch6/logs  # create this path in advance and give it 777 permissions, otherwise the Pod cannot access it
      - name: c-v-path-elasticsearch-cache # must match the volumeMounts name above
        hostPath:
          path: /home/app/apps/k8s/for_docker_volume/elk/elasticsearch6/.cache  # create this path in advance and give it 777 permissions, otherwise the Pod cannot access it
      - name: c-v-path-kibana # must match the volumeMounts name above
        hostPath:
          path: /home/app/apps/k8s/for_docker_volume/elk/kibana  # create this path in advance and give it 777 permissions, otherwise the Pod cannot access it
      - name: c-v-path-logstash-conf
        hostPath:
          path: /home/app/apps/k8s/for_docker_volume/elk/logstash/myconf
      - name: c-v-path-logstash-data
        hostPath:
          path: /home/app/apps/k8s/for_docker_volume/elk/logstash/data
      - name: c-v-path-lt
        hostPath:
          path: /etc/localtime   # time sync with the host
      - name: c-v-path-filebeat-conf
        hostPath:
          path: /home/app/apps/k8s/for_docker_volume/elk/filebeat/myconf/filebeat.yml
      - name: c-v-path-filebeat-data
        hostPath:
          path: /home/app/apps/k8s/for_docker_volume/elk/filebeat/data
      - name: c-v-path-filebeat-spring-logs
        hostPath:
          path: /home/ssx/appdata/ssx-log/docker-log

      nodeSelector: # pin this Pod to the node with the given label
        kubernetes.io/hostname: node101
---
apiVersion: v1
kind: Service
metadata:
  labels:
   app: elk
  name: ssx-elk-sv
  namespace: ssx
spec:
  ports:
  - port: 9000 # cluster-internal Service port (I don't yet see where this value is actually used)
    name: ssx-elk-last9200
    protocol: TCP
    targetPort: 9200 # Elasticsearch HTTP port exposed by the container, as declared in the Deployment above
    nodePort: 30600 # port exposed outside the cluster

  - port: 9010 # cluster-internal Service port (I don't yet see where this value is actually used)
    name: ssx-elk-last9300
    protocol: TCP
    targetPort: 9300 # Elasticsearch transport port exposed by the container, as declared in the Deployment above
    nodePort: 30601 # port exposed outside the cluster

  - port: 9011 # cluster-internal Service port (I don't yet see where this value is actually used)
    name: ssx-kibana
    protocol: TCP
    targetPort: 5601 # Kibana port exposed by the container, as declared in the Deployment above
    nodePort: 30602 # port exposed outside the cluster

  selector:
    app: elk
  type: NodePort 
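
Assuming the manifest above is saved as ssx-elk.yaml (the file name is my own choice), it can be applied and checked roughly like this:

kubectl apply -f ssx-elk.yaml
kubectl -n ssx get pods -o wide                            # the four containers should come up in one Pod on node101
kubectl -n ssx logs deploy/ssx-elk-dm -c ssx-logstash-c    # confirm Logstash connected to Kafka and Elasticsearch

After that, Kibana is reachable on http://node101:30602 and Elasticsearch on http://node101:30600, matching the nodePort values above.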




Origin: blog.csdn.net/weixin_48835367/article/details/128751984