Implementing an ELK + Kafka distributed log collection system based on Docker Compose

ELK Overview

ELK refers to three open source tools: Elasticsearch, Logstash and Kibana, which are often used together for real-time log analysis and data visualization. In log collection systems, ELK is commonly paired with Kafka as a message buffer.

1. Elasticsearch

Elasticsearch is an open source distributed search and analysis engine built on the Lucene library. It is designed to handle large-scale data sets, enabling fast full-text search, structured search, analysis and real-time data processing. Elasticsearch is highly scalable and reliable: it automatically handles data sharding and replication, and supports distributed search and aggregation operations.

2. Logstash

Logstash is an open source data collection and processing engine used to collect, process and transmit various types of data (such as logs, events, metrics, etc.) from multiple sources to Elasticsearch or other storage and analysis tools. Logstash supports multiple data input sources and output destinations, and can perform data conversion, standardization, filtering and enhancement to make the data consistent and structured.

3. Kibana

Kibana is an open source data visualization platform for creating and sharing real-time data visualization dashboards on Elasticsearch. It provides a rich set of visual components such as charts, tables, maps, and dashboards, allowing users to explore and analyze data in an intuitive way. Kibana also supports interactive querying and filtering, enabling you to quickly demonstrate and share insights from your data.

4. Kafka

Kafka serves as a data buffer queue here. As a message queue it decouples processing and improves scalability, and its ability to absorb traffic peaks lets key components withstand sudden bursts of requests without collapsing under the overload.

Build and configure

Use Docker Compose to deploy ELK (Elasticsearch, Logstash, Kibana) and Kafka for log collection

docker-compose.yml

Create the docker-compose.yml file and use the following services:

ZooKeeper: dependency service required by Kafka, listening on port 2181

Kafka: message queue used for log collection, listening on port 9092 and connected to ZooKeeper

Elasticsearch: stores and indexes the log data, listening on port 9200

Logstash: receives log data from Kafka and forwards it to Elasticsearch

Kibana: visualizes and searches the log data, listening on port 5601 and connected to Elasticsearch

vim docker-compose.yml

version: '3.7'
services:
  zookeeper:
    image: zookeeper:3.8
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      - ALLOW_ANONYMOUS_LOGIN=yes
    restart: always
  kafka:
    image: bitnami/kafka:3.3.2
    container_name: kafka1
    hostname: kafka
    volumes:
      - ./kafka_data:/bitnami/kafka  # grant the kafka_data directory write permission first: chmod 777 kafka_data
    ports:
      - "9092:9092"
    depends_on:
      - zookeeper
    environment:
      KAFKA_CFG_ZOOKEEPER_CONNECT: zookeeper:2181  # ZooKeeper address that Kafka connects to
      KAFKA_ENABLE_KRAFT: "no"  # whether to use KRaft (Kafka's built-in replacement for ZooKeeper); default: yes
      KAFKA_CFG_LISTENERS: PLAINTEXT://:9092  # broker socket listeners; default: PLAINTEXT://:9092,CONTROLLER://:9093
      KAFKA_CFG_ADVERTISED_LISTENERS: PLAINTEXT://192.168.30.30:9092  # address advertised to external clients (host IP and port); default: PLAINTEXT://:9092
      KAFKA_KRAFT_CLUSTER_ID: FDAF211E728140229F6FCDF4ADDC0B32  # cluster id used in KRaft mode; every broker in the cluster must be initialized with the same id (any UUID works)
      ALLOW_PLAINTEXT_LISTENER: "yes"  # allow the PLAINTEXT listener; default: false, not recommended in production
      KAFKA_HEAP_OPTS: -Xmx512M -Xms256M  # maximum and initial heap size of the broker
      KAFKA_BROKER_ID: 1  # broker.id, must be unique
    restart: always
  elasticsearch:
    image: elasticsearch:7.4.2
    container_name: elasticsearch
    hostname: elasticsearch
    volumes:
      - ./es_data:/usr/share/elasticsearch/data  # grant the es_data directory write permission first: chmod 777 es_data
    restart: always
    environment:
      - "discovery.type=single-node"
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ports:
      - "9200:9200"
  logstash:
    image: logstash:7.4.2
    container_name: logstash
    volumes:
      - ./logstash.conf:/usr/share/logstash/pipeline/logstash.conf
      - ./logstash.yml:/usr/share/logstash/config/logstash.yml
    depends_on:
      - elasticsearch
    environment:
      LS_JAVA_OPTS: "-Xmx256m -Xms128m"
      ELASTICSEARCH_HOST: "http://192.168.30.30:9200"
  kibana:
    image: kibana:7.4.2
    restart: always
    container_name: kibana1
    ports:
      - "5601:5601"
    environment:
      ELASTICSEARCH_HOSTS: "http://192.168.30.30:9200"
    depends_on:
      - elasticsearch

Configure log collection rules

Create Kafka's data storage directory and grant it write permission:

mkdir kafka_data
chmod 777 kafka_data

Create the Elasticsearch data storage directory and grant it write permission:

mkdir es_data
chmod 777 es_data

Create the logstash.yml configuration file and point it at your Elasticsearch address:

http.host: "0.0.0.0"
xpack.monitoring.elasticsearch.hosts: [ "http://192.168.30.30:9200" ]

Create the logstash.conf configuration file and define log collection rules

input {
  kafka {
    bootstrap_servers => "192.168.30.30:9092"
    topics => ["user_logs"]
  }
}

filter {
}

output {
  elasticsearch {
    hosts => ["192.168.30.30:9200"]
    index => "user_logs"
  }
}
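The aspect shown later sends each log entry as a JSON string, and the pipeline above stores that string as a single message field. If you would rather have request_url, request_method and the other keys appear as separate fields in Elasticsearch, one option (not part of the original pipeline, shown only as a sketch) is to let the Kafka input parse the payload as JSON:

input {
  kafka {
    bootstrap_servers => "192.168.30.30:9092"
    topics => ["user_logs"]
    codec => "json"  # parse each Kafka message as JSON so its keys become document fields
  }
}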

Start the services

In the directory containing docker-compose.yml, run the following command to start the services:

docker-compose up -d
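Before wiring up the application, a quick sanity check is useful. The commands below assume the host IP 192.168.30.30 used throughout this post and the standard script location inside the bitnami/kafka image:

# all five containers should be listed as Up
docker-compose ps

# Elasticsearch should answer with its cluster information
curl http://192.168.30.30:9200

# list the topics known to the Kafka broker (user_logs appears once the first message arrives)
docker exec -it kafka1 /opt/bitnami/kafka/bin/kafka-topics.sh --bootstrap-server localhost:9092 --list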

After all containers are started, you can open Kibana at http://192.168.30.30:5601/ to start visualizing and querying the log data.

Simulate sending log messages

Create a local log queue, plus a worker thread that takes log entries from the queue and sends them to the message queue (Kafka) asynchronously.

Log sending queue

import java.util.concurrent.LinkedBlockingDeque;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Component;
import org.springframework.util.StringUtils;

@Component
public class LogDeque {

    /**
     * Local in-memory queue holding pending log messages
     */
    private static LinkedBlockingDeque<String> logMsgs = new LinkedBlockingDeque<>();

    @Autowired
    private KafkaTemplate<String, Object> kafkaTemplate;

    public void log(String msg) {
        logMsgs.offer(msg);
    }

    public LogDeque() {
        new LogThread().start();
    }

    /**
     * Worker thread that takes log entries from the queue and sends them to Kafka asynchronously
     */
    class LogThread extends Thread {

        @Override
        public void run() {
            while (true) {
                String msgLog = logMsgs.poll();
                if (!StringUtils.isEmpty(msgLog)) {
                    // send the message to the user_logs topic
                    kafkaTemplate.send("user_logs", msgLog);
                }
                // sleep briefly to avoid spinning the CPU when the queue is empty
                try {
                    Thread.sleep(200);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }
    }
}
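The poll-and-sleep loop above is simple, but it adds up to 200 ms of latency per message and wakes up even when the queue is empty. A possible variant, shown only as a sketch using the same logMsgs queue and kafkaTemplate, is to block on the queue instead:

    // blocking variant of LogThread.run(): take() waits until a message is available,
    // so no sleep/poll loop is needed
    @Override
    public void run() {
        while (true) {
            try {
                String msgLog = logMsgs.take();
                kafkaTemplate.send("user_logs", msgLog);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
    }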

Log aspect

import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Date;

import javax.servlet.http.HttpServletRequest;

import org.aspectj.lang.JoinPoint;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Before;
import org.aspectj.lang.annotation.Pointcut;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;
import org.springframework.web.context.request.RequestContextHolder;
import org.springframework.web.context.request.ServletRequestAttributes;

import com.alibaba.fastjson.JSONObject;

import lombok.extern.slf4j.Slf4j;

@Aspect
@Component
@Slf4j
public class AopLogAspect {

    @Autowired
    private LogDeque logDeque;

    /**
     * Declare a pointcut with an execution expression covering all controller methods
     */
    @Pointcut("execution(* cn.ybzy.demo.controller.*.*(..))")
    private void logAspect() {
    }

    /**
     * Log the request details before the target method is executed
     *
     * @param joinPoint
     */
    @Before(value = "logAspect()")
    public void methodBefore(JoinPoint joinPoint) {
        ServletRequestAttributes requestAttributes = (ServletRequestAttributes) RequestContextHolder.getRequestAttributes();
        HttpServletRequest request = requestAttributes.getRequest();

        JSONObject jsonObject = new JSONObject();
        // date format for the request timestamp
        SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        jsonObject.put("request_time", df.format(new Date()));
        jsonObject.put("request_url", request.getRequestURL().toString());
        jsonObject.put("request_ip", request.getRemoteAddr());
        jsonObject.put("request_method", request.getMethod());
        jsonObject.put("request_args", Arrays.toString(joinPoint.getArgs()));

        // hand the log entry over to the queue, which forwards it to Kafka
        String logMsg = jsonObject.toJSONString();
        log.info("<AOP log ===> message sent to MQ: {}>", logMsg);

        // enqueue the message
        logDeque.log(logMsg);
    }
}
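The two classes above assume a Spring Boot web project with Spring Kafka, Spring AOP, fastjson and Lombok on the classpath. The original post does not list its dependencies, so the following pom.xml fragment is only a guess based on the annotations and classes used (versions of Spring-managed artifacts are left to the Spring Boot parent):

<dependencies>
    <!-- web MVC: controller, HttpServletRequest -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- @Aspect / @Pointcut support -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-aop</artifactId>
    </dependency>
    <!-- KafkaTemplate and the spring.kafka.* configuration below -->
    <dependency>
        <groupId>org.springframework.kafka</groupId>
        <artifactId>spring-kafka</artifactId>
    </dependency>
    <!-- JSONObject / toJSONString used in the aspect -->
    <dependency>
        <groupId>com.alibaba</groupId>
        <artifactId>fastjson</artifactId>
        <version>1.2.83</version>
    </dependency>
    <!-- @Slf4j -->
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
    </dependency>
</dependencies>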

Configure application.yaml

server:
  port: 8888

spring:
  datasource:
    driver-class-name: com.mysql.jdbc.Driver
    url: jdbc:mysql://127.0.0.1:3306/demo?useUnicode=true&characterEncoding=utf-8&useSSL=false&allowMultiQueries=true&serverTimezone=UTC
    username: root
    password: 123456
  application:
    # service name
    name: elkk
  jackson:
    date-format: yyyy-MM-dd HH:mm:ss
  kafka:
    bootstrap-servers: 192.168.30.30:9092 # Kafka server address; for a cluster, list multiple addresses separated by commas
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.apache.kafka.common.serialization.StringSerializer
    consumer:
      group-id: default_consumer_group # consumer group ID
      enable-auto-commit: true
      auto-commit-interval: 1000
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer

Send log message

@Slf4j
@RestController
public class TestController {

    @RequestMapping("/test")
    public String test() {
        return "OK";
    }
}
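With the application running on port 8888 (as configured above), every request handled by a controller in cn.ybzy.demo.controller is intercepted by the aspect and pushed through Kafka into the user_logs index. For example:

# each call produces one log document in Elasticsearch
curl http://localhost:8888/test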

Use of Kibana

To use Kibana, you need to tell it which Elasticsearch indexes to explore by configuring one or more index patterns. An index pattern is the prerequisite for Kibana visualization: it tells Kibana which indexes to use as the data source for visual display.
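Before creating a pattern, you can confirm that Logstash has actually created the user_logs index, for example:

# user_logs should appear in the list once the first log message has been processed
curl "http://192.168.30.30:9200/_cat/indices?v"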

Create an index pattern

An index pattern identifies one or more Elasticsearch indexes that you want to explore through Kibana. Kibana looks for index names that match the specified pattern; an asterisk (*) in a pattern matches zero or more characters.

Open Management in the left menu, then click Index Patterns --> Create index pattern. As you type the index pattern name, Kibana automatically displays the matching indexes; then click Next step.


In the Configure settings step, select the time field of the index; this field is used to filter the data by time. Select @timestamp from the drop-down menu and click Create index pattern.


Search data in Discover

Click Discover to view the log data and search the logs.
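If the Kafka input parses the messages as JSON (see the optional codec variant earlier), the fields written by the aspect can be queried directly in the Discover search bar. Two example KQL queries:

request_method : "GET"
request_url : *test*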

Visualize data

Kibana comes with many visual components to facilitate visual display of aggregated results.

Select Visualize from the left menu, then click the + button on the right.

ELK+RabbitMQ

After building the ELK + Kafka log collection system, you can build an ELK + RabbitMQ log collection system in the same way. The implementations are similar; the following is for reference:

Send log message

@Autowired
private RabbitTemplate rabbitTemplate;

@Test
public void logs() {
    // sample payload; the original passes the message in as a parameter,
    // but a JUnit test method cannot take arguments, so a fixed string is used here
    String logMsg = "test log message";
    for (int i = 0; i < 500000; i++) {
        // exchange: elk_logs_exchange, routing key: user_logs
        rabbitTemplate.convertAndSend("elk_logs_exchange", "user_logs", logMsg);
        try {
            Thread.sleep(500);
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }
}

Configure log collection rules

Create the logstash.conf configuration file and define log collection rules

input {
  rabbitmq {
    host => "192.168.30.30"
    port => 5672
    user => "work"
    password => "12345678"
    vhost => "/"
    queue => "user_logs"
    durable => true
    exchange => "elk_logs_exchange"
    key => "user_logs"  # must match the routing key used by the producer above
    codec => "json"
  }
}

output {
  elasticsearch {
    hosts => ["192.168.30.30:9200"]
    index => "base_logs"
  }
}
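This variant assumes a RabbitMQ broker reachable at 192.168.30.30:5672 with a work / 12345678 account, which is not part of the Compose file above. A service along the following lines could be added under services: in docker-compose.yml (the image tag and management port are assumptions, not taken from the original post):

  rabbitmq:
    image: rabbitmq:3.11-management
    container_name: rabbitmq
    ports:
      - "5672:5672"    # AMQP port used by the producer and the Logstash input
      - "15672:15672"  # management UI
    environment:
      RABBITMQ_DEFAULT_USER: work
      RABBITMQ_DEFAULT_PASS: "12345678"
    restart: always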
