Spring Boot Collection (3): Integrating Kafka with Spring Boot | Installing and Configuring Zookeeper/Kafka | A Detailed Summary

Foreword

Before working through this chapter, make sure you have completed the following preparations:

1. Installed and started Zookeeper [ official website ]; if you need help, see the linked installation guide;

2. Installed and started Kafka [ official website ]; if you need help, see the linked installation guide.

Note: This article does not cover the installation and introduction of zk and Kafka in detail; please refer to the links above.




1. Preparation

1.1. Create a new Spring Boot 2.x Web project

1.1.1. Project creation steps

Make sure to check the required options when creating the project (the dependencies listed in the pom.xml below, such as Spring Web and Lombok).

1.1.2. Project directory layout

Note: After the project is created, first create the packages and Java files that the code in the following sections will use.

1.2. Add the spring-kafka related dependencies to pom.xml

Note: The dependencies fall into three groups: the ones the project generator has already configured automatically, the Kafka core and test dependencies, and other auxiliary dependencies (for example, lombok and fastjson).

 <?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.7.4</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.succ</groupId>
    <artifactId>SpringBootKafaka</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>SpringBootKafaka</name>
    <description>Demo project for Spring Boot</description>
    <properties>
        <java.version>1.8</java.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-devtools</artifactId>
            <scope>runtime</scope>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>

        <!-- Kafka -->
        <dependency>
            <groupId>org.springframework.kafka</groupId>
            <artifactId>spring-kafka</artifactId>
        </dependency>

        <dependency>
            <groupId>org.springframework.kafka</groupId>
            <artifactId>spring-kafka-test</artifactId>
            <scope>test</scope>
        </dependency>
        <!-- Alibaba fastjson -->
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>fastjson</artifactId>
            <version>1.2.58</version>
        </dependency>

    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>

</project>

1.3. Add the Kafka related configuration to application.yml

spring:
  kafka:
    # Kafka broker address. Kafka here is deployed on a virtual machine while the development environment is Windows,
    # so kafkahost is the VM's address; for a public address, use the public IP (for a cluster, separate the addresses with commas)
    bootstrap-servers: kafkahost:9092
    consumer:
      # consumer group id
      group-id: group_id
      auto-offset-reset: earliest
      # deserializers for the message key and value
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
    producer:
      # serializers for the message key and value
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.apache.kafka.common.serialization.StringSerializer

Note: The kafkahost alias needs to be configured separately (in the hosts file); if you need help with that, see the linked guide. Of course, you can also write the virtual machine's IP address here directly. Because the development environment is Windows and Kafka is deployed on a virtual machine, you must not write localhost (i.e. 127.0.0.1) here; otherwise you would be pointing at the Windows localhost and would never reach the Kafka broker on the virtual machine.

auto.offset.reset can take 3 values:

earliest: when a committed offset exists for a partition, consume from that offset; when there is no committed offset, consume from the beginning of the partition;

latest: when a committed offset exists for a partition, consume from that offset; when there is no committed offset, consume only data newly produced to that partition;

none: when a committed offset exists for every partition of the topic, consume from after those offsets; if any partition has no committed offset, an exception is thrown.

By default, earliest is recommended. With this setting, if an error occurs and Kafka (or the consumer) is restarted, any offsets that have not yet been consumed are still picked up and consumption can continue.

The latest setting makes it easy to lose messages: if something goes wrong while data is still being written to the topic, then after a restart this setting starts consuming from the latest offset, and whatever was written in between is skipped.

Note: For more detailed configuration options, see the Extension section at the bottom of this article.
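
For reference, the consumer block above is what Spring Boot's auto-configuration feeds into the Kafka consumer. A minimal sketch of the equivalent programmatic configuration (not needed for this project, shown only to make the mapping explicit; the broker address and group id are assumed to match the yml above):

import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;

public class ConsumerConfigSketch {

    public static DefaultKafkaConsumerFactory<String, String> consumerFactory() {
        Map<String, Object> props = new HashMap<>();
        // spring.kafka.bootstrap-servers
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "kafkahost:9092");
        // spring.kafka.consumer.group-id
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "group_id");
        // spring.kafka.consumer.auto-offset-reset
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        // spring.kafka.consumer.key-deserializer / value-deserializer
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        return new DefaultKafkaConsumerFactory<>(props);
    }
}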

2. Code writing

2.1. The Order entity bean

package com.succ.springbootkafaka.model;

import lombok.*;

import java.time.LocalDateTime;

/**
 * @create 2022-10-08 1:25
 * @describe Order JavaBean entity
 */
@Data
@Builder
@AllArgsConstructor
@NoArgsConstructor
public class Order {
    /**
     * order id
     */
    private long orderId;
    /**
     * order number
     */
    private String orderNum;
    /**
     * order creation time
     */
    private LocalDateTime createTime;
}
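
The producer in the next section serializes this entity with fastjson before sending it. A minimal sketch of what the payload looks like (the exact date format for createTime depends on the fastjson version, so treat the printed output as an illustration only):

import com.alibaba.fastjson.JSONObject;
import com.succ.springbootkafaka.model.Order;

import java.time.LocalDateTime;

public class OrderJsonDemo {
    public static void main(String[] args) {
        // build an order with the Lombok-generated builder
        Order order = Order.builder()
                .orderId(1L)
                .orderNum("6f9619ff-8b86-d011-b42d-00c04fc964ff")
                .createTime(LocalDateTime.now())
                .build();
        // serialize it the same way KafkaProvider does before sending;
        // prints something like {"createTime":"...","orderId":1,"orderNum":"6f9619ff-..."}
        System.out.println(JSONObject.toJSONString(order));
    }
}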

2.2. KafkaProvider (message producer)

package com.succ.springbootkafaka.provider;

import com.alibaba.fastjson.JSONObject;
import com.succ.springbootkafaka.model.Order;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.support.SendResult;
import org.springframework.stereotype.Component;
import org.springframework.util.concurrent.ListenableFuture;
import org.springframework.util.concurrent.ListenableFutureCallback;

import java.time.LocalDateTime;

/**
 * @create 2022-10-14 21:39
 * @describe Message producer: publishes Order messages to the Kafka topic (with default broker settings, the topic is created automatically on the first send)
 */
@Component
@Slf4j
public class KafkaProvider {
    /**
     * message TOPIC
     */
    private static final String TOPIC = "shopping";

    @Autowired
    private KafkaTemplate<String, String> kafkaTemplate;

    public void sendMessage(long orderId, String orderNum, LocalDateTime createTime) {
        // build an order object
        Order order = Order.builder()
                .orderId(orderId)
                .orderNum(orderNum)
                .createTime(createTime)
                .build();

        // send the message, using the order's JSON as the message body
        ListenableFuture<SendResult<String, String>> future =
                kafkaTemplate.send(TOPIC, JSONObject.toJSONString(order));

        // register a callback on the send result
        future.addCallback(new ListenableFutureCallback<SendResult<String, String>>() {
            @Override
            public void onFailure(Throwable throwable) {
                log.info("## Send message fail ...");
            }

            @Override
            public void onSuccess(SendResult<String, String> result) {
                log.info("## Send message success ...");
            }
        });
    }
}
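
In this demo the topic is created implicitly when the first message is sent (assuming the broker's default auto-topic-creation is enabled). If you want the topic created explicitly with a chosen number of partitions, spring-kafka also lets you declare it as a bean. A minimal sketch (the partition and replica counts below are arbitrary illustrations):

import org.apache.kafka.clients.admin.NewTopic;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.TopicBuilder;

@Configuration
public class KafkaTopicConfig {

    // NewTopic beans are picked up by the auto-configured KafkaAdmin,
    // which creates the topic on startup if it does not exist yet.
    @Bean
    public NewTopic shoppingTopic() {
        return TopicBuilder.name("shopping")   // same topic name as KafkaProvider.TOPIC
                .partitions(3)                 // illustrative values
                .replicas(1)
                .build();
    }
}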

2.3. KafkaConsumer (message consumer)

package com.succ.springbootkafaka.consumer;

import lombok.extern.slf4j.Slf4j;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

/**
 * @create 2022-10-08 1:25
 * @describe Consumes messages from the specified topic with the specified consumer group
 */
@Component
@Slf4j
public class KafkaConsumer {
    @KafkaListener(topics = "shopping", groupId = "group_id") // this groupId is the one configured in application.yml
    public void consumer(String message) {
        log.info("## consumer message: {}", message);
    }
}
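
Since the message body is the JSON produced by KafkaProvider, the consumer can also turn it back into an Order with fastjson. A minimal sketch of such a listener (an alternative to the one above, not part of the original project; depending on the fastjson version, the LocalDateTime field may need extra date-format configuration):

import com.alibaba.fastjson.JSONObject;
import com.succ.springbootkafaka.model.Order;
import lombok.extern.slf4j.Slf4j;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
@Slf4j
public class OrderKafkaConsumer {

    @KafkaListener(topics = "shopping", groupId = "group_id")
    public void consumeOrder(String message) {
        // parse the JSON message body back into the Order entity
        Order order = JSONObject.parseObject(message, Order.class);
        log.info("## consumed order, id={}, num={}", order.getOrderId(), order.getOrderNum());
    }
}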

3. Unit testing

3.1. Preparations

3.1.1. View the startup status of Zookeeper

Use the cd command to enter the zk installation directory

Start a single-node zk instance with the zookeeper-server-start.sh script in the bin directory:

bin/zookeeper-server-start.sh -daemon config/zookeeper.properties 

Since the basic zk configuration on this virtual machine is already in place, it can be started directly (if you need help installing zk, see the linked guide).

#zkServer.sh status   check the service status
#zkServer.sh start    start zk
#zkServer.sh stop     stop zk
#zkServer.sh restart  restart zk

As shown in the output above, zk is running in standalone mode (non-cluster) and has started successfully.

3.1.2. Start Kafka

Use the cd command to enter the bin directory under the Kafka installation (decompression) directory.

Start Kafka with the kafka-server-start.sh script (add the -daemon flag if you want it to run in the background):

pwd 
./kafka-server-start.sh  ../config/server.properties 

 

If everything is configured correctly, Kafka starts normally.

Reminder: if Kafka fails to start with an error complaining that the host name cannot be resolved, such as the one below, see the linked article for a fix:

java.net.UnknownHostException|unknown error at java.net.Inet6AddressImpl.lookupAllHost

3.1.3. Three ways to view the startup status of kafka

jps -ml                      # method 1: check with the jps command (the trailing -ml flags are optional)

netstat -nalpt | grep 9092   # method 2: check whether port 9092 is listening
lsof -i:9092                 # method 3

3.2. Unit test code

package com.succ.springbootkafaka;

import com.succ.springbootkafaka.provider.KafkaProvider;
import org.junit.jupiter.api.Test; // note: the JUnit that ships with spring-boot-starter-test is sufficient
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;

import java.time.LocalDateTime;
import java.util.UUID;
import java.util.concurrent.TimeUnit;

@SpringBootTest
class SpringBootKafakaApplicationTests {

    @Autowired
    private KafkaProvider kafkaProvider;

    @Test
    public void sendMessage() throws InterruptedException {
       // If this prints null, either zk or Kafka is not running properly (log in to the Linux machine and check their status),
       // or the Kafka address configured in your yml file is wrong, or an annotation was imported from the wrong package
        System.out.println("Is kafkaProvider null?? " + kafkaProvider);
        // send 1000 messages
        for (int i = 0; i < 1000; i++) {
            long orderId = i+1;
            String orderNum = UUID.randomUUID().toString();
            kafkaProvider.sendMessage(orderId, orderNum, LocalDateTime.now());
        }

        TimeUnit.MINUTES.sleep(1);
    }

}
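
The one-minute sleep at the end simply keeps the test JVM alive so that the asynchronous sends complete and the @KafkaListener running in the same application context has time to consume and log the messages. As an alternative sketch (not part of the original project), you could block on each send acknowledgement inside the same test class, assuming KafkaTemplate is injected directly:

    // Sketch of an alternative: inject KafkaTemplate and block on each send,
    // so the test does not rely only on a fixed one-minute sleep.
    @Autowired
    private KafkaTemplate<String, String> kafkaTemplate;

    @Test
    public void sendAndWaitForAck() throws Exception {
        for (int i = 0; i < 10; i++) {
            // get(...) blocks until the broker acknowledges the record (or the timeout expires)
            kafkaTemplate.send("shopping", "test-message-" + i).get(10, TimeUnit.SECONDS);
        }
        // still give the @KafkaListener in the same application context a moment to consume and log
        TimeUnit.SECONDS.sleep(5);
    }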

3.3. Test

3.3.1. Send 1000 messages and check that they can be published and consumed normally

Run the test; the console log shows the producer's send callbacks and the consumer's log of each consumed message.

3.3.2. Check the topic list of Kafka to see if the topic "shopping" is created normally

Execute the kafka-topics.sh script in the bin directory to view the topic list:

Note: if your Kafka version is 2.2 or higher, use the following command:

 bin/kafka-topics.sh --list --bootstrap-server kafkahost:9092

In the output you can see the topic shopping that was just created.

Note: if your Kafka version is lower than 2.2, use the following command instead (it connects through Zookeeper):

bin/kafka-topics.sh --list --zookeeper kafkahost:2181

The kafkahost alias above is configured in /etc/hosts (edit it with vim), and the IP can be obtained with the ifconfig command.
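
If you prefer to verify the topic from code rather than from the shell, the Kafka client that spring-kafka pulls in also includes an admin API. A minimal sketch that lists the topics (the broker address is assumed to be the same kafkahost:9092 as in application.yml):

import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;

public class TopicListSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "kafkahost:9092");
        // try-with-resources closes the AdminClient and its connections
        try (AdminClient admin = AdminClient.create(props)) {
            Set<String> topics = admin.listTopics().names().get();
            topics.forEach(System.out::println); // "shopping" should appear here
        }
    }
}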

 

So far, the test is successful!

4. Why start zk first, then start kafka

Because Kafka depends on Zookeeper to run: the broker stores its cluster metadata in zk, and the zookeeper.connect property in server.properties tells it which Zookeeper instance to connect to.
Specifically, you can check this in the config directory of the Kafka decompression directory:

cd /usr/src/kafka_2.13-3.3.1/config/ && ls
vi server.properties

 

For more Kafka tutorials, see the articles linked in the Notes section at the bottom.

5. Finishing up

1. Stop Zookeeper (in practice it is usually safer to stop Kafka first and Zookeeper afterwards)

zkServer.sh status
zkServer.sh stop
zkServer.sh status

  

2. Stop Kafka

cd /usr/src/kafka_2.13-3.3.1/ && ls
jps 
bin/kafka-server-stop.sh
jps

 

Note: on a fresh Kafka installation the shutdown command may not take effect, because the stop script fails to find the broker process. In that case you need to edit the script slightly, as follows:

vim bin/kafka-server-stop.sh 

Find the line below in the script and modify it as shown:

#PIDS=$(ps ax | grep ' kafka\.Kafka ' | grep java | grep -v grep | awk '{print $1}')
PIDS=$(jps -lm | grep -i 'kafka.Kafka' | awk '{print $1}')

The modified command uses jps -lm to list all Java processes, pipes the output through grep -i 'kafka.Kafka' to find the Kafka broker process, and extracts its PID with awk so the script can stop it.

 

Summary

Through a small example, this article gives an initial introduction to integrating Kafka with Spring Boot, covering both the producer side (publishing messages, and thereby creating the topic, in Kafka) and the consumer side (consuming the messages) from a Spring Boot application.

Of course, there is much more to Kafka than this; more in-depth topics will be covered in later articles.

Epilogue

Follow the path of those who came before, and mark the pitfalls for those who come after.

Bumps along the way are inevitable during the integration; fortunately, we are walking it together, and the pitfalls encountered are marked throughout the article.

If you found this article helpful, feel free to like and bookmark it!

Extension

A more detailed application.yml configuration

spring:
  kafka:
    bootstrap-servers: 172.101.203.33:9092
    producer:
      # number of times a message is re-sent after an error occurs
      retries: 0
      # when several messages need to be sent to the same partition, the producer puts them into the same batch;
      # this parameter sets the amount of memory, in bytes, that a batch may use
      batch-size: 16384
      # size of the producer's memory buffer
      buffer-memory: 33554432
      # serializer for the key
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      # serializer for the value
      value-serializer: org.apache.kafka.common.serialization.StringSerializer
      # acks=0 : the producer does not wait for any response from the server before considering the write successful
      # acks=1 : the producer receives a success response as soon as the partition leader has received the message
      # acks=all : the producer receives a success response only after all replicas involved in replication have received the message
      acks: 1
    consumer:
      # interval for automatic offset commits; in Spring Boot 2.x this is a Duration and must follow the format, e.g. 1S, 1M, 2H, 5D
      auto-commit-interval: 1S
      # what the consumer should do when reading a partition with no committed offset, or when the offset is invalid:
      # latest (default): start reading from the newest records (those produced after the consumer started)
      # earliest: start reading the partition from the beginning
      auto-offset-reset: earliest
      # whether to commit offsets automatically; the default is true. To avoid duplicates and data loss,
      # set it to false and commit offsets manually
      enable-auto-commit: false
      # deserializer for the key
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      # deserializer for the value
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
    listener:
      # number of threads running in the listener container
      concurrency: 5
      # the listener is responsible for acking; each call commits the offset immediately
      ack-mode: manual_immediate
      missing-topics-fatal: false
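
Note that with enable-auto-commit: false and ack-mode: manual_immediate, the listener itself has to acknowledge each record, otherwise offsets are never committed. A minimal sketch of a listener matching this configuration (the topic and group id are reused from the earlier example):

import lombok.extern.slf4j.Slf4j;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.support.Acknowledgment;
import org.springframework.stereotype.Component;

@Component
@Slf4j
public class ManualAckConsumer {

    @KafkaListener(topics = "shopping", groupId = "group_id")
    public void consume(String message, Acknowledgment ack) {
        log.info("## consumer message: {}", message);
        // with ack-mode: manual_immediate, this commits the offset right away;
        // if processing fails and acknowledge() is never called, the record can be re-consumed
        ack.acknowledge();
    }
}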

Notes

1. ZK/Zookeeper download and installation | Quickly building real/pseudo clusters | A detailed summary

2. Kafka collection (1): Kafka introduction and installation, summarized in detail


Origin blog.csdn.net/xp871038951/article/details/127353800