Big Data Practice No.12: Kafka Cluster Deployment and Use

Chapter 1 Introduction

As a high-throughput distributed publish-subscribe messaging system, Kafka occupies a very important position in today's big data ecosystem and plays a key role in peak-shaving and system-decoupling scenarios. Its natively distributed architecture scales out flexibly and handles large data throughput very well. Because of this importance, and because Kafka is needed in the cases of subsequent articles, the author introduces it in a chapter of its own here.

 

Chapter 2 Cluster Planning

The author deploys Kafka on only two nodes here.

In addition, because Kafka depends on ZooKeeper, please refer to the author's earlier article for the ZooKeeper deployment: Big Data Practice No.1-Zookeeper cluster construction

It is said that later versions of Kafka will gradually drop the ZooKeeper dependency, so keep an eye on this!

 

Host        Service
kafka120    Kafka
kafka121    Kafka

 

Chapter 3 Download and Unzip

Download directly from the official website: http://kafka.apache.org
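
For example, the 2.5.0 release used in this article can be fetched from the Apache archive (the URL below assumes the standard archive layout):

$ wget https://archive.apache.org/dist/kafka/2.5.0/kafka_2.12-2.5.0.tgz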

Unzip:

$ tar -zxvf kafka_2.12-2.5.0.tgz -C /opt/module/

 

Chapter 4 Modify Configuration

4.1 Modify environment variables

Modify /etc/profile and add the following

#KAFKA_HOME
export KAFKA_HOME=/opt/module/kafka_2.12-2.5.0
export PATH=$PATH:$KAFKA_HOME/bin

Then source it to apply the changes:

$ source /etc/profile
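
To confirm the variables took effect:

$ echo $KAFKA_HOME
/opt/module/kafka_2.12-2.5.0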

4.2 Create a data directory

Go to the Kafka installation directory and create a data directory. By default Kafka writes into the logs folder; it is recommended to keep the message data in a dedicated data directory, separate from the runtime log files in the default logs folder.

$ mkdir data
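
For completeness, the full sequence from any working directory (path taken from the unzip step above):

$ cd /opt/module/kafka_2.12-2.5.0
$ mkdir data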

4.3 Modify the configuration file

4.3.1 Modify server.properties

Modify the Kafka configuration file server.properties in the config directory as follows:

#Kafka broker ID (integer, unique within the cluster)
broker.id=0

#Allow topic deletion
delete.topic.enable=true

#Data (message log) storage path, pointing at the data directory created above
log.dirs=/opt/module/kafka_2.12-2.5.0/data

#Default number of partitions
num.partitions=2

#Data retention time (default: 7 days)
log.retention.hours=168

#Maximum segment size (default: 1 GB)
log.segment.bytes=1073741824

#ZooKeeper cluster address
zookeeper.connect=zookeeper110:2181,zookeeper111:2181,zookeeper112:2181
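
Remember that broker.id must be unique within the cluster, so on the second node (kafka121) the one line that has to differ is:

broker.id=1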

4.3.2 Modify consumer.properties

Modify the Kafka configuration file consumer.properties in the config directory as follows:

bootstrap.servers=kafka120:9092,kafka121:9092

Note: in old versions of Kafka (before 0.11), consumers were still expected to connect through ZooKeeper. In later versions this is discouraged and eventually disallowed: the consumer connects directly to the Kafka brokers, and offsets and related metadata are stored inside Kafka itself.
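
For reference, the shipped consumer.properties also carries a default consumer group entry; the stock value looks like this (adjust the group name as needed):

group.id=test-consumer-group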

 

Chapter 5 Starting/Stopping Kafka

5.1 Single-node start

$ bin/kafka-server-start.sh -daemon config/server.properties

Note: -daemon starts Kafka in the background; without it, Kafka runs in the foreground and occupies the terminal.
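
To verify that the broker started, check the Java process list; the broker shows up as a process named Kafka (the PID below is illustrative):

$ jps
12345 Kafka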

Kafka only ships with single-node start/stop scripts, so a cluster script has to be written by hand. A group script is provided below for reference.

5.2 Group Script

On kafka120 and kafka121, copy the environment variables into the current user's shell profile, so that the non-login shells opened by ssh can see them:

$ cat /etc/profile >> ~/.bashrc
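
A quick way to check that a non-interactive ssh shell now sees the variables:

$ ssh kafka121 'echo $KAFKA_HOME'
/opt/module/kafka_2.12-2.5.0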

First create the group script file and make it executable:

$ touch kafka-all.sh
$ chmod 777 kafka-all.sh
$ vi kafka-all.sh

Fill in the following:

#!/bin/bash
# Start or stop the Kafka broker on every node in the cluster via ssh.
case $1 in
"start"){
        for i in kafka120 kafka121
        do
                # Load environment variables, expose a JMX port for monitoring,
                # and start the broker in the background.
                ssh $i "source /etc/profile && export JMX_PORT=9988 && /opt/module/kafka_2.12-2.5.0/bin/kafka-server-start.sh -daemon /opt/module/kafka_2.12-2.5.0/config/server.properties"
                echo "=============$i==========="
        done
};;
"stop"){
        for i in kafka120 kafka121
        do
                ssh $i "source /etc/profile && /opt/module/kafka_2.12-2.5.0/bin/kafka-server-stop.sh"
                echo "=============$i==========="
        done
};;
esac
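
The invocations in the next sections assume the script sits in Kafka's bin directory; if it was created elsewhere, move it there first:

$ mv kafka-all.sh /opt/module/kafka_2.12-2.5.0/bin/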

5.3 Start Kafka cluster

Note: start the ZooKeeper cluster before starting Kafka.

$ bin/kafka-all.sh start
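
After startup, the registered brokers can be checked in ZooKeeper (assuming ZooKeeper's zkCli.sh is on the PATH; both broker IDs should be listed):

$ zkCli.sh -server zookeeper110:2181 ls /brokers/ids
[0, 1]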

5.4 Stop the Kafka cluster

$ bin/kafka-all.sh stop

 

Chapter 6 Command Test
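
With auto topic creation enabled (the Kafka default), the test topic will be created on first use; to create it explicitly instead, a minimal sketch with a replica on each of the two brokers:

$ bin/kafka-topics.sh --create --topic test --bootstrap-server kafka120:9092 --partitions 2 --replication-factor 2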

6.1 Start the console consumer

[zihao@kafka121 kafka_2.12-2.5.0]$ bin/kafka-console-consumer.sh --topic test --bootstrap-server kafka120:9092,kafka121:9092
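
By default the console consumer only shows messages produced after it starts; to replay the topic from the beginning, add --from-beginning:

$ bin/kafka-console-consumer.sh --topic test --bootstrap-server kafka120:9092,kafka121:9092 --from-beginning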

6.2 Start the console producer

[zihao@kafka120 kafka_2.12-2.5.0]$ bin/kafka-console-producer.sh --topic test --broker-list kafka120:9092,kafka121:9092
>

6.3 Produce data

[zihao@kafka120 kafka_2.12-2.5.0]$ bin/kafka-console-producer.sh --topic test --broker-list kafka120:9092,kafka121:9092
>hello kafka
>hello bigdata
>


6.4 View the consumer

[zihao@kafka121 kafka_2.12-2.5.0]$ bin/kafka-console-consumer.sh --topic test --bootstrap-server kafka120:9092,kafka121:9092
hello kafka
hello bigdata


You can see that the data has been produced and consumed normally.
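
The offsets of the console consumer can also be inspected with the consumer-groups tool; console consumers get an auto-generated group name of the form console-consumer-NNNNN (the name below is illustrative):

$ bin/kafka-consumer-groups.sh --bootstrap-server kafka120:9092 --list
console-consumer-12345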

 

So far, Kafka has been deployed and configured. In the next chapter, the author will walk through a Kafka -> Flink -> MySQL scenario to share how to use Flink in a development environment!


Welcome to scan the QR code below and follow the author's WeChat official account: Knowing the numbers and principles. Articles will gradually be synced there and continuously updated, so that more friends can learn and exchange ideas together:
