Chapter 1 Introduction
Kafka is a high-throughput distributed publish-subscribe messaging system that occupies an important position in today's big data ecosystem, and it also plays a key role in peak-shaving and decoupling scenarios. Its natively distributed architecture scales flexibly and handles large-throughput data transmission very well. Because of its importance in big data scenarios, and because Kafka will be needed in the case studies of subsequent articles, the author introduces it here in a dedicated chapter.
Chapter 2 Cluster Planning
The author installs Kafka on only two nodes here.
In addition, Kafka depends on ZooKeeper. For ZooKeeper deployment, please refer to the author's earlier article: Big Data Practice No.1 - Zookeeper cluster construction.
It is said that subsequent versions of Kafka will gradually phase out the ZooKeeper dependency, so stay tuned!
|       | kafka120 | kafka121 |
|-------|----------|----------|
| Kafka | √        | √        |
Chapter 3 Download and Unzip
Download it directly from the official website: http://kafka.apache.org
Unzip:
$ tar -zxvf kafka_2.12-2.5.0.tgz -C /opt/module/
Chapter 4 Modify Configuration
4.1 Modify environment variables
Modify /etc/profile and add the following
#KAFKA_HOME
export KAFKA_HOME=/opt/module/kafka_2.12-2.5.0
export PATH=$PATH:$KAFKA_HOME/bin
source it
$ source /etc/profile
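The effect of the two profile lines can be sketched in an ordinary shell (the install path is the one used in Chapter 3):

```shell
# Reproduce the effect of the /etc/profile lines in a plain shell:
KAFKA_HOME=/opt/module/kafka_2.12-2.5.0
PATH=$PATH:$KAFKA_HOME/bin
# After sourcing, Kafka's scripts (kafka-server-start.sh etc.) resolve from here:
echo "$KAFKA_HOME/bin"
```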
4.2 Create a data directory
Go to the Kafka installation directory and create a data directory. By default Kafka writes to the logs folder; it is recommended to keep data in a dedicated directory, separate from the log files in the default logs folder.
$ mkdir data
4.3 Modify the configuration file
4.3.1 Modify server.properties
Edit server.properties under the config directory and change the following entries:
#Kafka cluster node ID (int, must be unique within the cluster)
broker.id=0
#Allow topics to be deleted
delete.topic.enable=true
#Data + log storage path (the data directory created in 4.2)
log.dirs=/opt/module/kafka_2.12-2.5.0/data
#Default number of partitions
num.partitions=2
#Data retention time (default 7 days)
log.retention.hours=168
#Maximum size of a segment (default 1 GiB)
log.segment.bytes=1073741824
#ZooKeeper cluster addresses
zookeeper.connect=zookeeper110:2181,zookeeper111:2181,zookeeper112:2181
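Two of the values above are defaults expressed in raw units; they can be sanity-checked with shell arithmetic (1 GiB for log.segment.bytes, 7 days for log.retention.hours):

```shell
# 1 GiB in bytes -> log.segment.bytes
echo $((1024 * 1024 * 1024))   # -> 1073741824
# 7 days in hours -> log.retention.hours
echo $((7 * 24))               # -> 168
```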
4.3.2 Modify consumer.properties
Edit consumer.properties under the config directory and change the following entry:
bootstrap.servers=kafka120:9092,kafka121:9092
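The shipped consumer.properties also carries a default consumer group id; if your console consumers rely on this file, it can be set in the same place (the value below is the file's shipped default):

```properties
# Consumer group id used when none is given on the command line
group.id=test-consumer-group
```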
Note: in older versions of Kafka, consumers connected through ZooKeeper. In later versions consumers connect directly to the Kafka brokers instead, and offsets and related metadata are stored inside Kafka itself (in the internal __consumer_offsets topic) rather than in ZooKeeper.
Chapter 5 Starting/Stopping Kafka
5.1 Single-node start
$ bin/kafka-server-start.sh -daemon config/server.properties
Note: -daemon starts Kafka in the background; without it, Kafka runs in the foreground and occupies the terminal window.
By default Kafka only provides a single-node startup script; to start the whole cluster at once you need to write your own. A group script is provided below for reference.
5.2 Group Script
On kafka120 and kafka121, copy the environment variables into the current user's shell startup file (commands run over ssh use a non-login shell, which reads ~/.bashrc rather than /etc/profile):
$ cat /etc/profile >> ~/.bashrc
First create the group script file (placing it in Kafka's bin directory lets it be invoked as bin/kafka-all.sh in the next sections):
$ touch kafka-all.sh
$ chmod 777 kafka-all.sh
$ vi kafka-all.sh
Fill in the following:
#!/bin/bash
case $1 in
"start"){
    for i in kafka120 kafka121
    do
        ssh $i "source /etc/profile && export JMX_PORT=9988 && /opt/module/kafka_2.12-2.5.0/bin/kafka-server-start.sh -daemon /opt/module/kafka_2.12-2.5.0/config/server.properties"
        echo "=============$i==========="
    done
};;
"stop"){
    for i in kafka120 kafka121
    do
        ssh $i "source /etc/profile && /opt/module/kafka_2.12-2.5.0/bin/kafka-server-stop.sh"
        echo "=============$i==========="
    done
};;
esac
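The per-host loop can be tried on any single machine by replacing the ssh call with a plain echo; a minimal sketch of the same loop shape:

```shell
#!/bin/bash
# Same per-host loop as kafka-all.sh, with the ssh call replaced by echo
# so the shape can be exercised without a cluster:
for i in kafka120 kafka121
do
    echo "=============$i==========="
done
```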
5.3 Start Kafka cluster
Note: start the ZooKeeper cluster before starting Kafka.
$ bin/kafka-all.sh start
5.4 Stop the Kafka cluster
$ bin/kafka-all.sh stop
Chapter 6 Command Test
6.1 Start the console consumer
[zihao@kafka121 kafka_2.12-2.5.0]$ bin/kafka-console-consumer.sh --topic test --bootstrap-server kafka120:9092,kafka121:9092
6.2 Start the console producer
[zihao@kafka120 kafka_2.12-2.5.0]$ bin/kafka-console-producer.sh --topic test --broker-list kafka120:9092,kafka121:9092
>
6.3 Produce data
[zihao@kafka120 kafka_2.12-2.5.0]$ bin/kafka-console-producer.sh --topic test --broker-list kafka120:9092,kafka121:9092
>hello kafka
>hello bigdata
>
6.4 View the consumer
Back on the consumer started in 6.1, the two messages arrive:
[zihao@kafka121 kafka_2.12-2.5.0]$ bin/kafka-console-consumer.sh --topic test --bootstrap-server kafka120:9092,kafka121:9092
hello kafka
hello bigdata
You can see that the data has been produced and consumed normally.
So far, Kafka has been deployed and configured. In the next chapter, we will use a Kafka -> Flink -> MySQL scenario to share how to use Flink in a development environment!