Technology Sharing | Message Queue Kafka Cluster Deployment

1. Introduction

Kafka was originally developed by LinkedIn. It is a distributed, partitioned, multi-replica, multi-subscriber log system that relies on ZooKeeper for coordination (and can also be used as an MQ system). It is commonly used for web/nginx logs, access logs, messaging services, and so on. LinkedIn contributed it to the Apache Foundation in 2010, and it became a top-level open source project.

2. Main application scenarios
  • Log collection: Kafka can collect the logs of various services and expose them to different consumers through a unified interface.

  • Messaging system: decouples producers from consumers and buffers messages.

  • User activity tracking: Kafka can record the activities of web or app users, such as page views and clicks; these events are sent to Kafka, and subscribers can then consume them for monitoring and analysis.

  • Operational metrics: Kafka can be used to collect and monitor various operational data.

3. The main design goals of Kafka are as follows:
  • Message persistence with O(1) time complexity, guaranteeing constant-time access performance even for data at the TB level and above.
  • High throughput: even on inexpensive commodity machines, a single node can handle 100K messages per second.
  • Supports message partitioning across Kafka servers and distributed consumption, while guaranteeing the order of messages within each partition.
  • Supports both offline and real-time data processing.
  • Scale out: supports online horizontal scaling.
4. Basic concepts

Kafka is a distributed, partitioned message system that provides the features a messaging system should have.

name           explanation
broker         A message-middleware processing node; one broker is one Kafka node, and multiple brokers form a Kafka cluster.
topic          Kafka classifies messages by topic; every message published to Kafka belongs to a topic.
producer       Message producer (publisher)
consumer       Message consumer (subscriber)
consumergroup  A message can be consumed by multiple consumer groups, but within one consumer group only one consumer consumes it (see the sketch below).
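To make the consumer-group semantics concrete, here is a minimal sketch using the console tools that ship with the Kafka distribution installed below; the topic name demo and the group names group-a/group-b are only illustrative.

## Two consumers started with the SAME group share the topic's partitions,
## so each message is processed by only one of them:
kafka-console-consumer.sh --bootstrap-server 192.168.1.13:9092 --topic demo --group group-a
kafka-console-consumer.sh --bootstrap-server 192.168.1.13:9092 --topic demo --group group-a

## A consumer in a DIFFERENT group receives its own full copy of every message:
kafka-console-consumer.sh --bootstrap-server 192.168.1.13:9092 --topic demo --group group-b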

2. Environment preparation

Current environment: three CentOS 7.9 machines
Software version: kafka_2.13-3.0.0
Installation directory: /usr/local/kafka

Download Kafka; the package already includes ZooKeeper (run on all three machines)

[root@localhost opt]# wget https://archive.apache.org/dist/kafka/3.0.0/kafka_2.13-3.0.0.tgz
[root@localhost opt]# tar zxvf kafka_2.13-3.0.0.tgz
[root@localhost opt]# mv kafka_2.13-3.0.0 /usr/local/kafka

Configure the environment variables (on all three machines)

[root@localhost opt]# vim /etc/profile
## Append at the end of the file
export ZOOKEEPER_HOME=/usr/local/kafka
export PATH=$PATH:$ZOOKEEPER_HOME/bin

## Reload the environment variables
[root@localhost opt]# source /etc/profile
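As a quick check (not part of the original steps), confirm that the Kafka scripts are now resolvable from PATH:

[root@localhost opt]# which kafka-server-start.sh zookeeper-server-start.sh
## Both should resolve to scripts under /usr/local/kafka/bin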

3. Zookeeper installation

Since Kafka stores metadata such as partition locations and topic configuration in ZooKeeper, a ZooKeeper cluster must be set up before the Kafka cluster can be built.

Modify the ZooKeeper configuration file (identical on all three machines)

[root@localhost ~]# cd /usr/local/kafka/
[root@localhost kafka]# vim config/zookeeper.properties

dataDir=/tmp/zookeeper                  ## Path where the ZooKeeper server stores its data
clientPort=2181                         ## ZooKeeper service port
tickTime=2000                           ## Heartbeat interval between ZooKeeper clients and servers; the default is 2000 ms (2 s)
initLimit=10                            ## Maximum time (in ticks) for a follower to connect to the leader and sync its data
syncLimit=5                             ## Maximum time (in ticks) for a follower to stay in sync with the leader
maxClientCnxns=0                        ## Maximum number of client connections; 0 or unset means no limit
admin.enableServer=false                ## Disable the Admin Server
server.0=192.168.1.13:2888:3888         ## ZooKeeper cluster members
server.1=192.168.1.108:2888:3888
server.2=192.168.1.143:2888:3888
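Ports 2888 and 3888 above are used for follower-to-leader communication and leader election, while 2181 and 9092 serve clients. If firewalld is enabled on these CentOS 7.9 machines (an assumption, adjust to your environment), open the ports on all three nodes:

[root@localhost kafka]# firewall-cmd --permanent --add-port=2181/tcp --add-port=2888/tcp --add-port=3888/tcp --add-port=9092/tcp
[root@localhost kafka]# firewall-cmd --reload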

Create the myid file (on all three machines)

On 192.168.1.13

[root@localhost kafka]# mkdir -p /opt/zookeeper;echo "0" > /tmp/zookeeper/myid

On 192.168.1.108

[root@localhost kafka]# mkdir -p /opt/zookeeper;echo "1" > /tmp/zookeeper/myid

On 192.168.1.143

[root@localhost kafka]# mkdir -p /opt/zookeeper;echo "2" > /tmp/zookeeper/myid

Create the systemd unit file

[root@localhost kafka]# vim /usr/lib/systemd/system/zookeeper.service

[Unit]
Description=zookeeper
After=network.target
Wants=network.target

[Service]
Type=simple
Environment="PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/app/idk/bin"
User=root
Group=root
ExecStart=/usr/local/kafka/bin/zookeeper-server-start.sh /usr/local/kafka/config/zookeeper.properties
Restart=always

[Install]
WantedBy=multi-user.target

Start ZooKeeper

[root@localhost kafka]# systemctl enable zookeeper.service
[root@localhost kafka]# systemctl start zookeeper.service
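To verify the ensemble (not part of the original steps), check the unit status and query ZooKeeper with the zookeeper-shell.sh client bundled with Kafka:

[root@localhost kafka]# systemctl status zookeeper.service
[root@localhost kafka]# zookeeper-shell.sh 127.0.0.1:2181 ls /
## The second command should list the root znodes, e.g. [zookeeper]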

4. Installation of Kafka

Modify the Kafka configuration file

[root@localhost ~]# cd /usr/local/kafka/
[root@localhost kafka]# vim config/server.properties          ## copy this file directly to the other two machines

broker.id=1                                             ## 1 on the first machine, 2 on the second, 3 on the third
listeners=PLAINTEXT://:9092                             ## Listening IP address and port
advertised.listeners=PLAINTEXT://192.168.1.13:9092      ## The broker address advertised to clients
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=404857600
log.dirs=/tmp/kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=192.168.1.13:2181,192.168.1.108:2181,192.168.1.143:2181    ## ZooKeeper connection string
zookeeper.connection.timeout.ms=18000
group.initial.rebalance.delay.ms=0
## Changes for 192.168.1.108
broker.id=2
listeners=PLAINTEXT://:9092
advertised.listeners=PLAINTEXT://192.168.1.108:9092

## Changes for 192.168.1.143
broker.id=3
listeners=PLAINTEXT://:9092
advertised.listeners=PLAINTEXT://192.168.1.143:9092
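One possible way to implement the "copy directly to the other two machines" note above is to push the file with scp and patch the per-broker values with sed; the commands below are a sketch, adjust them to your setup.

## Run from 192.168.1.13 after editing server.properties there
[root@localhost kafka]# scp config/server.properties root@192.168.1.108:/usr/local/kafka/config/
[root@localhost kafka]# scp config/server.properties root@192.168.1.143:/usr/local/kafka/config/
## Then, for example on 192.168.1.108:
[root@localhost kafka]# sed -i 's/^broker.id=1/broker.id=2/; s/192.168.1.13:9092/192.168.1.108:9092/' config/server.properties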
## Adjust the JVM parameters (modify as needed)
KAFKA_HEAP_OPTS: specifies the heap size; the default is 1 GB

[root@localhost kafka]# vim bin/kafka-server-start.sh 
if [ "x$KAFKA_HEAP_OPTS" = "x" ]; then
    export KAFKA_HEAP_OPTS="-Xmx2G -Xms2G -XX:MaxDirectMemorySize=1G"          ## Modify only this line
fi
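Since kafka-server-start.sh only sets KAFKA_HEAP_OPTS when it is empty, an alternative (not from the original) is to leave the script untouched and set the heap through the systemd unit created below:

## Add to the [Service] section of kafka.service instead of editing the script
Environment="KAFKA_HEAP_OPTS=-Xmx2G -Xms2G -XX:MaxDirectMemorySize=1G"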

Create the systemd unit file

[root@localhost kafka]# vim /usr/lib/systemd/system/kafka.service

[Unit]
Description=kafka
After=network.target
Wants=network.target

[Service]
Type=simple
Environment="PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/app/idk/bin"
User=root
Group=root
ExecStart=/usr/local/kafka/bin/kafka-server-start.sh /usr/local/kafka/config/server.properties
Restart=always

[Install]
WantedBy=multi-user.target

Start Kafka
[root@localhost kafka]# systemctl enable kafka.service
[root@localhost kafka]# systemctl start kafka.service
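After all three brokers are running, a simple smoke test (the topic name test is only illustrative) is to create a replicated topic and send a message through the console producer and consumer:

## Create a topic with 3 partitions and replication factor 3, then inspect its placement
[root@localhost kafka]# kafka-topics.sh --bootstrap-server 192.168.1.13:9092 --create --topic test --partitions 3 --replication-factor 3
[root@localhost kafka]# kafka-topics.sh --bootstrap-server 192.168.1.13:9092 --describe --topic test

## Producer on one machine ...
[root@localhost kafka]# kafka-console-producer.sh --bootstrap-server 192.168.1.13:9092 --topic test

## ... consumer on another; lines typed into the producer should appear here
[root@localhost kafka]# kafka-console-consumer.sh --bootstrap-server 192.168.1.108:9092 --topic test --from-beginning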

5. Pitfalls

The service fails to start because Java cannot be found:

/usr/local/kafka/bin/kafka-run-class.sh: line 309: exec: java: not found

Option 1: modify the Java path in the script

## Modify the kafka-run-class.sh script
[root@localhost kafka]# vim bin/kafka-run-class.sh
## Find the following:
# Which java to use
if [ -z "$JAVA_HOME" ]; then
  JAVA="/usr/local/jdk/bin/java"   				## 修改到绝对路径
else
  JAVA="$JAVA_HOME/bin/java"
fi

Option 2: create a symbolic link

[root@localhost kafka]# echo $PATH
/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:/home/laserx/.local/bin:/home/laserx/bin

[root@localhost kafka]# ln -s /usr/local/jdk1.8.0_251/bin/java /usr/local/bin/java
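Either way, verify that java is resolvable before restarting the services:

[root@localhost kafka]# which java
[root@localhost kafka]# java -version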

