Kafka cluster deployment and construction

1. Brief introduction

Kafka is a distributed message queue based on the publish/subscribe model, used mainly for
real-time processing of big data.

2. Cluster planning

Host        centos7-1   centos7-2   centos7-3   centos7-4
kafka       yes         yes         yes         yes
zookeeper   yes         yes         yes         no
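
The host plan can also be kept as a small shell list, so that the ZooKeeper connection string needed in server.properties is built rather than typed by hand. This is only a minimal sketch: it assumes ZooKeeper listens on its default client port 2181 on centos7-1 through centos7-3 (the hosts named in the zookeeper.connect setting of this article), and the variable and function names are illustrative, not part of Kafka.

```shell
#!/usr/bin/env bash
# Hosts assumed to run ZooKeeper (taken from zookeeper.connect in this article).
ZK_HOSTS="centos7-1 centos7-2 centos7-3"
ZK_PORT=2181   # ZooKeeper's default client port

# Join the hosts into the host:port,host:port form that zookeeper.connect expects.
build_zk_connect() {
  local port="$1"; shift
  local out="" host
  for host in "$@"; do
    # Prepend a comma only when something has already been collected.
    out="${out:+$out,}${host}:${port}"
  done
  printf '%s\n' "$out"
}

ZK_CONNECT="$(build_zk_connect "$ZK_PORT" $ZK_HOSTS)"
echo "zookeeper.connect=${ZK_CONNECT}"
```

The printed line can be pasted into server.properties as is; adding or removing a ZooKeeper node then only means editing the host list.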

3. Cluster deployment

  • Unzip the installation package
  • Modify the configuration file
vim config/server.properties
# Globally unique ID of this broker; must not be reused by another broker
broker.id=0

# Enable topic deletion
delete.topic.enable=true

# Number of threads used for disk I/O
num.io.threads=8

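# Send buffer size of the socket server, in bytes
socket.send.buffer.bytes=102400

# Receive buffer size of the socket server, in bytes
socket.receive.buffer.bytes=102400

# Maximum size of a request the socket server will accept, in bytes (100 MiB here)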
socket.request.max.bytes=104857600

# Directory where Kafka stores its data. Despite the name, these "logs" are the message
# data itself, kept in subdirectories named <topic>-<partition>
log.dirs=/data/kafka/data/kafka-logs

# Default number of partitions per topic
num.partitions=1

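# Number of threads per data directory used for log recovery at startup and flushing at shutdown
num.recovery.threads.per.data.dir=1

# Maximum time a log segment is retained before it becomes eligible for deletion (168 hours = 7 days)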
log.retention.hours=168

# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000

# ZooKeeper connection string: a comma-separated list of host:port pairs
zookeeper.connect=centos7-1:2181,centos7-2:2181,centos7-3:2181

# Timeout for connecting to ZooKeeper, in milliseconds
zookeeper.connection.timeout.ms=18000

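# Delay, in milliseconds, before the group coordinator performs the initial consumer rebalance.
# A value of 0 is convenient for development and testing. In production environments the
# default value of 3 seconds is more suitable, as it helps to avoid unnecessary, and
# potentially expensive, rebalances during application startup.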
group.initial.rebalance.delay.ms=0
  • A Kafka cluster does not distinguish between server and client nodes. The brokers form one cluster because they all connect to the same ZooKeeper ensemble and each registers there under its own unique broker.id. So copy this configuration to the other three servers, change only broker.id on each so that no two brokers share a value, and the cluster is set up.
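
The per-host change described above can be scripted. This is a minimal sketch, not the author's procedure: it assumes the edited server.properties serves as a template and that IDs 0 to 3 are assigned to centos7-1 through centos7-4 in order. The article only fixes broker.id=0 for the first host; the other IDs, the function name render_broker_config, and the output file naming are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Write a copy of a server.properties template with broker.id set to the given value.
# Usage: render_broker_config <template> <broker_id> <output>
render_broker_config() {
  local template="$1" id="$2" output="$3"
  # Replace only the broker.id line; every other setting is copied unchanged.
  sed "s/^broker\.id=.*/broker.id=${id}/" "$template" > "$output"
}

# Demonstration on a throwaway template so the sketch runs without a Kafka install.
WORKDIR="$(mktemp -d)"
cat > "$WORKDIR/server.properties" <<'EOF'
broker.id=0
delete.topic.enable=true
zookeeper.connect=centos7-1:2181,centos7-2:2181,centos7-3:2181
EOF

# Assumed ID assignment: 0..3 in host order.
id=0
for host in centos7-1 centos7-2 centos7-3 centos7-4; do
  render_broker_config "$WORKDIR/server.properties" "$id" "$WORKDIR/server.properties.$host"
  id=$((id + 1))
done

# Show the broker.id that ended up in each generated file.
grep -H '^broker\.id=' "$WORKDIR"/server.properties.centos7-*
```

Each generated file would then be copied to its host as the broker's server.properties before Kafka is started there. Generating the files from one template keeps every other setting identical across brokers, which is the point of the bullet above.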

Origin blog.csdn.net/weixin_46122692/article/details/109250586