Kafka cluster installation and deployment (with the bundled ZooKeeper)

 

 

Kafka overview

kafka official website: http://kafka.apache.org/
kafka download page: http://kafka.apache.org/downloads
kafka Configuration Quick Start: http://kafka.apache.org/quickstart
Newer Kafka releases ship with a bundled ZooKeeper; this article documents building a Kafka cluster using that built-in ZooKeeper.

To achieve high availability and high throughput, Kafka is deployed as a cluster; here it is installed on three servers. Kafka depends on ZooKeeper, so to use Kafka you must have ZK available: consumer offsets, cluster membership, and topic metadata are all stored in ZK.

Here we use the ZooKeeper bundled with Kafka, also deployed as a cluster, so there is no need to install a separate ZooKeeper.

 

First, let's look at a few basic messaging system terminology:

• Kafka maintains feeds of messages in categories called topics.
• Programs that publish messages to a Kafka topic are called Producers.
• Programs that subscribe to topics and process the published messages are called Consumers.
• Kafka runs as a cluster of one or more servers, each of which is called a Broker.
Producers send messages to the Kafka cluster over the network, and the cluster serves them to consumers, as shown below:

Clients and servers communicate via the TCP protocol. Kafka provides a Java client, and clients exist for many other languages.

 

Preparing the Environment

kafka version: kafka_2.11-1.0.0.tgz
Three host IPs: 172.16.10.91, 172.16.10.92, 172.16.10.93

 

Installation and configuration

1. Kafka installation

  After downloading Kafka, extract it under the /usr/local directory; the archive can then be deleted:

tar xzvf kafka_2.11-1.0.0.tgz    # extract the archive

  The configuration files live in Kafka's config directory.

 

  Create custom directories:

First, create a log directory for Kafka and a data directory for ZooKeeper. Both default to locations under /tmp, whose contents are lost on reboot, so we create the following custom directories:

mkdir /usr/local/kafka_2.11-1.0.0/zookeeper        # create the zookeeper data directory
mkdir /usr/local/kafka_2.11-1.0.0/log              # create the log directory
mkdir /usr/local/kafka_2.11-1.0.0/log/zookeeper    # create the zookeeper log directory
mkdir /usr/local/kafka_2.11-1.0.0/log/kafka        # create the kafka log directory

 

2. ZooKeeper configuration

  >  Modify zookeeper.properties

Enter the config directory and modify the following key settings. The zookeeper.properties file is identical on all three servers:

# point the zookeeper data directory at the custom directory
dataDir=/usr/local/kafka_2.11-1.0.0/zookeeper

# point the zookeeper log directory at the custom directory
dataLogDir=/usr/local/kafka_2.11-1.0.0/log/zookeeper

# client port
clientPort=2181

# comment this out
#maxClientCnxns=0

# connection parameters: add the following settings
# tickTime: zk's basic time unit, in milliseconds
tickTime=2000
# initLimit: leader-follower initial connection timeout = initLimit * tickTime
initLimit=10
# syncLimit: leader-follower sync timeout = syncLimit * tickTime
syncLimit=5

# server addresses, keyed by server id
server.0=172.16.10.91:2888:3888
server.1=172.16.10.92:2888:3888
server.2=172.16.10.93:2888:3888
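With these values, the effective timeouts work out as follows (a quick shell check of the arithmetic):

```shell
# ZooKeeper timeouts implied by the settings above.
tickTime=2000   # ms
initLimit=10
syncLimit=5
echo "initial connection timeout: $(( tickTime * initLimit )) ms"   # 20000 ms
echo "sync timeout: $(( tickTime * syncLimit )) ms"                 # 10000 ms
```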

 

  Add a myid file to the ZooKeeper data directory

On each server, add a myid file in the ZooKeeper data directory [/usr/local/kafka_2.11-1.0.0/zookeeper], containing that server's id value from the server.N entries.

Here, for example, that directory is /usr/local/kafka_2.11-1.0.0/zookeeper.

Note: all three servers need a myid file, but each with a different value; the exact mapping is shown in the figure below:

 

For example, for the first server, with id 0, run the following command in the directory [/usr/local/kafka_2.11-1.0.0/zookeeper] to generate the myid file:

[root@mmc config]# echo 0 > myid
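The per-server myid step can also be scripted. The helper below is only a sketch, not part of Kafka; the IP-to-id mapping is assumed to mirror the server.0/1/2 lines configured in zookeeper.properties:

```shell
#!/bin/sh
# Map a server IP to its ZooKeeper id, mirroring the server.N lines
# in zookeeper.properties (assumed mapping).
id_for_host() {
  case "$1" in
    172.16.10.91) echo 0 ;;
    172.16.10.92) echo 1 ;;
    172.16.10.93) echo 2 ;;
    *) echo "unknown host: $1" >&2; return 1 ;;
  esac
}

# Demonstrated with a throwaway directory; on a real server use
# /usr/local/kafka_2.11-1.0.0/zookeeper and the host's own IP.
DATADIR=$(mktemp -d)
id_for_host 172.16.10.91 > "$DATADIR/myid"
cat "$DATADIR/myid"    # prints 0
```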

 

 

3. Kafka configuration

   Enter the config directory and edit the server.properties file.

Note: all three servers must be configured; broker.id and advertised.listeners differ per server, everything else is identical.

############################# Server Basics #############################
# globally unique broker id; must not be duplicated
broker.id=1


############################# Socket Server Settings #############################
# listener address: change to this machine's IP
advertised.listeners=PLAINTEXT://172.16.10.91:9092

# number of threads handling network requests (default)
num.network.threads=3

# number of threads handling disk I/O (default)
num.io.threads=8

# send socket buffer size (default)
socket.send.buffer.bytes=102400

# receive socket buffer size (default)
socket.receive.buffer.bytes=102400

# maximum request socket size (default)
socket.request.max.bytes=104857600


############################# Zookeeper #############################
# connection addresses of the three zookeeper servers
zookeeper.connect=172.16.10.91:2181,172.16.10.92:2181,172.16.10.93:2181


############################# Log Basics #############################
# path where kafka stores its data/log files
log.dirs=/usr/local/kafka_2.11-1.0.0/log/kafka

# number of partitions per topic on this broker; kept equal to the broker count
num.partitions=3

# number of threads for recovering and cleaning data under log.dirs (default)
num.recovery.threads.per.data.dir=1


############################# Log Retention Policy #############################
# maximum time a segment file is retained before deletion (default)
log.retention.hours=168

# maximum time before rolling a new segment file (default)
log.roll.hours=168

 

 4. Starting Kafka

  Start ZooKeeper first, then Kafka; shut down in the reverse order, stopping Kafka first, then ZooKeeper.

  • Enter the Kafka root directory
[root@localhost kafka_2.11-1.0.0]# cd /usr/local/kafka_2.11-1.0.0/

 

  • Start ZooKeeper

  Run:

[root@localhost kafka_2.11-1.0.0]# bin/zookeeper-server-start.sh config/zookeeper.properties &

# start in the background:
[root@localhost kafka_2.11-1.0.0]# nohup bin/zookeeper-server-start.sh config/zookeeper.properties >log/zookeeper/zookeeper.log 2>&1 &
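The background command redirects stdout into the log file and merges stderr into it with `2>&1`. A common typo is `2>1` (no ampersand), which instead writes stderr to a file literally named `1`; a quick demonstration in a throwaway directory:

```shell
cd "$(mktemp -d)"
# 2>&1 duplicates stderr onto stdout's destination:
{ echo out; echo err >&2; } > both.log 2>&1
# 2>1 sends stderr to a file named "1" instead:
{ echo out2; echo err2 >&2; } > stdout_only.log 2>1
grep err both.log       # err was captured in both.log
cat 1                   # err2 ended up in the stray file "1"
```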

 

  Check the log/zookeeper/zookeeper.log file: while only the first one or two servers are starting, Connection refused errors appear because the other servers are not up yet; ignore them for now.

 

  Check log/zookeeper/zookeeper.log again once all three servers are up: the errors stop.

 

  • Start Kafka

  Run:

[root@localhost kafka_2.11-1.0.0]# bin/kafka-server-start.sh config/server.properties &

# start in the background:
[root@localhost kafka_2.11-1.0.0]# nohup bin/kafka-server-start.sh config/server.properties >log/kafka/kafka.log 2>&1 &

  Check the log/kafka/kafka.log file: if startup produced no errors and the following output appears, startup succeeded.

 

 

 5. Testing and verifying Kafka

  Start ZooKeeper on all three machines first; once all three ZooKeeper instances are up, start Kafka on each of the three machines.

  • Enter the Kafka root directory
[root@localhost kafka_2.11-1.0.0]# cd /usr/local/kafka_2.11-1.0.0/
  • On any one machine, create a topic named test
[root@localhost kafka_2.11-1.0.0]# bin/kafka-topics.sh --create --zookeeper 172.16.10.91:2181,172.16.10.92:2181,172.16.10.93:2181 --replication-factor 3 --partitions 3 --topic test
Option breakdown:
--create                 # create a topic
--topic                  # name of the new topic
--zookeeper              # zk connection url, same value as {zookeeper.connect} in server.properties
--partitions 3           # number of partitions to create (default 1)
--replication-factor 3   # number of replicas per partition; must not exceed the broker count
                         # (extra replicas are useless, and putting >=2 replicas on one node
                         # loses them all together if that node dies)
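The replication-factor constraint can be checked up front; check_rf below is a hypothetical helper for illustration, not a Kafka tool:

```shell
# Refuse a replication factor larger than the broker count
# (the rule described in the option breakdown above).
check_rf() {
  rf=$1
  brokers=$2
  if [ "$rf" -gt "$brokers" ]; then
    echo "replication-factor $rf exceeds broker count $brokers" >&2
    return 1
  fi
}
check_rf 3 3 && echo "ok"          # 3 replicas fit on 3 brokers
check_rf 4 3 || echo "rejected"    # only 3 brokers available
```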

 

  • List topics to confirm the topic was created
[root@localhost kafka_2.11-1.0.0]# bin/kafka-topics.sh --list --zookeeper 172.16.10.91:2181

 

  • Show the topic's details
[root@localhost kafka_2.11-1.0.0]# bin/kafka-topics.sh --describe --zookeeper 172.16.10.93:2181 --topic test

The output describes the topic's partition count, replication factor, per-partition leader, and ISR:

  "leader": the node that handles all reads and writes for the partition; each partition's leader is chosen randomly.
  "replicas": the list of nodes replicating the partition, shown regardless of whether a node is the leader or even alive.
  "isr": the in-sync replicas, i.e. the nodes that are alive and currently caught up with the leader.
     The values 1, 2, 0 in Replicas and Isr correspond to the three brokers' broker.id values.
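As a sketch, the Leader and Isr fields can be pulled out of the describe output with standard text tools. The sample line below merely follows the output format shown in this article; in practice, pipe the real command's output instead of the variable:

```shell
# Pull the leader id and ISR list out of one --describe output line.
line='Topic: test  Partition: 0  Leader: 2  Replicas: 2,3,1  Isr: 2,3,1'
leader=$(printf '%s\n' "$line" | sed -n 's/.*Leader: \([0-9][0-9]*\).*/\1/p')
isr=$(printf '%s\n' "$line" | sed -n 's/.*Isr: \([0-9,][0-9,]*\).*/\1/p')
echo "leader=$leader isr=$isr"    # prints leader=2 isr=2,3,1
```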

 

  • Start a producer on one of the servers
[root@localhost kafka_2.11-1.0.0]# bin/kafka-console-producer.sh --broker-list 172.16.10.91:9092,172.16.10.92:9092,172.16.10.93:9092 --topic test

 

  • Start consumers on the other two servers
[root@localhost kafka_2.11-1.0.0]# bin/kafka-console-consumer.sh --zookeeper 172.16.10.91:2181,172.16.10.92:2181,172.16.10.93:2181 --topic test --from-beginning

  

  • Producer server screenshot

 

  • Consumer server screenshot

 

  • Modify topic settings
bin/kafka-topics.sh --zookeeper 192.168.187.146:2181 --alter --topic test --config max.message.bytes=128000
bin/kafka-topics.sh --zookeeper 192.168.187.146:2181 --alter --topic test --delete-config max.message.bytes
bin/kafka-topics.sh --zookeeper 192.168.187.146:2181 --alter --topic test --partitions 10 
bin/kafka-topics.sh --zookeeper 192.168.187.146:2181 --alter --topic test --partitions 3 ## fails: Kafka only allows increasing the partition count, never decreasing it

 

  • Delete a topic

Note: this only deletes the topic's metadata in zk; the log data still has to be removed manually.

[root@localhost kafka_2.11-1.0.0]# bin/kafka-topics.sh --delete --zookeeper 172.16.10.91:2181,172.16.10.92:2181,172.16.10.93:2181 --topic test

Note: This will have no impact if delete.topic.enable is not set to true. ## By default the delete command only marks the topic for deletion without actually removing it. To really delete a topic, there are two options:
  Option 1: after running the delete command, manually remove the topic's data on local disk and in zk.
  Option 2: set delete.topic.enable=true in server.properties and restart Kafka; the delete command will then actually delete the topic.

 

 6. Testing the cluster's fault tolerance
  Kafka is fault-tolerant. The steps below test it.

(1) Describe the topic and note which broker is the leader and which brokers are followers.

[hadoop@Master ~]$ kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic
Topic:my-replicated-topic   PartitionCount:1    ReplicationFactor:3 Configs:
    Topic: my-replicated-topic  Partition: 0    Leader: 2   Replicas: 2,3,1 Isr: 2,3,1

(2) Manually kill any broker whose role is follower, and check that producing and consuming still work.

Step 1 shows that broker 2 is the leader and brokers 1 and 3 are followers; kill the process of follower broker 1.

 

Start a producer and a consumer and verify that messages are produced and consumed correctly.

 

 

Conclusion: after killing any follower broker, messages are produced and consumed correctly, with no impact at all.

 

(3) Manually kill the broker whose role is leader, and check that producing and consuming still work.

Broker 2 was the leader; after killing it, broker 3 becomes the new leader.


Start a producer and a consumer and verify that messages are produced and consumed correctly.

 

 

Conclusion: after killing the leader broker, messages are still produced and consumed correctly.

Summary: whether the killed broker was a leader or a follower, messages can still be produced and consumed normally as long as at least one broker remains available. Kafka's fault tolerance holds!

 

Reposted from: https://www.cnblogs.com/caoweixiong/p/11060533.html

Origin: blog.csdn.net/weixin_34395205/article/details/93286165