Apache Kafka Cluster Setup and Usage


Continuing from the earlier post Apache Kafka Deployment and Startup, this article covers creating topics on a single-node Kafka instance, sending and receiving messages, unicast and multicast consumption, and, the main topic this time, building and using a cluster.


1. Start ZooKeeper
[root@node-100 zookeeper]# cd zookeeper-3.4.12/
[root@node-100 zookeeper-3.4.12]# ls
bin        conf     dist-maven  ivysettings.xml  lib          logs        README.md             recipes  zookeeper-3.4.12.jar      zookeeper-3.4.12.jar.md5   zookeeper.out
build.xml  contrib  docs        ivy.xml          LICENSE.txt  NOTICE.txt  README_packaging.txt  src      zookeeper-3.4.12.jar.asc  zookeeper-3.4.12.jar.sha1
[root@node-100 zookeeper-3.4.12]# bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/zookeeper-3.4.12/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@node-100 zookeeper-3.4.12]# bin/zkCli.sh -server 192.168.5.100:2181

After the client connects successfully, inspect the existing znodes:

Connecting to 192.168.5.100:2181
2019-01-22 22:19:21,790 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.12-e5259e437540f349646870ea94dc2658c4e44b3b, built on 03/27/2018 03:55 GMT
2019-01-22 22:19:21,794 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=node-100
2019-01-22 22:19:21,794 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.8.0_191
2019-01-22 22:19:21,796 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2019-01-22 22:19:21,796 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/usr/local/java/jdk1.8.0_191/jre
2019-01-22 22:19:21,796 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/usr/local/zookeeper/zookeeper-3.4.12/bin/../build/classes:/usr/local/zookeeper/zookeeper-3.4.12/bin/../build/lib/*.jar:/usr/local/zookeeper/zookeeper-3.4.12/bin/../lib/slf4j-log4j12-1.7.25.jar:/usr/local/zookeeper/zookeeper-3.4.12/bin/../lib/slf4j-api-1.7.25.jar:/usr/local/zookeeper/zookeeper-3.4.12/bin/../lib/netty-3.10.6.Final.jar:/usr/local/zookeeper/zookeeper-3.4.12/bin/../lib/log4j-1.2.17.jar:/usr/local/zookeeper/zookeeper-3.4.12/bin/../lib/jline-0.9.94.jar:/usr/local/zookeeper/zookeeper-3.4.12/bin/../lib/audience-annotations-0.5.0.jar:/usr/local/zookeeper/zookeeper-3.4.12/bin/../zookeeper-3.4.12.jar:/usr/local/zookeeper/zookeeper-3.4.12/bin/../src/java/lib/*.jar:/usr/local/zookeeper/zookeeper-3.4.12/bin/../conf:.:/usr/local/java/jdk1.8.0_191/lib:/usr/local/java/jdk1.8.0_191/jre/lib:
2019-01-22 22:19:21,796 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2019-01-22 22:19:21,796 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2019-01-22 22:19:21,796 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2019-01-22 22:19:21,796 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2019-01-22 22:19:21,796 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2019-01-22 22:19:21,796 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=3.10.0-327.el7.x86_64
2019-01-22 22:19:21,796 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=root
2019-01-22 22:19:21,796 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/root
2019-01-22 22:19:21,796 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/usr/local/zookeeper/zookeeper-3.4.12
2019-01-22 22:19:21,797 [myid:] - INFO  [main:ZooKeeper@441] - Initiating client connection, connectString=192.168.5.100:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@69d0a921
Welcome to ZooKeeper!
JLine support is enabled
2019-01-22 22:19:21,887 [myid:] - INFO  [main-SendThread(192.168.5.100:2181):ClientCnxn$SendThread@1028] - Opening socket connection to server 192.168.5.100/192.168.5.100:2181. Will not attempt to authenticate using SASL (unknown error)
2019-01-22 22:19:21,985 [myid:] - INFO  [main-SendThread(192.168.5.100:2181):ClientCnxn$SendThread@878] - Socket connection established to 192.168.5.100/192.168.5.100:2181, initiating session
2019-01-22 22:19:22,015 [myid:] - INFO  [main-SendThread(192.168.5.100:2181):ClientCnxn$SendThread@1302] - Session establishment complete on server 192.168.5.100/192.168.5.100:2181, sessionid = 0x100000793a20000, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: 192.168.5.100:2181(CONNECTED) 0] ls /
[cluster, controller_epoch, brokers, zookeeper, admin, isr_change_notification, consumers, log_dir_event_notification, latest_producer_id_block, config]
[zk: 192.168.5.100:2181(CONNECTED) 1] 
2. Start Kafka
[root@node-100 kafka]# cd kafka_2.12-2.1.0/
[root@node-100 kafka_2.12-2.1.0]# ls
bin  config  data  libs  LICENSE  logs  NOTICE  site-docs
[root@node-100 kafka_2.12-2.1.0]# bin/kafka-server-start.sh -daemon config/server.properties 
[root@node-100 kafka_2.12-2.1.0]# 
3. Create a topic

Create a topic named test with a single partition and a replication factor of 1:

[root@node-100 kafka_2.12-2.1.0]# bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test

Check the topic in ZooKeeper:

[zk: 192.168.5.100:2181(CONNECTED) 2] ls /brokers

[ids, topics, seqid]
[zk: 192.168.5.100:2181(CONNECTED) 3] ls /brokers/topics
[test]

You can also list the topics currently present in Kafka with the following command:

[root@node-100 kafka_2.12-2.1.0]#  bin/kafka-topics.sh --list --zookeeper localhost:2181
test
[root@node-100 kafka_2.12-2.1.0]# 

Besides creating topics manually, we can also configure the broker to create a topic automatically when a producer publishes a message to a topic that does not yet exist.
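Whether this happens is controlled by the broker setting auto.create.topics.enable (enabled by default), and the defaults applied to auto-created topics come from num.partitions and default.replication.factor. A sketch of the relevant lines in config/server.properties:

```properties
# Allow the broker to auto-create a topic on first use (default: true)
auto.create.topics.enable=true
# Defaults applied to auto-created topics
num.partitions=1
default.replication.factor=1
```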

4. Send messages

Kafka ships with a console producer client that can read content from a local file or take input directly on the command line and send it to the Kafka cluster as messages. By default, each line of input is treated as a separate message.
First run the producer script, then type the message content at the prompt:

[root@node-100 kafka_2.12-2.1.0]# bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
>test msg
5. Consume messages

For the consumer side, Kafka likewise ships with a command-line client that prints the messages it receives to the terminal.

The new-style command (2.0):

[root@node-100 kafka_2.12-2.1.0]# bin/kafka-console-consumer.sh --bootstrap-server localhost:9092  --consumer-property group.id=testGroup --consumer-property client.id=consumer-1  --topic test
test msg

The old-style command (1.0):

 bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test   --from-beginning

If you run these commands in separate terminal windows, content typed into the producer terminal will show up in the consumer terminal almost immediately.

All of the commands above take additional options; running a command with no arguments prints its detailed usage.
For example:

[root@node-100 kafka_2.12-2.1.0]# bin/kafka-console-producer.sh
Read data from standard input and publish it to Kafka.
Option                                   Description                            
------                                   -----------                            
--batch-size <Integer: size>             Number of messages to send in a single 
                                           batch if they are not being sent     
                                           synchronously. (default: 200)        
--broker-list <String: broker-list>      REQUIRED: The broker list string in    
                                           the form HOST1:PORT1,HOST2:PORT2.    
--compression-codec [String:             The compression codec: either 'none',  
  compression-codec]                       'gzip', 'snappy', 'lz4', or 'zstd'.  
                                           If specified without value, then it  
                                           defaults to 'gzip'                   
--line-reader <String: reader_class>     The class name of the class to use for 
                                           reading lines from standard in. By   
                                           default each line is read as a       
                                           separate message. (default: kafka.   
                                           tools.                               
                                           ConsoleProducer$LineMessageReader)   
--max-block-ms <Long: max block on       The max time that the producer will    
  send>                                    block for during a send request      
                                           (default: 60000)                     
--max-memory-bytes <Long: total memory   The total memory used by the producer  
  in bytes>                                to buffer records waiting to be sent 
                                           to the server. (default: 33554432)   
--max-partition-memory-bytes <Long:      The buffer size allocated for a        
  memory in bytes per partition>           partition. When records are received 
                                           which are smaller than this size the 
                                           producer will attempt to             
                                           optimistically group them together   
                                           until this size is reached.          
                                           (default: 16384)                     
--message-send-max-retries <Integer>     Brokers can fail receiving the message 
                                           for multiple reasons, and being      
                                           unavailable transiently is just one  
                                           of them. This property specifies the 
                                           number of retires before the         
                                           producer give up and drop this       
                                           message. (default: 3)                
--metadata-expiry-ms <Long: metadata     The period of time in milliseconds     
  expiration interval>                     after which we force a refresh of    
                                           metadata even if we haven't seen any 
                                           leadership changes. (default: 300000)
--producer-property <String:             A mechanism to pass user-defined       
  producer_prop>                           properties in the form key=value to  
                                           the producer.                        
--producer.config <String: config file>  Producer config properties file. Note  
                                           that [producer-property] takes       
                                           precedence over this config.         
--property <String: prop>                A mechanism to pass user-defined       
                                           properties in the form key=value to  
                                           the message reader. This allows      
                                           custom configuration for a user-     
                                           defined message reader.              
--request-required-acks <String:         The required acks of the producer      
  request required acks>                   requests (default: 1)                
--request-timeout-ms <Integer: request   The ack timeout of the producer        
  timeout ms>                              requests. Value must be non-negative 
                                           and non-zero (default: 1500)         
--retry-backoff-ms <Integer>             Before each retry, the producer        
                                           refreshes the metadata of relevant   
                                           topics. Since leader election takes  
                                           a bit of time, this property         
                                           specifies the amount of time that    
                                           the producer waits before refreshing 
                                           the metadata. (default: 100)         
--socket-buffer-size <Integer: size>     The size of the tcp RECV size.         
                                           (default: 102400)                    
--sync                                   If set message send requests to the    
                                           brokers are synchronously, one at a  
                                           time as they arrive.                 
--timeout <Integer: timeout_ms>          If set and the producer is running in  
                                           asynchronous mode, this gives the    
                                           maximum amount of time a message     
                                           will queue awaiting sufficient batch 
                                           size. The value is given in ms.      
                                           (default: 1000)                      
--topic <String: topic>                  REQUIRED: The topic id to produce      
                                           messages to.                         
[root@node-100 kafka_2.12-2.1.0]# 

Some other useful commands:

List the consumer groups

[root@node-100 kafka_2.12-2.1.0]#  bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list 
testGroup
[root@node-100 kafka_2.12-2.1.0]# 

Inspect a consumer group's offsets

[root@node-100 kafka_2.12-2.1.0]# bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group testGroup

TOPIC           PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG             CONSUMER-ID                                     HOST            CLIENT-ID
test            0          20              20              0               consumer-1-86bc402b-40d3-4aad-99b5-ce89f2603f29 /192.168.5.100  consumer-1
[root@node-100 kafka_2.12-2.1.0]# 
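In this output, LAG is just LOG-END-OFFSET minus CURRENT-OFFSET for each partition. A minimal Python sketch of the arithmetic (the data structure here is made up for illustration, not produced by any Kafka API):

```python
# Per-partition offsets as reported by `kafka-consumer-groups.sh --describe`.
partitions = [
    {"topic": "test", "partition": 0, "current_offset": 20, "log_end_offset": 20},
]

def lag(p):
    # LAG = LOG-END-OFFSET - CURRENT-OFFSET: messages not yet consumed.
    return p["log_end_offset"] - p["current_offset"]

total_lag = sum(lag(p) for p in partitions)
print(total_lag)  # 0 here: the group is fully caught up
```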

Consume multiple topics

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --whitelist "test|test-2"
6. Unicast consumption

In this mode each message is consumed by exactly one consumer, similar to a queue; to get it, just put all consumers into the same consumer group.
Run the following consume command in two separate terminals, then send messages to the topic; only one of the two clients will receive each message.

producer:

[root@node-100 kafka_2.12-2.1.0]# bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
>test msg
>

consumer-1:

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092  --consumer-property group.id=testGroup --topic test 
test msg

consumer-2:

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092  --consumer-property group.id=testGroup --topic test 

In this setup we run one producer and two consumers with identical parameters (group.id=testGroup --topic test); when we send messages from the producer, only one of the consumers receives them.
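The queue-like behavior comes from Kafka assigning each partition to exactly one member of a consumer group. Here is a tiny Python simulation of that rule (the names and the round-robin assignor are simplifications of my own, not the real Kafka client API):

```python
# Within ONE consumer group, each partition is owned by exactly one member,
# so each message is delivered to exactly one consumer (queue semantics).
def assign_partitions(partitions, consumers):
    # Simplified round-robin stand-in for Kafka's partition assignors.
    return {p: consumers[i % len(consumers)] for i, p in enumerate(partitions)}

def deliver(partition, assignment):
    # A message on `partition` goes only to the consumer that owns it.
    return assignment[partition]

# Our topic `test` has a single partition, and the group has two members:
assignment = assign_partitions([0], ["consumer-1", "consumer-2"])
print(deliver(0, assignment))  # consumer-1 — the other member stays idle
```

With a single partition, one of the two consumers necessarily sits idle, which is exactly what the two consumer terminals above show.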

7. Multicast consumption

In this mode a message is consumed by multiple consumers, similar to the publish-subscribe pattern. Since Kafka delivers a given message to only one consumer within a consumer group, multicast is achieved simply by placing the consumers in different groups. We add another consumer belonging to the group testGroup-2; as a result, both clients receive the message.

producer:

[root@node-100 kafka_2.12-2.1.0]# bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
>test 123
>

consumer-1(group.id=testGroup):

[root@node-100 kafka_2.12-2.1.0]# bin/kafka-console-consumer.sh --bootstrap-server localhost:9092  --consumer-property group.id=testGroup --topic test
test 123

consumer-2(group.id=testGroup-2):

[root@node-100 kafka_2.12-2.1.0]# bin/kafka-console-consumer.sh --bootstrap-server localhost:9092  --consumer-property group.id=testGroup-2 --topic test 
test 123
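Putting the two rules together: every consumer group gets its own copy of each message, while inside a group only one member receives it. A minimal Python simulation (illustrative names only, not the Kafka client API):

```python
# Across consumer GROUPS, every group receives its own copy of each message;
# inside a group it still goes to a single member (here simply the first).
def deliver(message, groups):
    # groups maps group.id -> list of member client ids
    return {group: members[0] for group, members in groups.items()}

receivers = deliver("test 123", {
    "testGroup":   ["consumer-1"],
    "testGroup-2": ["consumer-2"],
})
print(receivers)  # both groups get a copy of the message
```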

8. Cluster setup and deployment

For Kafka, a single broker means a cluster with only one node. To add nodes to a Kafka cluster, you simply start more broker instances. To get a better feel for this, we will now start three broker instances on a single machine.

First, create the config files for the other two brokers:

[root@node-100 kafka_2.12-2.1.0]# cp config/server.properties config/server-1.properties
[root@node-100 kafka_2.12-2.1.0]# cp config/server.properties config/server-2.properties
[root@node-100 kafka_2.12-2.1.0]# 

Modify their contents as follows:

config/server-1.properties:

broker.id=1
port=9093
log.dirs=/usr/local/kafka/kafka_2.12-2.1.0/data/kafka-logs-1

config/server-2.properties:

broker.id=2
port=9094
log.dirs=/usr/local/kafka/kafka_2.12-2.1.0/data/kafka-logs-2

And the original config/server.properties:

broker.id=0
port=9092
log.dirs=/usr/local/kafka/kafka_2.12-2.1.0/data/kafka-logs

The broker.id property must be unique within a Kafka cluster. We also have to assign each instance its own port and log directory, since we are running multiple instances on the same machine; without these changes the brokers would clash, consumers would only be able to reach one of the instances, or the instances would overwrite each other's data. (Note: in current Kafka versions the port property is deprecated in favor of listeners, e.g. listeners=PLAINTEXT://:9093.)
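The three override files follow one simple pattern per broker. A small Python sketch that generates such overrides (a convenience script of my own, not part of Kafka; the paths mirror the ones above):

```python
# Generate per-broker override properties for running several brokers on one
# host: each broker needs a unique broker.id, port, and log.dirs.
BASE = "/usr/local/kafka/kafka_2.12-2.1.0/data"

def broker_config(broker_id):
    suffix = "" if broker_id == 0 else f"-{broker_id}"
    return (
        f"broker.id={broker_id}\n"
        f"port={9092 + broker_id}\n"
        f"log.dirs={BASE}/kafka-logs{suffix}\n"
    )

for i in (1, 2):
    # In practice you would write these lines into config/server-{i}.properties.
    print(f"# server-{i}.properties")
    print(broker_config(i))
```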

We already have one ZooKeeper instance and one broker running, so now we only need to start two more brokers:

[root@node-100 kafka_2.12-2.1.0]# bin/kafka-server-start.sh -daemon config/server-1.properties
[root@node-100 kafka_2.12-2.1.0]# bin/kafka-server-start.sh -daemon config/server-2.properties

Now let's create a new topic with a replication factor of 3:

[root@node-100 kafka_2.12-2.1.0]# bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic my-replicated-topic
Created topic "my-replicated-topic".
[root@node-100 kafka_2.12-2.1.0]# 

Now that we have a cluster and a topic with a replication factor of 3, which broker is actually serving this topic? (Since the topic has only one partition, only one broker can be its leader at any given time.)

[root@node-100 kafka_2.12-2.1.0]#  bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic
Topic:my-replicated-topic   PartitionCount:1    ReplicationFactor:3 Configs:
    Topic: my-replicated-topic  Partition: 0    Leader: 0   Replicas: 0,1,2 Isr: 0,1,2
[root@node-100 kafka_2.12-2.1.0]# 

Here is how to read the output: the first line is a summary of all partitions, and each following line describes one partition. Since this topic has only one partition, there is only one such line.

  • The leader node handles all read and write requests for the given partition.
  • replicas lists the brokers on which this partition is replicated, regardless of whether a node is the leader or even whether it is still alive.
  • isr is the subset of replicas that are currently alive and have caught up with the leader for this partition.
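The per-partition line can also be parsed programmatically, e.g. for monitoring. A minimal Python sketch that hand-parses the console output shown above (the field names are my own):

```python
import re

# Parse one partition line of `kafka-topics.sh --describe` console output.
LINE = "Topic: my-replicated-topic  Partition: 0    Leader: 0   Replicas: 0,1,2 Isr: 0,1,2"

def parse_partition(line):
    m = re.search(
        r"Partition:\s*(\d+)\s+Leader:\s*(-?\d+)\s+Replicas:\s*([\d,]+)\s+Isr:\s*([\d,]*)",
        line,
    )
    partition, leader, replicas, isr = m.groups()
    return {
        "partition": int(partition),
        "leader": int(leader),  # -1 would mean the partition has no leader
        "replicas": [int(b) for b in replicas.split(",")],
        "isr": [int(b) for b in isr.split(",") if b],
    }

info = parse_partition(LINE)
print(info)  # {'partition': 0, 'leader': 0, 'replicas': [0, 1, 2], 'isr': [0, 1, 2]}
```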

In our case, node 0 is the leader, i.e. the process started with server.properties.
We can run the same command against the test topic we created earlier:

[root@node-100 kafka_2.12-2.1.0]# bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test 
Topic:test  PartitionCount:1    ReplicationFactor:1 Configs:
    Topic: test Partition: 0    Leader: 0   Replicas: 0 Isr: 0
[root@node-100 kafka_2.12-2.1.0]# 

We created that topic with one partition and a replication factor of 1, so the output looks as shown above.

Now let's send some messages to the new topic:

[root@node-100 kafka_2.12-2.1.0]#  bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-replicated-topic
>test 001
>test 002

And consume them:

[root@node-100 kafka_2.12-2.1.0]# bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --from-beginning --topic my-replicated-topic
test 001
test 002

Now let's test leader election. Since broker 0 is currently the leader, we kill it:

[root@node-100 kafka_2.12-2.1.0]#  ps -ef | grep server.properties
[root@node-100 kafka_2.12-2.1.0]#  kill -9 2732

Run the describe command again:

[root@node-100 kafka_2.12-2.1.0]# bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic
Topic:my-replicated-topic   PartitionCount:1    ReplicationFactor:3 Configs:
    Topic: my-replicated-topic  Partition: 0    Leader: 1   Replicas: 0,1,2 Isr: 1,2
[root@node-100 kafka_2.12-2.1.0]# 

As you can see, the leader has changed to broker 1, and note that broker 0 is no longer in the Isr list. The new leader is elected from the ISR (in-sync replica) set.
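The failover rule can be sketched in a few lines of Python (a deliberate simplification: the real controller also handles ISR shrink/expand and the unclean.leader.election.enable setting):

```python
# Simulate leader failover for one partition: when the leader dies, the new
# leader is picked from the remaining in-sync replicas (ISR), never outside it.
state = {"leader": 0, "replicas": [0, 1, 2], "isr": [0, 1, 2]}

def broker_failed(state, broker):
    state["isr"] = [b for b in state["isr"] if b != broker]
    if state["leader"] == broker:
        # Elect the new leader from the surviving ISR; -1 means offline.
        state["leader"] = state["isr"][0] if state["isr"] else -1
    return state

broker_failed(state, 0)
print(state)  # {'leader': 1, 'replicas': [0, 1, 2], 'isr': [1, 2]}
```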

Start broker 0 again and check the topic information:

[root@node-100 kafka_2.12-2.1.0]# bin/kafka-server-start.sh -daemon config/server.properties 
[root@node-100 kafka_2.12-2.1.0]# bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic
Topic:my-replicated-topic   PartitionCount:1    ReplicationFactor:3 Configs:
    Topic: my-replicated-topic  Partition: 0    Leader: 1   Replicas: 0,1,2 Isr: 1,2,0
[root@node-100 kafka_2.12-2.1.0]# 

Broker 0 has rejoined the Isr list, and the leader is still broker 1.

Take a look at the partition's state znode in ZooKeeper:

[zk: 192.168.5.100:2181(CONNECTED) 11] get  /brokers/topics/my-replicated-topic/partitions/0/state
{"controller_epoch":3,"leader":1,"version":1,"leader_epoch":2,"isr":[1,2]}
cZxid = 0xd3
ctime = Tue Jan 22 23:24:44 CST 2019
mZxid = 0x136
mtime = Wed Jan 23 00:08:32 CST 2019
pZxid = 0xd3
cversion = 0
dataVersion = 3
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 74
numChildren = 0
[zk: 192.168.5.100:2181(CONNECTED) 12] 

This matches the topic information we saw above.


A few remaining details are left for next time :)
If you spot any mistakes, corrections are welcome :)

Reposted from blog.csdn.net/weixin_33790053/article/details/87255671