Local Kafka Setup, Common Commands, and REST Mode || Kafka's Registration in ZooKeeper

Copyright notice: This is the blogger's original post and may not be reproduced without permission. https://blog.csdn.net/roczheng1990/article/details/54599095

Topics, offsets, and partitions in Kafka

After several days of tinkering I finally got Kafka running locally. Here are the key points and common commands for reference:
1. Download the Kafka installation package.
2. Copy it to a directory of your choice and extract it.
3. Then comes a simple test. I used version 0.8; it needs almost no configuration and can be used right after extraction.

(1) Start ZooKeeper:

bin/zookeeper-server-start.sh config/zookeeper.properties

(2) Start Kafka:

bin/kafka-server-start.sh config/server.properties

(3) Create a topic (note: the replication factor cannot exceed the number of live brokers, so on a single-broker local setup use --replica 1):

bin/kafka-create-topic.sh --replica 2 --partition 8 --topic test --zookeeper 127.0.0.1:2181
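On 0.8.1 and later, kafka-create-topic.sh was replaced by the kafka-topics.sh script (the Windows section below already uses its .bat counterpart). A minimal equivalent, assuming the same local ZooKeeper:

bin/kafka-topics.sh --create --zookeeper 127.0.0.1:2181 --replication-factor 1 --partitions 8 --topic test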

(4) Start a console producer:

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test

(5) Start a console consumer:

bin/kafka-console-consumer.sh --zookeeper 127.0.0.1:2181 --topic test --from-beginning

4. Stop Kafka

bin/kafka-server-stop.sh

5. Stop ZooKeeper

bin/zookeeper-server-stop.sh

6. List topics

bin/kafka-list-topic.sh --zookeeper localhost:2181

7. Log in to ZooKeeper

bin/zookeeper-shell.sh host_name:2181
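Once connected, standard ZooKeeper shell commands work at the prompt; for example, you can browse the nodes Kafka registers (explored in detail in the registration section below):

[zk: localhost:2181(CONNECTED) 0] ls /brokers
[ids, topics]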

If you need the installation package, you can look for it online; if you can't find it, you can also email me: [email protected]

**Operations on Windows**

bin\windows\zookeeper-server-start.bat config\zookeeper.properties

bin\windows\kafka-server-start.bat config\server.properties

bin\windows\kafka-topics.bat --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic Test

bin\windows\kafka-console-producer.bat --broker-list localhost:9092 --topic Test

bin\windows\kafka-console-consumer.bat --zookeeper localhost:2181 --topic Test --from-beginning

The examples that follow are all implemented with Kafka REST. The REST API can be studied at:

`http://docs.confluent.io/2.0.1/quickstart.html`

REST has some limitations, though; for example, with large data volumes it is constrained by network bandwidth.

The tool used here is curl: an open-source file-transfer tool driven from the command line using URL syntax. It is widely used on Unix and many Linux distributions, and ported versions exist for DOS, Win32, and Win64. For example:

# Get a list of topics
$ curl "http://localhost:8082/topics"
  ["test","test2","test3"]

# Get info about one topic
$ curl "http://localhost:8082/topics/test"
  {"test":"connect-test","configs":{},"partitions":[{"partition":0,"leader":0,"replicas":[{"broker":0,"leader":true,"in_sync":true}]},{"partition":1,"leader":0,"replicas":[{"broker":1,"leader":true,"in_sync":true}]}]}

# Get info about a topic's partitions
$ curl "http://localhost:8082/topics/test/partitions"
  [{"partition":0,"leader":0,"replicas":[{"broker":0,"leader":true,"in_sync":true}]},{"partition":1,"leader":0,"replicas":[{"broker":1,"leader":true,"in_sync":true}]}]

Two message payload formats I have used:

JSON format

# Produce a message using JSON with the value '{ "foo": "bar" }' to the topic jsontest
$ curl -X POST -H "Content-Type: application/vnd.kafka.json.v1+json" \
      --data '{"records":[{"value":{"foo":"bar"}}]}' "http://localhost:8082/topics/jsontest"
  {"offsets":[{"partition":0,"offset":0,"error_code":null,"error":null}],"key_schema_id":null,"value_schema_id":null}

# Create a consumer for JSON data, starting at the beginning of the topic's
# log. Then consume some data from a topic using the base URL in the first response.
# Finally, close the consumer with a DELETE to make it leave the group and clean up
# its resources.
$ curl -X POST -H "Content-Type: application/vnd.kafka.v1+json" \
      --data '{"name": "my_consumer_instance", "format": "json", "auto.offset.reset": "smallest"}' \
      http://localhost:8082/consumers/my_json_consumer
  {"instance_id":"my_consumer_instance",
  "base_uri":"http://localhost:8082/consumers/my_json_consumer/instances/my_consumer_instance"}
$ curl -X GET -H "Accept: application/vnd.kafka.json.v1+json" \
      http://localhost:8082/consumers/my_json_consumer/instances/my_consumer_instance/topics/jsontest
  [{"key":null,"value":{"foo":"bar"},"partition":0,"offset":0}]
$ curl -X DELETE \
      http://localhost:8082/consumers/my_json_consumer/instances/my_consumer_instance
  # No content in response
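If auto-commit is disabled on the REST Proxy, the v1 API can also commit the instance's offsets explicitly; a sketch, assuming the consumer instance above is still open:

# Commit the consumer's current offsets
$ curl -X POST http://localhost:8082/consumers/my_json_consumer/instances/my_consumer_instance/offsets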

Binary format

# Produce a message using binary embedded data with value "Kafka" to the topic binarytest
$ curl -X POST -H "Content-Type: application/vnd.kafka.binary.v1+json" \
      --data '{"records":[{"value":"S2Fma2E="}]}' "http://localhost:8082/topics/binarytest"
  {"offsets":[{"partition":0,"offset":0,"error_code":null,"error":null}],"key_schema_id":null,"value_schema_id":null}

# Create a consumer for binary data, starting at the beginning of the topic's
# log. Then consume some data from a topic using the base URL in the first response.
# Finally, close the consumer with a DELETE to make it leave the group and clean up
# its resources.
$ curl -X POST -H "Content-Type: application/vnd.kafka.v1+json" \
      --data '{"name": "my_consumer_instance", "format": "binary", "auto.offset.reset": "smallest"}' \
      http://localhost:8082/consumers/my_binary_consumer
  {"instance_id":"my_consumer_instance","base_uri":"http://localhost:8082/consumers/my_binary_consumer/instances/my_consumer_instance"}
$ curl -X GET -H "Accept: application/vnd.kafka.binary.v1+json" \
      http://localhost:8082/consumers/my_binary_consumer/instances/my_consumer_instance/topics/binarytest
  [{"key":null,"value":"S2Fma2E=","partition":0,"offset":0}]
$ curl -X DELETE \
      http://localhost:8082/consumers/my_binary_consumer/instances/my_consumer_instance
  # No content in response

One more caveat: when sending messages to Kafka REST with curl, the message body must not contain Chinese characters; they end up as unrecognizable characters and trigger an HTTP 500 internal server error.
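A workaround worth trying (my assumption, not from the original post): base64-encode the UTF-8 text and send it with the binary embedded format shown above, so the JSON body itself stays pure ASCII:

# Hypothetical example: "5L2g5aW9" is the base64 encoding of the UTF-8 bytes of "你好"
$ echo -n '你好' | base64
  5L2g5aW9
$ curl -X POST -H "Content-Type: application/vnd.kafka.binary.v1+json" \
      --data '{"records":[{"value":"5L2g5aW9"}]}' "http://localhost:8082/topics/binarytest"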

Kafka's registration relationships in ZooKeeper:

Broker registration in ZooKeeper

To record broker registration information, a dedicated Kafka node is created in ZooKeeper at the path /brokers, e.g.:

[zk: localhost:2181(CONNECTED) 1] ls /brokers
[ids, topics]

When each Kafka broker starts, it registers itself in ZooKeeper, reporting its broker.id. The broker.id must be globally unique across the cluster, and each broker creates its own node at the path /brokers/ids/{broker.id}, e.g.:

[zk: localhost:2181(CONNECTED) 2] ls /brokers/ids
[102, 103]

After creating the node, Kafka records the broker's host name and port number in it:

[zk: localhost:2181(CONNECTED) 4] get /brokers/ids/102
{"jmx_port":-1,"timestamp":"1433209686575","host":"host102","version":1,"port":9092}

In addition, the broker node is an ephemeral node: when the broker's session expires, ZooKeeper deletes the node. This makes it easy to monitor changes to broker nodes and adjust load balancing in time.
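This is easy to verify from the ZooKeeper shell; a sketch (the ephemeralOwner value shown here is hypothetical, but it is non-zero for ephemeral znodes and 0x0 for persistent ones):

[zk: localhost:2181(CONNECTED) 5] stat /brokers/ids/102
...
ephemeralOwner = 0x14d8e8a5b340001
...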

Topic registration in ZooKeeper

In Kafka, users can define any number of topics, and each topic can be divided into multiple partitions; in general, each partition is stored on a separate broker. All of these topic-to-broker mappings are maintained by ZooKeeper.

ZooKeeper keeps dedicated nodes to record this information, at the path /brokers/topics/{topic_name}:

[zk: localhost:2181(CONNECTED) 6] ls /brokers/topics 
[toptic_t, test, my-replicated-topic, mykafka, mykafka6, mykafka5, mykafka4, test6, mykafka3, test7, mykafka2]

[zk: localhost:2181(CONNECTED) 17] get /brokers/topics/mykafka4   
{"version":1,"partitions":{"1":[102,103,104],"2":[103,104,102],"0":[104,102,103]}}

ZooKeeper records the mapping between each partition of a topic and its brokers under the node /brokers/topics/{topic_name}; in the example above, partition 1 of mykafka4 is replicated on brokers 102, 103, and 104.

When a broker starts, it registers its broker.id in the ISR list of the corresponding partitions under the topic node, e.g.:

[zk: localhost:2181(CONNECTED) 23] get /brokers/topics/mykafka4/partitions/1/state
{"controller_epoch":15,"leader":102,"version":1,"leader_epoch":2,"isr":[102,103,104]}

Likewise, when a broker exits, ZooKeeper updates the ISR lists of its topic partitions and decides whether consumer load rebalancing is needed.

Consumer registration in ZooKeeper

- Registering a new consumer group
When a new consumer group registers in ZooKeeper, ZooKeeper creates a dedicated node to hold its information, at the path /consumers/{group_id}. That node has three children: [ids, owners, offsets].
  - ids: records the consumers in the group that are currently consuming;
  - owners: records the topic information consumed by the group;
  - offsets: records the offset of each partition of each topic, e.g.:

[zk: localhost:2181(CONNECTED) 54] get /consumers/test-consumer2-group/offsets/mykafka4/0
142

- Registering a new consumer
When a new consumer registers with Kafka, an ephemeral child node is created under /consumers/{group_id}/ids and its details are recorded, e.g.:

[zk: localhost:2181(CONNECTED) 57] ls /consumers/test-consumer2-group/ids/test-consumer2-group_dev103-1433562901087-7b517b97
[]
[zk: localhost:2181(CONNECTED) 58] get /consumers/test-consumer2-group/ids/test-consumer2-group_dev103-1433562901087-7b517b97
{"version":1,"subscription":{"mykafka5":1},"pattern":"white_list","timestamp":"1433562901290"}

- Watching for changes to the consumers in a group
Every consumer must watch for changes in the number of consumers in its own group, i.e., watch for child-node changes under /consumers/{group_id}/ids. As soon as a consumer is added or removed, a consumer rebalance is triggered.
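Current group membership can be checked the same way from the ZooKeeper shell; a sketch using the group and consumer id from the listing above:

[zk: localhost:2181(CONNECTED) 60] ls /consumers/test-consumer2-group/ids
[test-consumer2-group_dev103-1433562901087-7b517b97]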

Kafka producer performance test:

bin/kafka-producer-perf-test.sh --topic test3 --num-records 500000 --record-size 200 --throughput 10000 --producer.config config/producer.properties
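Kafka ships a consumer-side counterpart as well; a sketch using the ZooKeeper-based options of the older script (newer versions take --broker-list or --bootstrap-server instead, so check your version's help output):

bin/kafka-consumer-perf-test.sh --zookeeper localhost:2181 --topic test3 --messages 500000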

