Installation and Deployment
I. Environment
- Operating system: CentOS 7
- Kafka version: 0.9.0.0
- Kafka download: from the official Apache mirrors
- JDK version: 1.7.0_51
- SSH client: XShell 5
II. Procedure
1. Download and extract Kafka
- Download:
curl -L -O http://mirrors.cnnic.cn/apache/kafka/0.9.0.0/kafka_2.10-0.9.0.0.tgz
- Extract:
tar zxvf kafka_2.10-0.9.0.0.tgz
2. Kafka directory layout
- /bin: executable scripts for operating Kafka, including Windows versions
- /config: configuration files
- /libs: dependency libraries
- /logs: log data; Kafka divides its server-side logs into five types: server, request, state, log-cleaner and controller
3. Configuration
- Configure ZooKeeper: see the standalone ZooKeeper installation section below.
- Go to the Kafka installation root and edit config/server.properties.
The three most important Kafka settings are broker.id, log.dirs and zookeeper.connect: broker.id uniquely identifies this broker within the cluster, log.dirs is the directory where the message logs are stored, and zookeeper.connect tells the broker how to reach ZooKeeper.
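As a minimal sketch of those three settings (the values are examples for this single-node setup, and the file is written to a scratch path rather than the real config/server.properties):

```shell
# Sketch: the three key broker settings, written to a scratch file.
# broker.id, log.dirs and zookeeper.connect are the values you would
# normally edit in config/server.properties.
cat > /tmp/server.properties.sketch <<'EOF'
broker.id=0
log.dirs=/tmp/kafka-logs
zookeeper.connect=localhost:2181
EOF
# Count the key=value settings we just wrote
grep -c '=' /tmp/server.properties.sketch
```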
4. Start Kafka
- Start: go to the Kafka directory and run
bin/kafka-server-start.sh config/server.properties &
- Check ports 2181 and 9092:
netstat -tunlp|egrep "(2181|9092)"
tcp 0 0 :::2181 :::* LISTEN 19787/java
tcp 0 0 :::9092 :::* LISTEN 28094/java
Notes:
The Kafka process has PID 28094 and occupies port 9092.
QuorumPeerMain is the corresponding ZooKeeper instance, with PID 19787, listening on port 2181.
5. Single-node connectivity test
Start two XShell sessions: one for a producer to send messages, and one for a consumer to receive them.
- Run the producer and type a few characters; each line typed is sent to the queue as a message.
bin/kafka-console-producer.sh --broker-list 192.168.153.118:9092 --topic test
Note: on earlier Kafka versions, --broker-list 192.168.1.181:9092 must be replaced with --zookeeper 192.168.1.181:2181.
- Run the consumer; it prints the messages just sent.
bin/kafka-console-consumer.sh --zookeeper 192.168.153.118:2181 --topic test --from-beginning
- Note:
The producer is given the broker socket (192.168.1.181:9092): producers send messages directly to Kafka, that is, to a broker.
The consumer is given the ZooKeeper socket (192.168.1.181:2181): this console consumer discovers the brokers through ZooKeeper, which coordinates and forwards the metadata.
The above used only a single broker; next we experiment with a multi-broker cluster.
6. Build a multi-broker pseudo-cluster
So far we started only a single broker; now we start a cluster of three brokers, all running on this same machine.
(1) Provide a configuration file for each broker
First, look at config/server0.properties:
broker.id=0
listeners=PLAINTEXT://:9092
port=9092
host.name=192.168.1.181
num.network.threads=4
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs
num.partitions=5
num.recovery.threads.per.data.dir=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
log.cleaner.enable=false
zookeeper.connect=192.168.1.181:2181
zookeeper.connection.timeout.ms=6000
queued.max.requests=500
log.cleanup.policy=delete
- Notes:
broker.id uniquely identifies a node within the cluster; because all brokers here run on the same machine, each must be given a distinct port and log directory so that they do not overwrite each other's data.
This also makes clear why Kafka listened on port 9092 in the single-broker experiment above.
How the Kafka cluster talks to ZooKeeper is also visible in the configuration (zookeeper.connect).
Now, modeled on the file above, we provide configuration files for the other two brokers:
- server1.properties (identical to server0.properties except for the following lines):
broker.id=1
listeners=PLAINTEXT://:9093
port=9093
log.dirs=/tmp/kafka-logs1
- server2.properties:
broker.id=2
listeners=PLAINTEXT://:9094
port=9094
host.name=192.168.1.181
num.network.threads=4
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs2
num.partitions=5
num.recovery.threads.per.data.dir=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
log.cleaner.enable=false
zookeeper.connect=192.168.1.181:2181
zookeeper.connection.timeout.ms=6000
queued.max.requests=500
log.cleanup.policy=delete
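Since the per-broker files differ only in the broker id, listener port and log directory, they can be generated mechanically rather than edited by hand. A sketch using sed on scratch copies (the /tmp paths are stand-ins for the real config/ directory):

```shell
# Sketch: derive per-broker configs from a base file by rewriting the
# broker id, listener port, and log directory. Scratch paths stand in
# for the real config/serverN.properties files.
cat > /tmp/server0.properties <<'EOF'
broker.id=0
listeners=PLAINTEXT://:9092
port=9092
log.dirs=/tmp/kafka-logs
EOF
for i in 1 2; do
  sed -e "s/^broker.id=.*/broker.id=$i/" \
      -e "s#^listeners=.*#listeners=PLAINTEXT://:909$((2+i))#" \
      -e "s/^port=.*/port=909$((2+i))/" \
      -e "s#^log.dirs=.*#log.dirs=/tmp/kafka-logs$i#" \
      /tmp/server0.properties > /tmp/server$i.properties
done
# Show the rewritten broker ids
grep '^broker.id' /tmp/server1.properties /tmp/server2.properties
```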
(2) Start all the brokers
Commands:
bin/kafka-server-start.sh config/server0.properties &   # start broker0
bin/kafka-server-start.sh config/server1.properties &   # start broker1
bin/kafka-server-start.sh config/server2.properties &   # start broker2
Check ports 2181, 9092, 9093 and 9094:
netstat -tunlp|egrep "(2181|9092|9093|9094)"
tcp 0 0 :::9093 :::* LISTEN 29725/java
tcp 0 0 :::2181 :::* LISTEN 19787/java
tcp 0 0 :::9094 :::* LISTEN 29800/java
tcp 0 0 :::9092 :::* LISTEN 29572/java
One ZooKeeper instance listens on port 2181, and the three Kafka brokers listen on ports 9092, 9093 and 9094 respectively.
(3) Create topics
bin/kafka-topics.sh --create --topic topic_1 --partitions 1 --replication-factor 3 --zookeeper localhost:2181
bin/kafka-topics.sh --create --topic topic_2 --partitions 1 --replication-factor 3 --zookeeper localhost:2181
bin/kafka-topics.sh --create --topic topic_3 --partitions 1 --replication-factor 3 --zookeeper localhost:2181
Check the topics that were created:
bin/kafka-topics.sh --list --zookeeper localhost:2181
test
topic_1
topic_2
topic_3
[root@atman081 kafka_2.10-0.9.0.0]# bin/kafka-topics.sh --describe --zookeeper localhost:2181
Topic:test    PartitionCount:1    ReplicationFactor:1    Configs:
    Topic: test    Partition: 0    Leader: 0    Replicas: 0    Isr: 0
Topic:topic_1    PartitionCount:1    ReplicationFactor:3    Configs:
    Topic: topic_1    Partition: 0    Leader: 2    Replicas: 2,1,0    Isr: 2,1,0
Topic:topic_2    PartitionCount:1    ReplicationFactor:3    Configs:
    Topic: topic_2    Partition: 0    Leader: 1    Replicas: 1,2,0    Isr: 1,2,0
Topic:topic_3    PartitionCount:1    ReplicationFactor:3    Configs:
    Topic: topic_3    Partition: 0    Leader: 0    Replicas: 0,2,1    Isr: 0,2,1
In this output, Leader is the broker currently serving reads and writes for the partition, Replicas lists the brokers holding a copy of it, and Isr is the subset of replicas currently in sync with the leader. Note in particular that topic_1's leader is broker 2.
(4) Simulate clients sending and receiving messages
- Send messages:
bin/kafka-console-producer.sh --topic topic_1 --broker-list 192.168.1.181:9092,192.168.1.181:9093,192.168.1.181:9094
- Receive messages:
bin/kafka-console-consumer.sh --topic topic_1 --zookeeper 192.168.1.181:2181 --from-beginning
Note that the producer now publishes topic_1 across three brokers, which is where the distributed nature of Kafka starts to show.
(5) Kill some brokers
Kill broker 0:
From the configuration above we know that broker 0 listens on port 9092, which lets us determine its PID.
Topic state in the Kafka cluster before broker 0 is killed:
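Concretely, the PID can be read straight off the netstat output; a sketch that extracts it from the sample line captured earlier (on a live system you would pipe netstat itself through the same awk/cut step):

```shell
# Extract the PID from a netstat line: the last field is "PID/program".
# The line below is the sample output for the broker listening on 9092.
line='tcp 0 0 :::9092 :::* LISTEN 29572/java'
pid=$(echo "$line" | awk '{print $NF}' | cut -d/ -f1)
echo "$pid"   # 29572
# On the live system you would then run: kill "$pid"
```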
bin/kafka-topics.sh --describe --zookeeper localhost:2181
Topic:test    PartitionCount:1    ReplicationFactor:1    Configs:
    Topic: test    Partition: 0    Leader: 0    Replicas: 0    Isr: 0
Topic:topic_1    PartitionCount:1    ReplicationFactor:3    Configs:
    Topic: topic_1    Partition: 0    Leader: 2    Replicas: 2,1,0    Isr: 2,1,0
Topic:topic_2    PartitionCount:1    ReplicationFactor:3    Configs:
    Topic: topic_2    Partition: 0    Leader: 1    Replicas: 1,2,0    Isr: 1,2,0
Topic:topic_3    PartitionCount:1    ReplicationFactor:3    Configs:
    Topic: topic_3    Partition: 0    Leader: 2    Replicas: 0,2,1    Isr: 2,1,0
After the kill, describe again and compare. The main change is in the Isr column: broker 0 has dropped out of every in-sync replica set, and the single-replica topic test, whose only replica lived on broker 0, is left with no leader (Leader: -1).
bin/kafka-topics.sh --describe --zookeeper localhost:2181
Topic:test    PartitionCount:1    ReplicationFactor:1    Configs:
    Topic: test    Partition: 0    Leader: -1    Replicas: 0    Isr:
Topic:topic_1    PartitionCount:1    ReplicationFactor:3    Configs:
    Topic: topic_1    Partition: 0    Leader: 2    Replicas: 2,1,0    Isr: 2,1
Topic:topic_2    PartitionCount:1    ReplicationFactor:3    Configs:
    Topic: topic_2    Partition: 0    Leader: 1    Replicas: 1,2,0    Isr: 1,2
Topic:topic_3    PartitionCount:1    ReplicationFactor:3    Configs:
    Topic: topic_3    Partition: 0    Leader: 2    Replicas: 0,2,1    Isr: 2,1
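To make the Isr change explicit, a small shell sketch that compares the two lists for topic_1 (the values are copied from the describe output above):

```shell
# Compare topic_1's in-sync replica set before and after the kill.
before="2,1,0"   # Isr before broker 0 was killed
after="2,1"      # Isr after the kill
for b in $(echo "$before" | tr ',' ' '); do
  case ",$after," in
    *",$b,"*) ;;                                  # broker still in sync
    *) echo "broker $b dropped out of the ISR" ;; # broker missing afterwards
  esac
done
```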
Test whether sending and receiving messages is affected:
- Send messages:
bin/kafka-console-producer.sh --topic topic_1 --broker-list 192.168.1.181:9092,192.168.1.181:9093,192.168.1.181:9094
- Receive messages:
bin/kafka-console-consumer.sh --topic topic_1 --zookeeper 192.168.1.181:2181 --from-beginning
Both still work: as long as an in-sync replica survives, Kafka's replication keeps the topic available, so its fault tolerance is quite good.
Introduction to Kafka
1. What does Kafka consist of?
- producer: the message producer, i.e. it publishes messages
- consumer: the message consumer, i.e. it subscribes to messages
- broker: Kafka runs as a cluster of one or more servers; each server is called a broker
- zookeeper: coordinates the cluster and forwards metadata
2. How Kafka works
Producers send messages to the Kafka cluster over the network, and the cluster serves those messages to consumers.
Kafka groups messages into categories called topics: producers publish to a topic, and consumers subscribe to it.
I. ZooKeeper standalone installation and configuration
1. Download the ZooKeeper binary package
- Download:
curl -L -O http://apache.fayea.com/zookeeper/stable/zookeeper-3.4.6.tar.gz
- Extract:
tar zxvf zookeeper-3.4.6.tar.gz
2. Set environment variables
The entries referring to ZOOKEEPER_HOME are the ones added; everything else is the stock CentOS /etc/profile (path handling, history settings, umask, profile.d sourcing), left unchanged. The lines appended at the end of the file:
JAVA_HOME=/usr/java/jdk1.7.0_51
# zookeeper environment variables
ZOOKEEPER_HOME=/usr/local/logdeal/zookeeper-3.4.6
PATH=$JAVA_HOME/bin:$ZOOKEEPER_HOME/bin:$PATH
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$ZOOKEEPER_HOME/lib:
TOMCAT_HOME=/usr/local/tomcat7
CATALINA_HOME=/usr/local/tomcat7
export ZOOKEEPER_HOME
export JAVA_HOME
export PATH
export CLASSPATH
export TOMCAT_HOME
export CATALINA_HOME
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig:$PKG_CONFIG_PATH
3. Configuration
The configuration files live under $ZOOKEEPER_HOME/conf/. Rename zoo_sample.cfg to zoo.cfg; the default contents are as follows:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/tmp/zookeeper
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
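The rename itself is just a copy of the shipped sample; a sketch where a scratch directory stands in for $ZOOKEEPER_HOME/conf:

```shell
# Sketch: stage zoo.cfg from the shipped sample file. A scratch
# directory stands in for $ZOOKEEPER_HOME/conf here.
mkdir -p /tmp/zk/conf
printf 'tickTime=2000\nclientPort=2181\n' > /tmp/zk/conf/zoo_sample.cfg
cp /tmp/zk/conf/zoo_sample.cfg /tmp/zk/conf/zoo.cfg
grep clientPort /tmp/zk/conf/zoo.cfg
```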
- Configuration notes:
tickTime: the basic time unit, in milliseconds, used for heartbeats between ZooKeeper servers and between clients and servers; a heartbeat is sent every tickTime milliseconds.
dataDir: the directory where ZooKeeper stores its data snapshots; by default the transaction log is written to this directory as well.
dataLogDir: the transaction-log directory; it can be any directory. If this parameter is not set, it defaults to the same value as dataDir.
clientPort: the port on which ZooKeeper listens for client connection requests.
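Note that initLimit and syncLimit (from the default zoo.cfg above) are expressed in ticks, so the effective timeouts under the default settings work out to:

```shell
# Convert the tick-based limits to milliseconds (values from the default zoo.cfg)
tickTime=2000; initLimit=10; syncLimit=5
echo "initial sync window: $(( tickTime * initLimit )) ms"   # 20000 ms
echo "request/ack window:  $(( tickTime * syncLimit )) ms"   # 10000 ms
```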
4. Start ZooKeeper
With these settings in place, you can now start ZooKeeper:
./zkServer.sh start                # start
netstat -tunlp|grep 2181           # check the zookeeper port
./zkServer.sh stop                 # stop