Kafka: Configuration and Installation

I. Downloading and Extracting Kafka
 

Kafka download: https://mirrors.cnnic.cn/apache/kafka/
Upload the archive to the server and extract it: tar zxvf kafka_2.10-0.10.2.1.tgz
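
The archive name carries two version numbers: under Kafka's kafka_<scala>-<kafka>.tgz naming convention, the file above is Kafka 0.10.2.1 built for Scala 2.10. As a minimal sketch, the two parts can be split with plain shell parameter expansion. The function name kafka_archive_versions is hypothetical and used only for illustration.

```shell
# Split a Kafka archive name of the form kafka_<scala>-<kafka>.tgz
# into its Scala and Kafka version parts (POSIX parameter expansion only).
# kafka_archive_versions is an illustrative helper, not part of Kafka.
kafka_archive_versions() {
    name=${1##*/}          # drop any leading directory
    name=${name%.tgz}      # drop the .tgz suffix
    name=${name#kafka_}    # drop the "kafka_" prefix
    scala=${name%%-*}      # text before the first "-"
    kafka=${name#*-}       # text after the first "-"
    printf 'scala=%s kafka=%s\n' "$scala" "$kafka"
}

kafka_archive_versions kafka_2.10-0.10.2.1.tgz   # prints: scala=2.10 kafka=0.10.2.1
```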

 
II. Kafka Directory Layout
 

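/bin holds the executable scripts for operating Kafka (including the Windows versions)
/config holds Kafka's configuration files
/libs holds the dependency libraries
/logs holds the log files.

Kafka splits its server-side logs into five types: server, request, state, log-cleaner, and controller.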
 

 
III. Configuring and Starting Kafka
 

Zookeeper Configuration

 
(1) Zookeeper download: http://apache.fayea.com/zookeeper/current/

Extract: tar zxvf zookeeper-3.4.10.tar.gz
 

 
(2) Zookeeper configuration

Sample configuration file: /conf/zoo_sample.cfg

Copy zoo_sample.cfg to zoo.cfg, then adjust the following parameters:
 

 
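tickTime=2000: the heartbeat interval, in milliseconds, between Zookeeper servers and between servers and clients; one heartbeat is sent every tickTime.

initLimit=10: the maximum number of tickTime intervals tolerated while the initial connection is being established.

syncLimit=5: the maximum number of tickTime intervals allowed between a request and its acknowledgement when the Leader and a Follower exchange messages. With the values here the total is 5 * tickTime = 10 seconds.

dataDir=/tmp/zookeeper: the directory where Zookeeper stores its data. By default Zookeeper also writes its transaction log to this directory.

dataLogDir=/tmp/zookeeper: the transaction log directory. It defaults to the same path as dataDir and can be set separately.

clientPort=2181: the default port for client connections. Zookeeper listens on this port and accepts client requests.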
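
Because initLimit and syncLimit are counted in ticks rather than in milliseconds, it can help to convert them explicitly. The following is a small sketch of that arithmetic using the values from this article; zk_limit_ms is a hypothetical helper name.

```shell
# Convert a tick-based Zookeeper limit into milliseconds.
# $1 = tickTime in ms, $2 = limit expressed in ticks
# zk_limit_ms is an illustrative helper, not a Zookeeper tool.
zk_limit_ms() {
    echo $(( $1 * $2 ))
}

echo "initLimit: $(zk_limit_ms 2000 10) ms"   # prints: initLimit: 20000 ms
echo "syncLimit: $(zk_limit_ms 2000 5) ms"    # prints: syncLimit: 10000 ms
```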
 

 
zoo.cfg:

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/usr/local/bigdata/zk/data
dataLogDir=/usr/local/bigdata/zk/log
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
 

 
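# Zookeeper cluster configuration:
server.1=slave1:2888:3888
server.2=slave2:2888:3888
server.3=slave3:2888:3888

The hostnames above are mapped to IP addresses in the hosts file; the IP addresses can also be written directly.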
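
As a rough sketch, the server.N lines can be generated from a list of hosts instead of being typed by hand. Each node of an ensemble also needs a myid file in its dataDir containing its own N, a step the listing above does not show. The helper names zk_server_lines and zk_write_myid are hypothetical.

```shell
# Print one server.N line per host, numbering from 1.
# Ports 2888 and 3888 are the ones used in the listing above.
# Both helpers are illustrative, not part of Zookeeper.
zk_server_lines() {
    n=1
    for host in "$@"; do
        printf 'server.%d=%s:2888:3888\n' "$n" "$host"
        n=$((n + 1))
    done
}

# Write the myid file for one node. $1 = dataDir, $2 = this node's N
zk_write_myid() {
    mkdir -p "$1" && printf '%s\n' "$2" > "$1/myid"
}

zk_server_lines slave1 slave2 slave3
```

On slave2, for example, the call would be zk_write_myid /usr/local/bigdata/zk/data 2, assuming the dataDir from the zoo.cfg shown earlier.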

(3) From Zookeeper's bin directory:

Start: ./zkServer.sh start

Stop: ./zkServer.sh stop
 

Kafka Configuration

 
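(1) Configuration file: /config/server.properties

The key parameters are broker.id (must be unique for each broker), log.dirs (where the log data is stored), and zookeeper.connect (a single node or a cluster).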
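
Since these three settings are the ones that change from one broker to the next, a small helper can print them for a given machine. This is only a sketch; broker_overrides is a hypothetical function, and the values in the usage line are the ones that appear in this article.

```shell
# Print the three settings that differ per broker.
# $1 = broker id, $2 = log directory, remaining args = Zookeeper host:port pairs
# broker_overrides is an illustrative helper, not part of Kafka.
broker_overrides() {
    id=$1 dir=$2
    shift 2
    zk=$(IFS=,; echo "$*")    # join the remaining arguments with commas
    printf 'broker.id=%s\nlog.dirs=%s\nzookeeper.connect=%s\n' "$id" "$dir" "$zk"
}

broker_overrides 5 /opt/kafka_2.10-0.10.2.1/kafka-logs ip1:2181 ip2:2181 ip3:2181
```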
 

 
Configuration reference: https://blog.csdn.net/lizhitao/article/details/25667831

server.properties:
 

 
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# see kafka.server.KafkaConfig for additional details and defaults

############################# Server Basics #############################

# The id of the broker. This must be set to a unique integer for each broker.
broker.id=5

#hostname
host.name=ip
port=9092

# Switch to enable topic deletion or not, default value is false
#delete.topic.enable=true
delete.topic.enable=true
 

############################# Socket Server Settings #############################

# The address the socket server listens on. It will get the value returned from
# java.net.InetAddress.getCanonicalHostName() if not configured.
# FORMAT:
# listeners = listener_name://host_name:port
# EXAMPLE:
# listeners = PLAINTEXT://your.host.name:9092
#listeners=PLAINTEXT://:9092

# Hostname and port the broker will advertise to producers and consumers. If not set,
# it uses the value for "listeners" if configured. Otherwise, it will use the value
# returned from java.net.InetAddress.getCanonicalHostName().
#advertised.listeners=PLAINTEXT://your.host.name:9092

# Maps listener names to security protocols, the default is for them to be the same. See the config documentation for more details
#listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL

# The number of threads handling network requests
num.network.threads=3

# The number of threads doing disk I/O
num.io.threads=8

# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=102400

# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=102400

# The maximum size of a request that the socket server will accept (protection against OOM)
socket.request.max.bytes=104857600
 

############################# Log Basics #############################

# A comma separated list of directories under which to store log files
# Use a dedicated path; do not put it under /tmp
log.dirs=/opt/kafka_2.10-0.10.2.1/kafka-logs

# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=3

# The number of threads per data directory to be used for log recovery at startup and flushing at shutdown.
# This value is recommended to be increased for installations with data dirs located in RAID array.
num.recovery.threads.per.data.dir=1
 

############################# Log Flush Policy #############################

# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
# 1. Durability: Unflushed data may be lost if you are not using replication.
# 2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
# 3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to excessive seeks.
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.

# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000
log.flush.interval.messages=1000

# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000
log.flush.interval.ms=3000
 

############################# Log Retention Policy #############################

# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.

# The minimum age of a log file to be eligible for deletion due to age
log.retention.hours=24

# A size-based retention policy for logs. Segments are pruned from the log as long as the remaining
# segments don't drop below log.retention.bytes. Functions independently of log.retention.hours.
#log.retention.bytes=1073741824

# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824

# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=60000
 

############################# Zookeeper #############################

# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.

# Kafka with a Zookeeper cluster:
zookeeper.connect=ip1:2181,ip2:2181,ip3:2181

# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000
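
Most of server.properties is comments, so the settings that are actually in effect are easy to lose. One way to check them, sketched here with standard grep and awk, is to strip comments and blank lines and then look up individual keys. The helper names active_props and prop_value are hypothetical.

```shell
# Print only the active lines of a properties file,
# skipping comment lines (starting with #) and blank lines.
# Both helpers are illustrative, not part of Kafka.
active_props() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1"
}

# Print the value of one key ($2) in file $1, with surrounding
# whitespace trimmed. If the key appears more than once, the last wins.
prop_value() {
    active_props "$1" | awk -F= -v key="$2" '
        { k = $1; gsub(/^[ \t]+|[ \t]+$/, "", k) }
        k == key { v = substr($0, index($0, "=") + 1); gsub(/^[ \t]+|[ \t]+$/, "", v); found = v }
        END { if (found != "") print found }'
}
```

For the file above, prop_value /opt/kafka_2.10-0.10.2.1/config/server.properties log.flush.interval.messages would report the active value 1000 rather than the commented-out 10000.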
 

(2) Kafka start commands:

Change to Kafka's bin directory: /opt/kafka_2.10-0.10.2.1/bin

Standalone

Start Zookeeper (the one bundled with Kafka): ./zookeeper-server-start.sh /opt/kafka_2.10-0.10.2.1/config/zookeeper.properties &

Start Kafka: ./kafka-server-start.sh /opt/kafka_2.10-0.10.2.1/config/server.properties &

Cluster

Start Zookeeper: it must be started on every machine. From Zookeeper's bin directory: ./zkServer.sh start

Start Kafka: it must be started on every machine. From Kafka's bin directory: ./kafka-server-start.sh /opt/kafka_2.10-0.10.2.1/config/server.properties &
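
Starting the broker on every machine by hand gets tedious as the cluster grows. The sketch below only prints the per-host commands so they can be reviewed before running. It assumes passwordless SSH to each host and the same install path everywhere, neither of which this article sets up. It uses the -daemon flag of kafka-server-start.sh in place of the trailing &. print_start_cmds is a hypothetical helper.

```shell
KAFKA_HOME=/opt/kafka_2.10-0.10.2.1   # install path used in this article

# Print (do not run) the command that would start the broker on each host.
# Assumes passwordless SSH; print_start_cmds is an illustrative helper.
print_start_cmds() {
    for host in "$@"; do
        printf 'ssh %s %s/bin/kafka-server-start.sh -daemon %s/config/server.properties\n' \
            "$host" "$KAFKA_HOME" "$KAFKA_HOME"
    done
}

print_start_cmds ip1 ip2 ip3
```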
 

(3) Kafka stop command: ./kafka-server-stop.sh
 

IV. Common Kafka Commands:
 

(1) Create a topic: ./kafka-topics.sh --create --zookeeper 10.12.9.215:2181 --replication-factor 1 --partitions 1 --topic test
 

(2) Send messages with the console producer (open a new terminal window):

Standalone: ./kafka-console-producer.sh --broker-list ip:9092 --topic test

Cluster: ./kafka-console-producer.sh --broker-list ip1:9092,ip2:9092,ip3:9092 --topic test
 

(3) Receive messages with the console consumer (open a new terminal window): ./kafka-console-consumer.sh --bootstrap-server ip:9092 --topic test --from-beginning

Cluster: ./kafka-console-consumer.sh --zookeeper ip1:2181,ip2:2181,ip3:2181 --topic test --from-beginning
 

(4) List all topics: ./kafka-topics.sh --list --zookeeper ip:2181

(5) Show a topic's details: ./kafka-topics.sh --zookeeper ip:2181 --describe --topic test
 

(6) Add partitions to a topic: ./kafka-topics.sh --zookeeper ip:2181 --alter --partitions 20 --topic testKJ1

(7) Delete a topic (takes effect only when delete.topic.enable=true, as set in server.properties): ./kafka-topics.sh --zookeeper ip:2181 --delete --topic test
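
Two constraints are worth checking before running the create and alter commands: the replication factor cannot exceed the number of brokers, and a topic's partition count can only be increased. The sketch below prints the command only if the check passes. The helper names are hypothetical, and the broker count and current partition count are supplied by the caller, not queried from the cluster.

```shell
# Print a kafka-topics.sh --create command after a basic sanity check.
# $1 = zookeeper host:port, $2 = topic, $3 = partitions,
# $4 = replication factor, $5 = number of brokers (supplied by the caller)
topic_create_cmd() {
    if [ "$4" -gt "$5" ]; then
        echo "replication factor $4 exceeds broker count $5" >&2
        return 1
    fi
    printf './kafka-topics.sh --create --zookeeper %s --replication-factor %s --partitions %s --topic %s\n' \
        "$1" "$4" "$3" "$2"
}

# Print a kafka-topics.sh --alter command; the partition count may only grow.
# $1 = zookeeper host:port, $2 = topic, $3 = current partitions, $4 = new partitions
topic_grow_cmd() {
    if [ "$4" -le "$3" ]; then
        echo "new partition count must be greater than $3" >&2
        return 1
    fi
    printf './kafka-topics.sh --zookeeper %s --alter --partitions %s --topic %s\n' "$1" "$4" "$2"
}

topic_create_cmd ip:2181 test 1 1 3
topic_grow_cmd ip:2181 testKJ1 3 20    # assumes testKJ1 currently has 3 partitions
```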
 

---------- Note: every ip in this article must be replaced with the actual address ----------
 
