Kafka Series (Part 2): Building a Highly Available Kafka Cluster Based on ZooKeeper

1. ZooKeeper Cluster Setup

To ensure high availability, a ZooKeeper cluster should have an odd number of nodes, with a minimum of three (a cluster of 2N+1 nodes tolerates N failures), so this guide builds a three-node cluster.

1.1 Download & unzip

Download the desired ZooKeeper version; version 3.4.14 is used here. Official download page: https://archive.apache.org/dist/zookeeper/

# download
wget https://archive.apache.org/dist/zookeeper/zookeeper-3.4.14/zookeeper-3.4.14.tar.gz
# extract
tar -zxvf zookeeper-3.4.14.tar.gz

1.2 Modify the Configuration

Make three copies of the ZooKeeper installation directory. In each copy's conf directory, copy the configuration template zoo_sample.cfg to zoo.cfg and modify it. The three modified configuration files are as follows:

zookeeper01 configuration:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/zookeeper-cluster/data/01
dataLogDir=/usr/local/zookeeper-cluster/log/01
clientPort=2181

# The "1" in server.1 is the server identifier; it can be any valid number and marks
# which node this is. The same identifier must be written to the myid file under dataDir.
# The two ports are the intra-cluster communication port and the leader-election port.
server.1=127.0.0.1:2287:3387
server.2=127.0.0.1:2288:3388
server.3=127.0.0.1:2289:3389

If the nodes are deployed on multiple servers, the communication and election ports can be the same on every node; just change the host part of each entry to the IP address of the node it refers to.

zookeeper02 configuration, which differs from zookeeper01 only in dataDir, dataLogDir, and clientPort:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/zookeeper-cluster/data/02
dataLogDir=/usr/local/zookeeper-cluster/log/02
clientPort=2182

server.1=127.0.0.1:2287:3387
server.2=127.0.0.1:2288:3388
server.3=127.0.0.1:2289:3389

zookeeper03 configuration, which differs from zookeeper01 and 02 only in dataDir, dataLogDir, and clientPort:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/zookeeper-cluster/data/03
dataLogDir=/usr/local/zookeeper-cluster/log/03
clientPort=2183

server.1=127.0.0.1:2287:3387
server.2=127.0.0.1:2288:3388
server.3=127.0.0.1:2289:3389
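
Since the three files differ only in dataDir, dataLogDir, and clientPort, they can also be generated in one go. A minimal sketch (the conf-01/02/03 output directories are illustrative; in practice you would write into each copy's conf directory):

```shell
#!/usr/bin/env bash
# Generate zoo.cfg for the three nodes; only dataDir, dataLogDir,
# and clientPort vary between them.
BASE=/usr/local/zookeeper-cluster    # base directory from the configs above
for i in 01 02 03; do
  port=$((2180 + 10#$i))             # 2181, 2182, 2183 (10# avoids octal parsing)
  mkdir -p "conf-$i"
  cat > "conf-$i/zoo.cfg" <<EOF
tickTime=2000
initLimit=10
syncLimit=5
dataDir=$BASE/data/$i
dataLogDir=$BASE/log/$i
clientPort=$port
server.1=127.0.0.1:2287:3387
server.2=127.0.0.1:2288:3388
server.3=127.0.0.1:2289:3389
EOF
done
```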

Configuration parameters:

  • tickTime: the basic time unit, in milliseconds; other timeouts are expressed as multiples of it, e.g. a session timeout of N * tickTime;
  • initLimit: the time allowed for follower nodes to connect to and sync with the leader during cluster initialization, expressed as a multiple of tickTime;
  • syncLimit: the maximum time allowed between a request and its response when the leader and followers exchange heartbeat messages, expressed as a multiple of tickTime;
  • dataDir: data storage location;
  • dataLogDir: transaction log directory;
  • clientPort: the port clients connect to, 2181 by default.
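
As a worked example of these time units: with tickTime=2000, initLimit=10 gives followers 10 × 2000 ms = 20 s to connect and sync with the leader, and syncLimit=5 allows at most 5 × 2000 ms = 10 s between a heartbeat request and its response:

```shell
# Timeout windows implied by the configuration above.
tickTime=2000   # ms
initLimit=10
syncLimit=5
echo "init window: $((initLimit * tickTime)) ms"   # 20000 ms = 20 s
echo "sync window: $((syncLimit * tickTime)) ms"   # 10000 ms = 10 s
```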

1.3 Identify the Nodes

Create a myid file in each of the three data directories and write the corresponding node identifier into it. ZooKeeper uses the myid file to identify cluster nodes, and the nodes communicate over the communication and election ports configured above to elect a leader.

Create a storage directory:

# dataDir for node 01
mkdir -vp  /usr/local/zookeeper-cluster/data/01
# dataDir for node 02
mkdir -vp  /usr/local/zookeeper-cluster/data/02
# dataDir for node 03
mkdir -vp  /usr/local/zookeeper-cluster/data/03

Create the myid files and write in the node identifiers:

#server1
echo "1" > /usr/local/zookeeper-cluster/data/01/myid
#server2
echo "2" > /usr/local/zookeeper-cluster/data/02/myid
#server3
echo "3" > /usr/local/zookeeper-cluster/data/03/myid

1.4 Start the Cluster

Start the three nodes in turn:

# start node 1
/usr/app/zookeeper-cluster/zookeeper01/bin/zkServer.sh start
# start node 2
/usr/app/zookeeper-cluster/zookeeper02/bin/zkServer.sh start
# start node 3
/usr/app/zookeeper-cluster/zookeeper03/bin/zkServer.sh start

1.5 Cluster Verification

Use jps to view the processes, and use zkServer.sh status to check each node's status. If all three processes started successfully, two nodes should report follower status and one node leader status.

2. Kafka Cluster Setup

2.1 Download & unzip

The official Kafka download page is http://kafka.apache.org/downloads ; version 2.2.0 is used here. Download commands:

# download
wget https://www-eu.apache.org/dist/kafka/2.2.0/kafka_2.12-2.2.0.tgz
# extract
tar -xzf kafka_2.12-2.2.0.tgz

A note on the Kafka package naming: taking kafka_2.12-2.2.0.tgz as an example, the leading 2.12 is the Scala version (Kafka is written in Scala), and the trailing 2.2.0 is the Kafka version.
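
This split can be checked with plain shell parameter expansion:

```shell
# kafka_<scala-version>-<kafka-version>.tgz
f=kafka_2.12-2.2.0.tgz
base=${f%.tgz}       # kafka_2.12-2.2.0
ver=${base#kafka_}   # 2.12-2.2.0
scala=${ver%%-*}     # 2.12  -> Scala version Kafka was built with
kafka=${ver#*-}      # 2.2.0 -> Kafka version
echo "Scala: $scala, Kafka: $kafka"
```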

2.2 Copy the Configuration Files

Enter the config directory under the extracted directory and make three copies of the configuration file:

cp server.properties server-1.properties
cp server.properties server-2.properties
cp server.properties server-3.properties

2.3 Modify the Configuration

Modify the relevant parts of the three configuration files as follows:

server-1.properties:

# The id of the broker. A unique identifier for each node in the cluster
broker.id=0
# listener address
listeners=PLAINTEXT://hadoop001:9092
# data storage location
log.dirs=/usr/local/kafka-logs/00
# ZooKeeper connection string
zookeeper.connect=hadoop001:2181,hadoop001:2182,hadoop001:2183

server-2.properties:

broker.id=1
listeners=PLAINTEXT://hadoop001:9093
log.dirs=/usr/local/kafka-logs/01
zookeeper.connect=hadoop001:2181,hadoop001:2182,hadoop001:2183

server-3.properties:

broker.id=2
listeners=PLAINTEXT://hadoop001:9094
log.dirs=/usr/local/kafka-logs/02
zookeeper.connect=hadoop001:2181,hadoop001:2182,hadoop001:2183
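
Since the broker files likewise differ in only a few keys, they can also be derived from the stock server.properties with sed instead of editing each copy by hand. A sketch; the stub server.properties below stands in for the real file and contains only the keys being rewritten:

```shell
#!/usr/bin/env bash
set -e
# Stand-in for the stock config/server.properties (only the keys we rewrite).
mkdir -p config
cat > config/server.properties <<'EOF'
broker.id=0
#listeners=PLAINTEXT://:9092
log.dirs=/tmp/kafka-logs
zookeeper.connect=localhost:2181
EOF

# Derive server-1/2/3.properties; hadoop001 and the paths follow the configuration above.
for i in 1 2 3; do
  id=$((i - 1))                      # broker.id: 0, 1, 2
  port=$((9091 + i))                 # listener ports: 9092, 9093, 9094
  sed -E -e "s|^broker.id=.*|broker.id=$id|" \
         -e "s|^#?listeners=.*|listeners=PLAINTEXT://hadoop001:$port|" \
         -e "s|^log.dirs=.*|log.dirs=/usr/local/kafka-logs/0$id|" \
         -e "s|^zookeeper.connect=.*|zookeeper.connect=hadoop001:2181,hadoop001:2182,hadoop001:2183|" \
         config/server.properties > "config/server-$i.properties"
done
```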

Note that log.dirs is the storage location of the message data, i.e. the partition logs, not the location of Kafka's runtime logs. The runtime log location is configured in log4j.properties in the same directory.

2.4 Start the Cluster

Start the three Kafka nodes, each with its own configuration file. After starting, jps should show three ZooKeeper processes and three Kafka processes. (The commands below run in the foreground; add the -daemon flag to run the brokers in the background.)

bin/kafka-server-start.sh config/server-1.properties
bin/kafka-server-start.sh config/server-2.properties
bin/kafka-server-start.sh config/server-3.properties

2.5 Create a test topic

Create a test topic:

bin/kafka-topics.sh --create --bootstrap-server hadoop001:9092 \
                    --replication-factor 3 \
                    --partitions 1 --topic my-replicated-topic

After creation, use the following command to view the topic's details:

bin/kafka-topics.sh --describe --bootstrap-server hadoop001:9092 --topic my-replicated-topic

The output shows that partition 0 has three replicas, 0, 1, and 2, and that all three are available and in the ISR (in-sync replica) list, with replica 1 as the leader. At this point the cluster has been set up successfully.

More articles in this big data series can be found in the GitHub open source project: Big Data Getting Started.

Origin www.cnblogs.com/heibaiying/p/11371330.html