Build a Kafka high-availability cluster based on Zookeeper (tested in practice)

Table of contents

1. Zookeeper cluster construction

1.1 Download & Unzip

1.2 Modify configuration

1.3 Identify nodes

1.4 Start the cluster

1.5 Cluster verification

2. Kafka cluster construction

2.1 Download and unzip

2.2 Copy configuration files

2.3 Modify configuration

2.4 Start the cluster

2.5 Create test topics

2.6 Write data test


1. Zookeeper cluster construction

To ensure high availability of the cluster, the Zookeeper ensemble should have an odd number of nodes, with at least three, so a three-node cluster is built here.
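The odd-number recommendation follows from Zookeeper's majority quorum: an ensemble of n nodes stays available as long as more than half of them are up, so a fourth node adds no fault tolerance over three. The arithmetic can be sketched in plain shell (illustrative only):

```shell
# Majority quorum size for an n-node ensemble
quorum() { echo $(( $1 / 2 + 1 )); }

# Tolerated failures = n - quorum(n)
echo "3 nodes: quorum $(quorum 3), tolerates $(( 3 - $(quorum 3) )) failure(s)"
echo "4 nodes: quorum $(quorum 4), tolerates $(( 4 - $(quorum 4) )) failure(s)"
echo "5 nodes: quorum $(quorum 5), tolerates $(( 5 - $(quorum 5) )) failure(s)"
```

Three and four nodes both tolerate only one failure, which is why even node counts are avoided.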

1.1 Download & Unzip

Download the corresponding version of Zookeeper. The version I downloaded here is 3.4.14. Official download address: https://archive.apache.org/dist/zookeeper/

# Download
wget https://archive.apache.org/dist/zookeeper/zookeeper-3.4.14/zookeeper-3.4.14.tar.gz
# Unzip
tar -zxvf zookeeper-3.4.14.tar.gz

1.2 Modify configuration

Copy three copies of the zookeeper installation package. Enter the conf directory of the installation directory, copy the configuration sample zoo_sample.cfg to zoo.cfg and modify it. After modification, the contents of the three configuration files are as follows:

zookeeper01 configuration:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/zookeeper-cluster/data/01
dataLogDir=/usr/local/zookeeper-cluster/log/01
clientPort=2181

# The "1" in server.1 is the server identifier; it can be any valid number marking which
# server node this is, and the same identifier must be written to the myid file under dataDir
# Specify the inter-cluster communication port and the election port
server.1=host1:2287:3387
server.2=host2:2287:3387
server.3=host3:2287:3387

When the nodes are deployed on separate servers, the communication port and election port of each node can be the same; just replace host1/host2/host3 with the IP address of the host where each node is located.

The zookeeper02 configuration differs from zookeeper01 only in dataDir and dataLogDir:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/zookeeper-cluster/data/02
dataLogDir=/usr/local/zookeeper-cluster/log/02
clientPort=2181

server.1=host1:2287:3387
server.2=host2:2287:3387
server.3=host3:2287:3387

The zookeeper03 configuration differs from zookeeper01 and zookeeper02 only in dataDir and dataLogDir:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/zookeeper-cluster/data/03
dataLogDir=/usr/local/zookeeper-cluster/log/03
clientPort=2181

server.1=host1:2287:3387
server.2=host2:2287:3387
server.3=host3:2287:3387

Configuration parameter description:

- tickTime: the basic time unit used by Zookeeper, in milliseconds;
- initLimit: the maximum number of ticks a follower may take to connect to and sync with the leader during startup;
- syncLimit: the maximum number of ticks allowed between a request and an acknowledgment when a follower syncs with the leader;
- dataDir: the directory for snapshot data (and the myid file);
- dataLogDir: the directory for transaction logs;
- clientPort: the port clients connect to;
- server.N=host:port1:port2: N is the node identifier (matching myid), port1 is the port for communication with the leader, and port2 is the port used for leader election.

1.3 Identify nodes

Create new myid files in the data storage directories of the three nodes respectively, and write the corresponding node identifiers. The Zookeeper cluster identifies the cluster nodes through the myid file, communicates with the nodes through the node communication port and election port configured above, and elects the leader node.

Create storage directory:

# dataDir of zookeeper01
mkdir -vp  /usr/local/zookeeper-cluster/data/01
# dataDir of zookeeper02
mkdir -vp  /usr/local/zookeeper-cluster/data/02
# dataDir of zookeeper03
mkdir -vp  /usr/local/zookeeper-cluster/data/03

Create and write the node ID to the myid file:

#server1
echo "1" > /usr/local/zookeeper-cluster/data/01/myid
#server2
echo "2" > /usr/local/zookeeper-cluster/data/02/myid
#server3
echo "3" > /usr/local/zookeeper-cluster/data/03/myid
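The six commands above can also be written as a single loop; the sketch below uses a /tmp path so it can be tried safely before writing to the real dataDir:

```shell
# Create each node's dataDir and write its id to the myid file
# (illustrative: uses /tmp instead of /usr/local/zookeeper-cluster)
base=/tmp/zookeeper-cluster/data
for i in 1 2 3; do
  mkdir -p "$base/0$i"
  echo "$i" > "$base/0$i/myid"
done
cat "$base/01/myid" "$base/02/myid" "$base/03/myid"
```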

1.4 Start the cluster

Start three nodes respectively:

# Start the node (run on each of the three servers)
/usr/app/zookeeper-cluster/bin/zkServer.sh start

1.5 Cluster verification

Use jps to view the processes, and use zkServer.sh status to view the status of each node in the cluster. If everything is working, all three node processes will have started successfully, with two nodes acting as followers and one node as the leader.
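To verify the roles from a script rather than by eye, you can grep the Mode: line that zkServer.sh status prints; a healthy three-node ensemble reports exactly one leader and two followers. A sketch with sample output inlined (a real check would capture the output of zkServer.sh status on each node instead):

```shell
# Sample of the Mode: lines printed by `zkServer.sh status` on the three nodes
# (illustrative stand-in for the real command output)
status_output="Mode: follower
Mode: leader
Mode: follower"

leaders=$(printf '%s\n' "$status_output" | grep -c 'Mode: leader')
followers=$(printf '%s\n' "$status_output" | grep -c 'Mode: follower')
echo "leaders=$leaders followers=$followers"
```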

2. Kafka cluster construction

2.1 Download and unzip

The official download address of the Kafka installation package: http://kafka.apache.org/downloads. The version downloaded in this use case is 2.2.0. Download command:

# Download
wget https://www-eu.apache.org/dist/kafka/2.2.0/kafka_2.12-2.2.0.tgz
# Unzip
tar -xzf kafka_2.12-2.2.0.tgz

A note on the naming of Kafka installation packages: taking kafka_2.12-2.2.0.tgz as an example, the leading 2.12 is the Scala version (Kafka is developed in Scala) and the trailing 2.2.0 is the Kafka version number.
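The two version numbers can be pulled apart with plain shell parameter expansion, for example:

```shell
pkg=kafka_2.12-2.2.0.tgz
name=${pkg%.tgz}                # kafka_2.12-2.2.0
scala_ver=${name#kafka_}        # 2.12-2.2.0
scala_ver=${scala_ver%%-*}      # 2.12  (Scala version)
kafka_ver=${name#*-}            # 2.2.0 (Kafka version)
echo "Scala $scala_ver, Kafka $kafka_ver"
```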

2.2 Copy configuration files

Enter the config directory under the unpacked directory and make three copies of the configuration file:

cp server.properties server-1.properties
cp server.properties server-2.properties
cp server.properties server-3.properties

2.3 Modify configuration

Modify some configurations in the three configuration files respectively, as follows:

server-1:

# The id of the broker: the unique identifier of each node in the cluster
broker.id=0
# Listener address
listeners=PLAINTEXT://hostname1:9092
# Data storage location
log.dirs=/usr/local/kafka/logs
# Zookeeper connection address
zookeeper.connect=hostname1:2181,hostname2:2181,hostname3:2181

server-2:

broker.id=1
listeners=PLAINTEXT://hostname2:9092
log.dirs=/usr/local/kafka/logs
zookeeper.connect=hostname1:2181,hostname2:2181,hostname3:2181

server-3:

broker.id=2
listeners=PLAINTEXT://hostname3:9092
log.dirs=/usr/local/kafka/logs
zookeeper.connect=hostname1:2181,hostname2:2181,hostname3:2181

Note that log.dirs is the storage location of the data log, or more precisely, of the partition data; it is not the location of the application's runtime log, which is configured through log4j.properties in the same directory.
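Since the three configuration files differ only in broker.id and the listeners host, they can also be generated from a template instead of edited by hand. Below is an illustrative sketch that works on a minimal stand-in template in a temp directory (a real server.properties contains many more keys):

```shell
cd "$(mktemp -d)"
# Minimal stand-in template with placeholders (illustrative only)
cat > server.properties.tpl <<'EOF'
broker.id=__ID__
listeners=PLAINTEXT://__HOST__:9092
log.dirs=/usr/local/kafka/logs
zookeeper.connect=hostname1:2181,hostname2:2181,hostname3:2181
EOF

# Generate server-1..3.properties, substituting each broker's id and host
for i in 1 2 3; do
  sed -e "s/__ID__/$((i - 1))/" -e "s/__HOST__/hostname$i/" \
      server.properties.tpl > server-$i.properties
done
grep broker.id server-*.properties
```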

2.4 Start the cluster

Start the three Kafka nodes, specifying a different configuration file for each. After starting, you can use jps to view the processes; there should be three zookeeper processes and three kafka processes.

bin/kafka-server-start.sh config/server-1.properties
bin/kafka-server-start.sh config/server-2.properties
bin/kafka-server-start.sh config/server-3.properties

2.5 Create test topics

Create a test topic:

bin/kafka-topics.sh --create --bootstrap-server kubemaster01.xltest.sz.xunlei.com:9092 \
					--replication-factor 3 \
					--partitions 1 --topic my-first-topic

After creation, you can use the following command to view the created topic information:

bin/kafka-topics.sh --describe --bootstrap-server kubemaster01.xltest.sz.xunlei.com:9092 --topic my-first-topic

You can see that partition 0 has three replicas, 0, 1, and 2, all of which are available and listed in the ISR (in-sync replica) list, with replica 1 as the leader. This shows that the cluster has been built successfully.
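For reference, the output of the --describe command looks roughly like the following (illustrative; the Leader, Replicas, and Isr values depend on which brokers the replicas landed on in your cluster):

```
Topic:my-first-topic	PartitionCount:1	ReplicationFactor:3	Configs:segment.bytes=1073741824
	Topic: my-first-topic	Partition: 0	Leader: 1	Replicas: 0,1,2	Isr: 0,1,2
```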

2.6 Write data test

Write data from kubemaster01:

bin/kafka-console-producer.sh --broker-list kubemaster01.xltest.sz.xunlei.com:9092 --topic my-first-topic

Get data from kubenode01:

bin/kafka-console-consumer.sh --bootstrap-server kubenode01.xltest.sz.xunlei.com:9092 --from-beginning --topic my-first-topic

Origin blog.csdn.net/Franklin7B/article/details/132715227