Zookeeper mechanism and Kafka message queue cluster deployment


1. Zookeeper

1. Introduction to Zookeeper

Zookeeper: an open-source distributed service and Apache project that provides coordination services for distributed frameworks

2. Zookeeper features

①. Zookeeper is a cluster composed of one leader and multiple followers

②. The cluster can serve normally as long as more than half of its nodes are alive, so an odd number of nodes (at least 3) is recommended

③. The global data is consistent: each server saves a copy of the same data, so no matter which server the client connects to, the data is the same

④. Update requests are executed sequentially: requests from the same client are executed in the order in which they were sent (first in, first out)

⑤. Atomicity: a data update either succeeds on every server or fails on every server
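The majority rule in ② can be sketched as a quick check: a cluster of N nodes keeps serving only while more than N/2 nodes are alive (the function name is illustrative):

```shell
# quorum_ok TOTAL ALIVE -> exit 0 if the alive nodes form a strict majority
quorum_ok() {
  total=$1
  alive=$2
  # strict majority: alive > total / 2, written to avoid integer-division pitfalls
  [ $(( alive * 2 )) -gt "$total" ]
}

quorum_ok 3 2 && echo "3 nodes, 2 alive: serving"
quorum_ok 3 1 || echo "3 nodes, 1 alive: not serving"
# A 4-node cluster tolerates only 1 failure, the same as 3 nodes -- hence odd sizes
quorum_ok 4 2 || echo "4 nodes, 2 alive: not serving"
```

This also shows why 4 nodes buy no more fault tolerance than 3: both stop serving after two failures.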

3. Zookeeper working mode and mechanism

Zookeeper working mode: file system + notification mechanism

Working mechanism:

1. When each server goes online, it registers its information with the Zookeeper cluster

2. The client obtains the list of online servers from the Zookeeper cluster and sets a watch on it

3. When a server goes online or offline, Zookeeper updates the list and notifies the clients

4. The client receives the notification and re-fetches the online server list from Zookeeper

4. Zookeeper application scenarios and election mechanism

Application scenario:

Unified naming service, unified configuration management, unified cluster management, dynamic online and offline service nodes, soft load balancing

Election mechanism:

Start the election mechanism for the first time:

Server 1 starts and votes for itself with its myid (you can set it per node; here it is 1). There is no majority yet, so no leader is elected and server 1 remains in the LOOKING state

Server 2 starts and votes for its own myid, 2; server 1 switches its vote to server 2 because myid 2 is greater than 1. There is still no majority, so both remain in the LOOKING state

Server 3 starts and votes for itself with myid 3; servers 1 and 2, whose myids are smaller, switch their votes to server 3, which now holds a majority and becomes the leader

Server 4 starts; since a leader already exists, it joins the cluster as a follower.

Not the first time to start the election mechanism:

SID: server ID, which uniquely identifies a machine in the Zookeeper cluster; it must not repeat and is equal to the machine's myid

ZXID: transaction ID, which identifies a change of server state; its value depends on how far the server has processed the clients' update requests

Epoch: the code name of each leader's term. While there is no leader, all servers in the same round of voting share the same value, which grows with each round of voting

①. The vote with the higher Epoch wins outright and determines the leader

②. If the Epoch values are equal, the vote with the larger transaction ID (ZXID) wins

③. If the transaction IDs are also equal, the vote with the larger server ID wins
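The three rules amount to comparing votes as (Epoch, ZXID, SID) tuples; a small sketch (the function name is hypothetical):

```shell
# vote_winner E1 Z1 S1 E2 Z2 S2 -> prints 1 or 2, whichever vote wins
vote_winner() {
  e1=$1; z1=$2; s1=$3; e2=$4; z2=$5; s2=$6
  if [ "$e1" -ne "$e2" ]; then
    [ "$e1" -gt "$e2" ] && echo 1 || echo 2   # rule 1: higher Epoch wins
  elif [ "$z1" -ne "$z2" ]; then
    [ "$z1" -gt "$z2" ] && echo 1 || echo 2   # rule 2: higher ZXID wins
  else
    [ "$s1" -gt "$s2" ] && echo 1 || echo 2   # rule 3: higher SID wins
  fi
}

vote_winner 2 100 1  1 200 3   # prints 1: Epoch 2 beats Epoch 1 despite smaller ZXID/SID
```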

5. Zookeeper cluster deployment

Prepare 3 servers for Zookeeper cluster
192.168.154.10
192.168.154.11
192.168.154.12
1. Preparation before installation

//Disable the firewall
systemctl stop firewalld
systemctl disable firewalld
setenforce 0
//Install the JDK
yum install -y java-1.8.0-openjdk java-1.8.0-openjdk-devel
java -version
//Download the installation package
Official download URL: https://archive.apache.org/dist/zookeeper/

2. Install Zookeeper

cd /opt
tar -zxvf apache-zookeeper-3.6.3-bin.tar.gz
mv apache-zookeeper-3.6.3-bin /usr/local/zookeeper-3.6.3
//Modify the configuration file
cd /usr/local/zookeeper-3.6.3/conf/
cp zoo_sample.cfg zoo.cfg
vim zoo.cfg
tickTime=2000   #Heartbeat interval between Zookeeper servers and clients, in milliseconds
initLimit=10    #Maximum number of heartbeats (tickTime intervals) the Leader tolerates while a Follower makes its initial connection; here 10*2s
syncLimit=5     #Timeout for synchronization between Leader and Follower; here, if it exceeds 5*2s, the Leader considers the Follower dead and removes it from the server list
dataDir=/usr/local/zookeeper-3.6.3/data      ●Modified: directory where Zookeeper stores its data; it must be created separately
dataLogDir=/usr/local/zookeeper-3.6.3/logs   ●Added: directory where logs are stored; it must be created separately
clientPort=2181   #Client connection port
#Add the cluster information
server.1=192.168.154.10:3188:3288
server.2=192.168.154.11:3188:3288
server.3=192.168.154.12:3188:3288

server.A=B:C:D
●A is a number indicating the server's ID. In cluster mode, create a file named myid in the directory specified by dataDir in zoo.cfg; that file contains the value of A. When Zookeeper starts, it reads this file and compares its content with the configuration in zoo.cfg to determine which server it is.
●B is the address of this server.
●C is the port this server's Follower uses to exchange information with the Leader of the cluster.
●D is the port the servers use to communicate with each other during an election, which is held to choose a new Leader when the current Leader fails.
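For reference, the server.A=B:C:D lines can be generated from an ordered host list; this sketch uses the IPs and ports of this guide's cluster:

```shell
# Build the zoo.cfg cluster section from an ordered host list
HOSTS="192.168.154.10 192.168.154.11 192.168.154.12"
i=1
entries=""
for h in $HOSTS; do
  # A = server number (matches each node's myid), C = sync port, D = election port
  entries="${entries}server.$i=$h:3188:3288
"
  i=$((i + 1))
done
printf '%s' "$entries"
```

The order of HOSTS fixes each node's A value, which must agree with the myid file created on that node below.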

//Copy the prepared Zookeeper configuration file to the other machines
scp /usr/local/zookeeper-3.6.3/conf/zoo.cfg 192.168.154.11:/usr/local/zookeeper-3.6.3/conf/
scp /usr/local/zookeeper-3.6.3/conf/zoo.cfg 192.168.154.12:/usr/local/zookeeper-3.6.3/conf/
//Create the data and log directories on every node
mkdir /usr/local/zookeeper-3.6.3/data
mkdir /usr/local/zookeeper-3.6.3/logs
//Create a myid file in the directory specified by dataDir on each node
echo 1 > /usr/local/zookeeper-3.6.3/data/myid    #on 192.168.154.10
echo 2 > /usr/local/zookeeper-3.6.3/data/myid    #on 192.168.154.11
echo 3 > /usr/local/zookeeper-3.6.3/data/myid    #on 192.168.154.12
//Configure the Zookeeper startup script
vim /etc/init.d/zookeeper.sh
#!/bin/bash
#chkconfig:2345 20 90
#description:Zookeeper Service Control Script
ZK_HOME='/usr/local/zookeeper-3.6.3'
case $1 in
start)
	echo "---------- Starting Zookeeper ------------"
	$ZK_HOME/bin/zkServer.sh start
;;
stop)
	echo "---------- Stopping Zookeeper ------------"
	$ZK_HOME/bin/zkServer.sh stop
;;
restart)
	echo "---------- Restarting Zookeeper ------------"
	$ZK_HOME/bin/zkServer.sh restart
;;
status)
	echo "---------- Zookeeper status ------------"
	$ZK_HOME/bin/zkServer.sh status
;;
*)
    echo "Usage: $0 {start|stop|restart|status}"
esac
//Enable start on boot
chmod +x /etc/init.d/zookeeper.sh
chkconfig --add zookeeper.sh
//Start Zookeeper on each node
service zookeeper start
//Check the current status
service zookeeper status

netstat -lntp | grep 2181


2. Message queue Kafka

1. Why message queues exist

In a high-concurrency environment, synchronous requests cannot be processed in time and often block. For example, a large number of concurrent requests to the database can cause row and table locks, and eventually so many request threads pile up that an avalanche occurs.

Avalanche: under high concurrency, the Redis server cannot handle the flood of simultaneous requests, Redis crashes, and the requests then query the database directly.

2. The benefits of using message queues

1) Decoupling

Allows the processing on both sides to be extended or modified independently, as long as both obey the same interface constraints

2) Recoverability

The failure of one component will not bring down the entire system. The message queue reduces coupling between processes, so even if a message-processing process dies, the messages already in the queue can still be processed after the system recovers.

3) Buffering

Helps control and optimize the speed of data flowing through the system, resolving the mismatch between the speeds of producing and consuming messages

4) Flexibility & peak-handling capacity

Lets key components withstand sudden spikes in access pressure instead of crashing completely under the overload

5) Asynchronous communication

Allows users to put a message on the queue without processing it immediately: enqueue as many messages as needed and process them when required
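As a toy illustration of decoupling, buffering, and asynchrony (not how Kafka stores data), a plain file can play the role of the queue: the producer appends and returns immediately, and the consumer drains in FIFO order whenever it is ready. The helper names and file path are made up for the example:

```shell
QUEUE=$(mktemp)                      # stand-in for the broker's storage
produce() { echo "$1" >> "$QUEUE"; } # producer returns immediately (asynchronous)
consume() {                          # consumer pulls the oldest message (FIFO)
  head -n 1 "$QUEUE"
  sed -i '1d' "$QUEUE"
}

produce "order-1"; produce "order-2"; produce "order-3"   # a burst arrives at once
consume   # prints "order-1" -- processed later, in arrival order
```

The producer never waits on the consumer, which is exactly the decoupling the list above describes.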

3. Two modes of message queue

Point-to-point (one-to-one): the consumer actively pulls data from the queue, and a message is deleted once the consumer has pulled it.

Publish/subscribe mode (one-to-many, also called the observer pattern): messages are not cleared after consumers consume the data

1) The producer publishes a message to a topic, and multiple consumers subscribe to it at the same time

2) The observer (which watches each consumer's consumption capability in real time, i.e., its ability to process data) observes the whole message queue and delivers data to consumers according to their configured capacity

3. Introduction to Kafka

Kafka: a distributed, partitioned, multi-replica message queue (MQ) based on the publish/subscribe model, mainly used for real-time processing of logs and big data

4. Features of Kafka

1) High throughput: hundreds of thousands of messages can be processed per second, with latency as low as a few milliseconds.

2) Persistence and reliability: a complete message storage mechanism persists data to disk, ensuring both efficiency and durability

3) Distributed: producer data is replicated across the machines, so data is not lost when a single machine goes down.

4) Fault tolerance: nodes in the cluster are allowed to fail; with n replicas, up to n-1 nodes may fail

5) High concurrency: thousands of clients can read and write at the same time

5. Introduction to kafka system architecture terms

①broker: a Kafka server is a broker; a cluster consists of multiple brokers, and one broker can host multiple topics

②Producer: producer, i.e., the party that writes messages to the broker

③Consumer: consumer, i.e., the party that reads messages from the broker

④Zookeeper: Kafka uses Zookeeper to manage the cluster's metadata and the election of the controller.

⑤topic: topic. Each message belongs to a topic; Kafka groups messages by topic, which is a logical classification

⑥partition: partition

Messages under the same topic can be further divided into multiple partitions, and a partition belongs to only one topic. Kafka guarantees ordering only within a partition, not across the different partitions of a topic. Every topic has at least one partition, and the data in each partition is stored in multiple segment files.

Partition data routing rules:

1. If a partition is specified, use it directly. 2. If no partition is specified but a key is, choose the partition from the hash of the key modulo the number of partitions. 3. If neither is specified, choose a partition by round-robin polling
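Rule 2 can be sketched in shell. Kafka's default partitioner actually hashes the key with murmur2; cksum below is only a stand-in to show the "hash modulo partition count" idea:

```shell
# partition_for_key KEY NUM_PARTITIONS -> partition index in [0, NUM_PARTITIONS)
partition_for_key() {
  key=$1; partitions=$2
  hash=$(printf '%s' "$key" | cksum | cut -d' ' -f1)   # illustrative hash only
  echo $(( hash % partitions ))
}

partition_for_key "user-42" 3   # the same key always lands in the same partition
```

This is why records sharing a key keep their relative order: they always go to one partition, where ordering is guaranteed.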

The reason for partitions: why does a topic have multiple partitions?

1. It makes the cluster easy to scale: each partition can be placed to suit the machine it lives on, and a topic can be composed of multiple partitions, so the cluster can handle data of any size;

2. It improves concurrency, because reads and writes can be performed per partition;

⑦Replica: replica. A partition can have multiple replicas to improve disaster recovery; usually two replicas are configured per partition

⑧Offset: offset. The position record the consumer keeps on Zookeeper, marking how far it has consumed

⑨The leader handles reads and writes; the followers are only responsible for replication and backup

6. Kafka architecture and process

1) The producer sends the data it produces to the brokers, i.e., the Kafka server cluster

2) The Kafka cluster stores the data in topics; each topic has multiple partitions, whose replicas are backed up on other brokers

3) The data is stored in partitions; the Kafka cluster registers itself in Zookeeper, and Zookeeper notifies consumers of the list of online Kafka servers

4) The consumer receives the online list from Zookeeper and pulls data from the brokers

5) The consumer saves its offset in Zookeeper, so that after a crash it knows where to resume consumption

7. Kafka cluster deployment

Perform the following on the three machines already running the Zookeeper cluster

1. Install Kafka

cd /opt/
tar zxvf kafka_2.13-2.7.1.tgz
mv kafka_2.13-2.7.1 /usr/local/kafka
//Modify the configuration file
cd /usr/local/kafka/config/
cp server.properties{,.bak}
vim server.properties
broker.id=0    #Line 21: globally unique broker ID; it must differ per broker, so set broker.id=1 and broker.id=2 on the other machines
listeners=PLAINTEXT://192.168.154.10:9092    #Line 31: listening IP and port; if modified, each broker's IP must be distinct, or the default can be kept
num.network.threads=3    #Line 42: number of threads the broker uses to handle network requests; usually no need to change
num.io.threads=8         #Line 45: number of threads handling disk I/O; should be larger than the number of disks
socket.send.buffer.bytes=102400       #Line 48: send socket buffer size
socket.receive.buffer.bytes=102400    #Line 51: receive socket buffer size
socket.request.max.bytes=104857600    #Line 54: maximum size of a socket request
log.dirs=/usr/local/kafka/logs        #Line 60: path where Kafka run logs, and the data, are stored
num.partitions=1    #Line 65: default number of partitions per topic on this broker; overridden by the value given at topic creation
num.recovery.threads.per.data.dir=1    #Line 69: number of threads used to recover and clean the data under the data dirs
log.retention.hours=168    #Line 103: maximum retention time of a segment (data) file in hours; default 7 days, after which it is deleted
log.segment.bytes=1073741824    #Line 110: maximum size of a segment file, default 1G; a new segment file is created when exceeded
zookeeper.connect=192.168.154.10:2181,192.168.154.11:2181,192.168.154.12:2181    #Line 123: Zookeeper cluster connection addresses
//Modify the environment variables
vim /etc/profile
export KAFKA_HOME=/usr/local/kafka
export PATH=$PATH:$KAFKA_HOME/bin

source /etc/profile
//Configure the Kafka startup script
vim /etc/init.d/kafka.sh
#!/bin/bash
#chkconfig:2345 22 88
#description:Kafka Service Control Script
KAFKA_HOME='/usr/local/kafka'
case $1 in
start)
	echo "---------- Starting Kafka ------------"
	${KAFKA_HOME}/bin/kafka-server-start.sh -daemon ${KAFKA_HOME}/config/server.properties
;;
stop)
	echo "---------- Stopping Kafka ------------"
	${KAFKA_HOME}/bin/kafka-server-stop.sh
;;
restart)
	$0 stop
	$0 start
;;
status)
	echo "---------- Kafka status ------------"
	count=$(ps -ef | grep kafka | egrep -cv "grep|$$")
	if [ "$count" -eq 0 ];then
        echo "kafka is not running"
    else
        echo "kafka is running"
    fi
;;
*)
    echo "Usage: $0 {start|stop|restart|status}"
esac
//Enable start on boot
chmod +x /etc/init.d/kafka.sh
chkconfig --add kafka.sh

//Start Kafka on each node
service kafka start

service kafka status
netstat -lntp | grep 9092


2. Kafka command line operation

//Create a topic
kafka-topics.sh --create --zookeeper 192.168.154.10:2181,192.168.154.11:2181,192.168.154.12:2181 --replication-factor 2 --partitions 3 --topic test


--zookeeper: specifies the addresses of the Zookeeper cluster servers; multiple IPs are separated by commas (one IP is usually enough)
--replication-factor: defines the number of replicas; 1 means a single copy, and 2 is recommended
--partitions: defines the number of partitions
--topic: defines the topic name

//List all topics on the current servers
kafka-topics.sh --list --zookeeper 192.168.154.10:2181,192.168.154.11:2181,192.168.154.12:2181 


//View the details of a topic
kafka-topics.sh  --describe --zookeeper 192.168.154.10:2181,192.168.154.11:2181,192.168.154.12:2181 


//Publish messages
kafka-console-producer.sh --broker-list 192.168.154.10:9092,192.168.154.11:9092,192.168.154.12:9092  --topic test


//Consume messages
kafka-console-consumer.sh --bootstrap-server 192.168.154.10:9092,192.168.154.11:9092,192.168.154.12:9092 --topic test --from-beginning


--from-beginning: reads all the existing data in the topic from the start

//Modify the number of partitions
kafka-topics.sh --zookeeper 192.168.154.10:2181,192.168.154.11:2181,192.168.154.12:2181 --alter --topic test --partitions 6


//View the details of the topic
kafka-topics.sh  --describe --zookeeper 192.168.154.10:2181,192.168.154.11:2181,192.168.154.12:2181 


//Delete a topic
kafka-topics.sh --delete --zookeeper 192.168.154.10:2181,192.168.154.11:2181,192.168.154.12:2181 --topic test


3. In-depth Kafka architecture

1. Kafka workflow and file storage mechanism

Messages in Kafka are classified by topic. Producers produce messages and consumers consume messages, all of which are topic-oriented.

Topic is a logical concept, while partition is a physical one. Each partition corresponds to a log file that stores the data produced by the producer. Data produced by the producer is continuously appended to the end of that log file, and each record has its own offset. Each consumer in a consumer group records in real time the offset up to which it has consumed, so that after recovering from an error it can continue from the last position.

Since the messages produced by the producer will be continuously appended to the end of the log file, in order to prevent the inefficiency of data location caused by the log file being too large, Kafka adopts a fragmentation and indexing mechanism to divide each partition into multiple segments. Each segment corresponds to two files: ".index" file and ".log" file. These files are located in a folder, and the naming rule of the folder is: topic name + partition number. For example, if the topic test has three partitions, the corresponding folders are test-0, test-1, and test-2.

The index and log files are named after the offset of the first message of the current segment.

The ".index" file stores a large amount of index information, and the ".log" file stores a large amount of data. The metadata in the index file points to the physical offset address of the message in the corresponding data file.
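The base-offset naming rule can be illustrated directly: segment file names are the 20-digit, zero-padded offset of the first message in the segment (the offsets below are invented):

```shell
# segment_name BASE_OFFSET -> the file-name stem shared by the .index and .log files
segment_name() { printf '%020d' "$1"; }

echo "$(segment_name 0).log"        # first segment of the partition
echo "$(segment_name 0).index"
echo "$(segment_name 170410).log"   # next segment starts at offset 170410
```

Because names are sorted offsets, Kafka can binary-search the segment that holds any requested offset.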

2. Data reliability guarantee

In order to ensure that data sent by the producer reliably reaches the specified topic, each partition of the topic must send an ack (acknowledgment) to the producer after receiving the data. If the producer receives the ack, it sends the next round; otherwise it resends the data.

3. Data consistency problem

LEO: the maximum offset in each replica;
HW (high watermark): the largest offset visible to consumers, i.e., the smallest LEO among all replicas.
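Since the HW is the smallest LEO across the replicas, it can be computed as a simple minimum; the numbers here are an invented example:

```shell
# hw_from_leos LEO... -> prints the minimum LEO, i.e. the high watermark
hw_from_leos() {
  min=$1; shift
  for leo in "$@"; do
    if [ "$leo" -lt "$min" ]; then min=$leo; fi
  done
  echo "$min"
}

hw_from_leos 12 9 11   # leader LEO 12, follower LEOs 9 and 11 -> HW is 9
```

Consumers only ever see offsets below the HW, which is what makes the truncate-to-HW recovery steps below safe.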

(1) Follower failure
After a follower fails, it is temporarily kicked out of the ISR (the set of followers, maintained by the leader, that are in sync with the leader). Once the follower recovers, it reads the last HW recorded on its local disk, truncates the part of its log file above the HW, and starts synchronizing from the HW with the leader. After the follower's LEO becomes greater than or equal to the partition's HW, i.e., once it has caught up with the leader, it can rejoin the ISR.

(2) Leader failure
After the leader fails, a new leader is elected from the ISR. Then, to ensure data consistency among the replicas, the remaining followers first truncate the parts of their log files above the HW and then synchronize data from the new leader.

Note: This can only guarantee data consistency between replicas, and does not guarantee that data will not be lost or duplicated.

4. ack response mechanism

For some unimportant data, the reliability requirement is not high and a small amount of loss can be tolerated, so there is no need to wait for every follower in the ISR to acknowledge receipt. Kafka therefore offers users three reliability levels, which can be chosen by trading off reliability against latency.

When the producer sends data to the leader, the reliability level is set through the request.required.acks parameter:
●0: the producer does not wait for the broker's acknowledgment before sending the next batch of messages. This gives the highest transmission efficiency but also the lowest reliability: data may be lost when a broker fails.

●1 (default): the producer sends the next message once the leader in the ISR has received the data and acknowledged it. If the leader fails before the followers finish syncing, data is lost.

●-1 (or all): the producer waits until every follower in the ISR has acknowledged receipt before considering the send complete, giving the highest reliability. However, if the leader fails after the followers finish syncing but before the broker sends the ack, data is duplicated.

The performance of the three levels decreases in that order, while data reliability increases in that order.

Note: before version 0.11, Kafka could do nothing about such duplication; it could only ensure that data was not lost and leave global deduplication to downstream consumers. Kafka 0.11 introduced a major feature, idempotence: no matter how many times the producer sends the same data to the server, the server persists only one copy.
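In the modern producer configuration the setting is named acks; to pick a level from the console producer used earlier, it can go in a small properties file passed via --producer.config (the path and broker address below are illustrative):

```shell
# Write a producer config choosing the strongest acknowledgment level
cat > /tmp/producer-acks.properties <<'EOF'
bootstrap.servers=192.168.154.10:9092
# 0 = no wait, 1 = leader ack (default), all/-1 = every ISR replica must ack
acks=all
EOF

grep '^acks=' /tmp/producer-acks.properties   # prints: acks=all
```

It would then be used as: kafka-console-producer.sh --broker-list 192.168.154.10:9092 --topic test --producer.config /tmp/producer-acks.properties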

5. Filebeat+Kafka+ELK deployment

The following builds on the Zookeeper+Kafka cluster deployed above

192.168.154.13 Filebeat
192.168.154.14 Elasticsearch
192.168.154.15 Logstash Kibana

1. Deploy Filebeat

cd /usr/local/filebeat
vim filebeat.yml

2. Start filebeat

./filebeat -e -c filebeat.yml

3. Create a new Logstash configuration file on the node where the Logstash component is located

cd /etc/logstash/conf.d
vim apache.conf
input {
    kafka {
        bootstrap_servers => "192.168.154.10:9092,192.168.154.11:9092,192.168.154.12:9092"
        topics => "apache"
        type => "filebeat-kafka"
        codec => "json"
        auto_offset_reset => "latest"
        decorate_events => true
    }
}

output {
    if "access" in [tags] {
        elasticsearch {
            hosts => ["192.168.154.14:9200"]
            index => "apache_right-%{+YYYY.MM.dd}"
        }
    }
    if "error" in [tags] {
        elasticsearch {
            hosts => ["192.168.154.14:9200"]
            index => "apache_error-%{+YYYY.MM.dd}"
        }
    }
    stdout { codec => rubydebug }
}


4. Start logstash

logstash -f apache.conf

5. Browser access http://192.168.154.15:5601

Origin blog.csdn.net/ll945608651/article/details/130925797