Installing Kafka (kafka_2.12-2.1.0) on CentOS 7

Kafka overview

Kafka is a distributed system consisting of servers and clients that communicate through a high-performance TCP network protocol. It can be deployed on bare metal hardware, virtual machines, and containers, in on-premises and cloud environments.

1) Apache Kafka is an open source messaging system written in Scala and Java. It is developed as a project of the Apache Software Foundation.

2) Kafka was originally developed by LinkedIn and open sourced in early 2011. It graduated from the Apache Incubator in October 2012. The goal of the project is to provide a unified, high-throughput, low-latency platform for processing real-time data.

3) Kafka is a distributed message queue. Kafka categorizes stored messages by topic. The sender of a message is called the producer, and the recipient is called the consumer. A Kafka cluster is composed of multiple Kafka instances, and each instance (server) is called a broker.

4) In this Kafka version, the Kafka cluster relies on a ZooKeeper cluster to store metadata and keep the system available. Older consumer clients also stored their offsets in ZooKeeper.

Kafka architecture diagram

Concepts in Kafka

1) Producer: Producers are client applications that publish (write) events to Kafka.

2) Consumer: Consumers are client applications that subscribe (read and process) these events.

3) Topic: The topic is similar to the folder in the file system, and the event is the file in the folder. Messages are generally distinguished by topic.

4) Consumer Group (CG): Kafka organizes consumers into consumer groups. Every message in a topic is delivered to all consumer groups subscribed to it, but within each group it is delivered to only one consumer.

5) Broker: A Kafka server is a broker. A cluster is composed of multiple brokers. A broker can hold multiple topics.

6) Partition: To achieve scalability, a very large topic can be distributed across multiple brokers (i.e., servers). A topic can be divided into multiple partitions, and each partition is an ordered queue. Each message in a partition is assigned a sequential id (offset). Kafka only guarantees that messages are delivered to the consumer in order within a partition; it does not guarantee order across the partitions of a topic as a whole.

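7) Offset: Kafka's storage files (segments) are named after the offset of the first message they contain, for example 00000000000000002048.log. Naming files by offset makes lookups easy: to find the message at offset 2049, find the segment with the largest base offset not exceeding 2049, here the file starting at 2048. The first segment is named 00000000000000000000.log. (Early Kafka releases used a .kafka extension instead of .log.)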
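
The segment lookup described in the Offset entry can be sketched in shell. This is only an illustration, assuming segment files named after a zero-padded base offset with a .log extension; the function name find_segment and the sample file names are made up for the example and are not taken from a real broker.

```shell
#!/bin/sh
# Hypothetical helper: print the segment file whose base offset is the
# largest one that does not exceed the requested offset.
find_segment() {
  target=$1
  shift
  best=""
  best_base=-1
  for file in "$@"; do
    # Drop the extension and the zero padding so the name compares as a number.
    base=$(printf '%s' "${file%.log}" | sed 's/^0*//')
    [ -n "$base" ] || base=0
    if [ "$base" -le "$target" ] && [ "$base" -gt "$best_base" ]; then
      best=$file
      best_base=$base
    fi
  done
  printf '%s\n' "$best"
}

# Sample segment names (assumed for illustration).
segments="00000000000000000000.log 00000000000000001024.log 00000000000000002048.log"
find_segment 2049 $segments
```

With the sample names above, the call prints 00000000000000002048.log, the segment that would contain offset 2049.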

Zookeeper installation

Since Kafka requires the ZooKeeper service, we need to install ZooKeeper first.

Introduction to Zookeeper

ZooKeeper is an open source distributed framework that provides basic services for coordinating distributed applications. It exposes a set of common services (distributed synchronization, naming service, group maintenance, etc.) to external applications, simplifying the coordination and management of distributed applications.

It is an open source implementation of Google's Chubby. ZooKeeper can itself be deployed as a cluster. This zk cluster is used to manage the application cluster: it monitors the status of each node in the application cluster and decides the next reasonable operation based on the feedback submitted by each node.

Zookeeper installation

Download the ZooKeeper archive from https://archive.apache.org/dist/zookeeper/zookeeper-3.4.10/zookeeper-3.4.10.tar.gz, upload it to the server, and unpack it.

Deploy a 3-node Zookeeper pseudo-distributed cluster

  1. First, create a directory for the cluster installation, called zookeeper. Then unpack three copies of ZooKeeper under this directory to form 3 nodes. The ZooKeeper copy in each directory represents one node.
    This gives the following installation directory structure (paths as used in the configuration below):
    /usr/zookeeper/zookeeper1
    /usr/zookeeper/zookeeper2
    /usr/zookeeper/zookeeper3
  2. Create a data directory, a logs directory, and a myid file for each node. The myid file is placed in the node's data directory, and its content is the number of the node in the cluster: 1 for zookeeper1, 2 for zookeeper2, and 3 for zookeeper3.
  3. Create a configuration file for each node.
    Under zookeeper1, rename conf/zoo_sample.cfg to conf/zoo.cfg. Keep the tickTime, initLimit, and syncLimit settings from the sample, and set the following entries in zoo.cfg:
dataDir=/usr/zookeeper/zookeeper1/data
dataLogDir=/usr/zookeeper/zookeeper1/logs
clientPort=2181
server.1=127.0.0.1:8880:7770
server.2=127.0.0.1:8881:7771
server.3=127.0.0.1:8882:7772

In the same way, create zoo.cfg in the corresponding locations of zookeeper2 and zookeeper3, and copy the contents of zookeeper1's zoo.cfg into them. Only three configuration items need to change: clientPort, dataDir, and dataLogDir. Set clientPort to 2182 for zookeeper2 and to 2183 for zookeeper3, and point dataDir and dataLogDir at the corresponding directories of each node.
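
The per-node setup above (data and logs directories, myid, and zoo.cfg) can be scripted. The following is a minimal sketch, not the only way to do it: ZK_BASE is a variable introduced here for illustration, and the tickTime, initLimit, and syncLimit values are assumed to be the defaults shipped in zoo_sample.cfg. By default it writes into a scratch directory so it can be tried safely.

```shell
#!/bin/sh
# Sketch: lay out data/, logs/, myid and conf/zoo.cfg for three local nodes.
# ZK_BASE is an assumed variable; set it to /usr/zookeeper to match the
# paths used in this article. By default a scratch directory is used.
ZK_BASE=${ZK_BASE:-$(mktemp -d)}

for id in 1 2 3; do
  node="$ZK_BASE/zookeeper$id"
  mkdir -p "$node/data" "$node/logs" "$node/conf"

  # myid holds the node number and lives in the node's dataDir.
  echo "$id" > "$node/data/myid"

  # Client ports 2181, 2182, 2183; the server.N lines are the same on every node.
  cat > "$node/conf/zoo.cfg" <<EOF
tickTime=2000
initLimit=10
syncLimit=5
dataDir=$node/data
dataLogDir=$node/logs
clientPort=$((2180 + id))
server.1=127.0.0.1:8880:7770
server.2=127.0.0.1:8881:7771
server.3=127.0.0.1:8882:7772
EOF
done

echo "Node directories created under $ZK_BASE"
```

Running it with ZK_BASE=/usr/zookeeper after unpacking the three ZooKeeper copies would produce the layout described in steps 1 to 3.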

  4. Start the zk cluster
    Enter the installation directory of zookeeper1, the first node of the ZooKeeper cluster, and start the service:
bin/zkServer.sh start

Then, in the same way, start the zookeeper2 and zookeeper3 services in turn.
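
Starting the three nodes one after another can also be done in a loop. This is a small sketch under assumptions: ZK_BASE and DRY_RUN are variables introduced for this example, and the node directories are assumed to be named zookeeper1 to zookeeper3 under ZK_BASE. With DRY_RUN=1 (the default here) it only prints the commands.

```shell
#!/bin/sh
# Sketch: start (or just print the start command for) each of the three nodes.
# ZK_BASE and DRY_RUN are assumed variables introduced for this example.
ZK_BASE=${ZK_BASE:-/usr/zookeeper}
DRY_RUN=${DRY_RUN:-1}

start_node() {
  script="$ZK_BASE/zookeeper$1/bin/zkServer.sh"
  if [ "$DRY_RUN" = "1" ]; then
    # Only show what would be run.
    echo "$script start"
  else
    "$script" start
  fi
}

for id in 1 2 3; do
  start_node "$id"
done
```

Run it with DRY_RUN=0 to actually start the nodes. Once a majority of nodes is up, bin/zkServer.sh status in each node directory reports whether that node is the leader or a follower.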

Zookeeper service commands

# Start the ZK service:
bin/zkServer.sh start
# Check the ZK service status:
bin/zkServer.sh status
# Stop the ZK service:
bin/zkServer.sh stop
# Restart the ZK service:
bin/zkServer.sh restart
# Connect to the server:
bin/zkCli.sh -server 127.0.0.1:2181

Kafka installation

First, download Kafka from the official website: http://kafka.apache.org/downloads

After downloading, upload the archive to the CentOS 7 server and unpack it:

tar -zxvf kafka_2.12-2.1.0.tgz 

Modify the configuration file
Change to the config directory and edit server.properties:

broker.id=0
# Enter your server's IP address here
listeners=PLAINTEXT://192.168.130.128:9092
# Choose the directory where the Kafka logs are stored
log.dirs=/usr/kafka2.12/kafka-logs
delete.topic.enable=true
# ZooKeeper cluster information
zookeeper.connect=192.168.130.128:2181,192.168.130.128:2182,192.168.130.128:2183
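
The configuration above defines a single broker, while the topic created later in this article uses a replication factor of 3, which needs three brokers. One possible way to get there on a single host is to give each broker its own broker.id, listener port, and log directory. The sketch below generates such files; KAFKA_CONF_DIR, HOST, the file names server-N.properties, the ports 9093 and 9094, and the kafka-logs-N directories are all assumptions made for illustration. In practice these values would more likely be merged into copies of the shipped config/server.properties.

```shell
#!/bin/sh
# Sketch: write one properties file per broker for three brokers on one host.
# KAFKA_CONF_DIR, HOST and the file names are assumed for this example.
KAFKA_CONF_DIR=${KAFKA_CONF_DIR:-$(mktemp -d)}
HOST=${HOST:-192.168.130.128}

for id in 0 1 2; do
  # Each broker needs a unique broker.id, listener port and log directory.
  cat > "$KAFKA_CONF_DIR/server-$id.properties" <<EOF
broker.id=$id
listeners=PLAINTEXT://$HOST:$((9092 + id))
log.dirs=/usr/kafka2.12/kafka-logs-$id
delete.topic.enable=true
zookeeper.connect=$HOST:2181,$HOST:2182,$HOST:2183
EOF
done

echo "Wrote broker configs to $KAFKA_CONF_DIR"
```

Each broker would then be started with its own file, for example bin/kafka-server-start.sh followed by the path to server-1.properties.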

Start the service

bin/kafka-server-start.sh config/server.properties 

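Now create a new topic with a replication factor of 3. This requires at least three running brokers; with only the single broker configured above, use --replication-factor 1 instead.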

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic topic_test

View topic information in the cluster

bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic topic_test

View the list of Kafka topics

bin/kafka-topics.sh --zookeeper localhost:2181 --list

Read the messages in the topic from the beginning

bin/kafka-console-consumer.sh --bootstrap-server 192.168.130.128:9092 --topic topic_test --from-beginning 
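
The consumer above only shows output once the topic contains messages. A minimal sketch for writing a few test messages with the console producer follows. KAFKA_HOME and BROKER are variables introduced for this example, and the default KAFKA_HOME path is an assumption, not a location given in this article. If the producer script is not found, the sketch only prints the messages it would have sent.

```shell
#!/bin/sh
# Sketch: send three test messages to topic_test with the console producer.
# KAFKA_HOME and BROKER are assumed variables introduced for this example.
KAFKA_HOME=${KAFKA_HOME:-/usr/kafka2.12/kafka_2.12-2.1.0}
BROKER=${BROKER:-192.168.130.128:9092}

# One message per line, which is how the console producer reads stdin.
make_messages() {
  for n in 1 2 3; do
    echo "test message $n"
  done
}

producer="$KAFKA_HOME/bin/kafka-console-producer.sh"
if [ -x "$producer" ]; then
  make_messages | "$producer" --broker-list "$BROKER" --topic topic_test
else
  echo "producer script not found at $producer; messages that would be sent:"
  make_messages
fi
```

After the messages are sent, the consumer command above started with --from-beginning should print them.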

View the list of Kafka consumer groups

bin/kafka-consumer-groups.sh --bootstrap-server 192.168.130.128:9092 --list


Delete topic_test

bin/kafka-topics.sh  --delete --zookeeper 127.0.0.1:2181  --topic topic_test

Modify the number of partitions (the count can only be increased)

bin/kafka-topics.sh --zookeeper 192.168.130.128:2181 --alter --topic topic_test --partitions 4

Origin blog.csdn.net/weixin_42494845/article/details/108868889