Big data - play with data - Kafka in practice

1. Key points of using Kafka

Point 1: The producer sends messages to the Kafka cluster. Before sending, it labels each message with a topic (e.g. topic1, topic2). Topics classify messages by category, so a consumer can subscribe only to the topics it cares about. The consumer maintains a long-lived connection to the Kafka cluster, continuously pulls messages from it, and then processes them.
Point 2: Kafka's storage is built on partitions. When creating a topic, you can specify the number of partitions at the same time. More partitions mean higher throughput, but also more resources and a higher risk of unavailability. When Kafka receives a message from a producer, it stores the message in one of the partitions according to a balancing strategy. Within each partition, messages are stored sequentially, so the most recently received message is consumed last. When sending a message to the cluster, the producer can target a specific partition, supply a balancing strategy of its own, or specify nothing, in which case the default strategy distributes messages across the partitions.
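As a rough illustration of the three choices above (explicit partition, key-based, or default), a partitioner can be sketched like this. This is a toy version only: real Kafka clients hash keys with murmur2 and use a "sticky" strategy for unkeyed messages.

```python
# Sketch of how a producer picks a partition for a message.
# Illustration only -- real Kafka uses murmur2 hashing for keyed
# messages and a sticky strategy (not round-robin) for unkeyed ones.
import itertools

def make_partitioner(num_partitions):
    round_robin = itertools.cycle(range(num_partitions))

    def choose_partition(key=None, explicit_partition=None):
        if explicit_partition is not None:
            return explicit_partition          # producer pinned a partition
        if key is not None:
            return hash(key) % num_partitions  # same key -> same partition
        return next(round_robin)               # no key: spread evenly
    return choose_partition

choose = make_partitioner(3)
# Messages with the same key always land in the same partition,
# which preserves per-key ordering within that partition.
assert choose(key="user-42") == choose(key="user-42")
```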
Point 3: When consumers consume messages, Kafka uses an offset to record the current consumption position. In Kafka's design, multiple consumer groups can consume the same topic at the same time; each group keeps its own offset, so groups do not interfere with one another. Within a group, the number of consumers should not exceed the number of partitions, because each partition can be bound to at most one consumer in the group. A consumer may consume several partitions, but a partition can be consumed by only one consumer in the group; therefore, if a group has more consumers than partitions, the extra consumers will receive no messages.
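The partition-to-consumer binding described above can be sketched as a simple round-robin assignment. This is a toy model: real Kafka's group coordinator uses pluggable assignors (range, round-robin, sticky), but the invariant is the same — each partition goes to exactly one consumer in the group.

```python
# Toy assignment of partitions to consumers within one group.
# Each partition is bound to exactly one consumer, so consumers
# beyond the partition count end up with nothing to consume.
def assign_partitions(partitions, consumers):
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

# 3 partitions, 4 consumers: the 4th consumer stays idle.
result = assign_partitions([0, 1, 2], ["c1", "c2", "c3", "c4"])
assert result == {"c1": [0], "c2": [1], "c3": [2], "c4": []}
```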
Point 4: Kafka is written in Scala and runs on the JVM. Although it can also be used on Windows, Kafka is almost always run on Linux servers, so this walkthrough uses Linux as well. First of all, make sure the JDK is installed on your machine, since Kafka needs a Java runtime. Kafka also depends on ZooKeeper; recent Kafka releases bundle a ZooKeeper environment, so we can use it directly.

2. Installation

1. JDK installation

List the JDK versions available for installation:

yum -y list java*

Choose a version to install

sudo yum install -y java-1.8.0-openjdk-devel.x86_64

Check the installation result

java -version

If you are curious where this automatic installation put the JDK, you can find it under /usr/lib/jvm.

2. Kafka installation

Download the latest Kafka package from the official site http://kafka.apache.org/downloads, choosing the binary .tgz file. Do not pick the source package (kafka-3.3.1-src with "src" in the name), or you will get an error when starting ZooKeeper. Here we download kafka_2.13-3.3.1 and drag it to the /home/lighthouse/kafka directory we created. For the simplest trial run, we only need to extract the archive to any directory; here we extract it into /home/lighthouse/kafka:

tar -zxvf kafka_2.13-3.3.1.tgz

3. Kafka configuration

There is a config folder under the Kafka installation directory, which contains the configuration files:

  • consumer.properties — consumer configuration; the defaults work here
  • producer.properties — producer configuration; the defaults work here
  • server.properties — Kafka broker configuration; only a few basic settings are
    introduced here. broker.id is the id of the broker and must be unique within a cluster; the default works here.
    listeners declares the address and port the broker listens on. If you are running Kafka on your local machine or a local VM, this can be left unset and localhost is used by default. If Kafka runs on a remote server, it must be configured, for example:
    listeners=PLAINTEXT://192.168.180.128:9092, and make sure port 9092 of the server is reachable.
    zookeeper.connect declares the address of the ZooKeeper instance Kafka connects to. Since we use the ZooKeeper bundled with this Kafka release, the default works: zookeeper.connect
    = localhost:2181
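Putting those settings together, a minimal server.properties for a remote single-broker setup might look like this (the IP address is the example from above, and log.dirs shows the default location; adjust both to your server):

```properties
# Minimal single-broker settings (example values).
broker.id=0
# Required when clients connect from other machines; port 9092 must be open.
listeners=PLAINTEXT://192.168.180.128:9092
# Where the broker stores its log segments (this is the default).
log.dirs=/tmp/kafka-logs
# The ZooKeeper bundled with this Kafka release, on its default port.
zookeeper.connect=localhost:2181
```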

3. Usage

1. Start ZooKeeper

[lighthouse@VM-0-8-centos ~]$ cd kafka/kafka_2.13-3.3.1/
[lighthouse@VM-0-8-centos kafka_2.13-3.3.1]$ bin/zookeeper-server-start.sh config/zookeeper.properties

2. Start Kafka

[lighthouse@VM-0-8-centos ~]$ cd kafka/kafka_2.13-3.3.1/
[lighthouse@VM-0-8-centos kafka_2.13-3.3.1]$ ./bin/kafka-server-start.sh ./config/server.properties


3. Create a topic

Kafka manages data of the same type through topics; using one topic per type of data makes processing more convenient.

[lighthouse@VM-0-8-centos ~]$ cd kafka/kafka_2.13-3.3.1/

Create a topic called sun_demo

[lighthouse@VM-0-8-centos kafka_2.13-3.3.1]$ ./bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic sun_demo

This reports an error, because the --zookeeper option was removed from kafka-topics.sh in Kafka 3.x; the topic has to be created the new way, with --bootstrap-server:

[lighthouse@VM-0-8-centos bin]$ ./kafka-topics.sh --create --topic sun_demo --bootstrap-server localhost:9092

The command confirms that the topic was created.
Display the topics that have been created:

[lighthouse@VM-0-8-centos bin]$ ./kafka-topics.sh --list --bootstrap-server  localhost:9092


4. Create a message consumer

[lighthouse@VM-0-8-centos ~]$ cd kafka/kafka_2.13-3.3.1/
[lighthouse@VM-0-8-centos kafka_2.13-3.3.1]$ ./bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic sun_demo --from-beginning


5. Create a message producer

Create a message producer and type in some message content (--broker-list is deprecated in Kafka 3.x; use --bootstrap-server):

[lighthouse@VM-0-8-centos bin]$ ./kafka-console-producer.sh --bootstrap-server localhost:9092 --topic sun_demo
>wo shi sunbo shoudao qing huida 

The message then appears in the consumer's console.
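The end-to-end flow above — the producer appends to the log, each consumer group reads from its own offset, and --from-beginning replays the log for a new group — can be mimicked with a toy in-memory partition log. This is a conceptual sketch, not a Kafka client:

```python
# Toy single-partition log illustrating offsets and --from-beginning.
class PartitionLog:
    def __init__(self):
        self.messages = []   # append-only message log
        self.offsets = {}    # group id -> next offset to consume

    def produce(self, msg):
        self.messages.append(msg)

    def consume(self, group, from_beginning=False):
        # A group seen for the first time starts at the beginning if
        # --from-beginning is given, otherwise at the end of the log.
        if group not in self.offsets:
            self.offsets[group] = 0 if from_beginning else len(self.messages)
        batch = self.messages[self.offsets[group]:]
        self.offsets[group] = len(self.messages)  # commit the new position
        return batch

log = PartitionLog()
log.produce("wo shi sunbo shoudao qing huida")
# A --from-beginning consumer replays everything already in the log.
assert log.consume("group-a", from_beginning=True) == \
    ["wo shi sunbo shoudao qing huida"]
# Each group tracks its own offset, so other groups are unaffected.
log.produce("second message")
assert log.consume("group-b", from_beginning=True) == \
    ["wo shi sunbo shoudao qing huida", "second message"]
assert log.consume("group-a") == ["second message"]
```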


Origin blog.csdn.net/s_unbo/article/details/128376517