Cluster deployment of the distributed message queue Kafka

1 Overview

Apache Kafka is a distributed, high-throughput streaming message system built on the ZooKeeper synchronization service. It integrates well with Apache Storm and Spark for real-time streaming data analysis. Compared with other messaging systems, Kafka offers better throughput, built-in partitioning, data replication, and high fault tolerance, which makes it well suited to large-scale message processing scenarios.

For an introduction to Kafka architecture, please check: https://my.oschina.net/feinik/blog/1806488

2 Deployment Diagram

The cluster deployed below consists of three servers, server1, server2, and server3, each running one ZooKeeper node and one Kafka broker.

3 Environment preparation before Kafka cluster deployment

3.1 Install Java

Java 8 is recommended; install it on every server before proceeding.
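You can verify the installation on each server with:

$java -version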

3.2 Deploy the Zookeeper cluster

3.2.1 Download the Zookeeper installation package

The ZooKeeper version deployed here is zookeeper-3.4.9.tar.gz
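For example, it can be downloaded from the Apache archive (URL assumed; any Apache mirror will do):

$wget https://archive.apache.org/dist/zookeeper/zookeeper-3.4.9/zookeeper-3.4.9.tar.gz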

3.2.2 Installation

1. First install on server1

(1) Extract: tar -zxvf zookeeper-3.4.9.tar.gz

(2)cd zookeeper-3.4.9/conf

(3)cp zoo_sample.cfg zoo.cfg

(4) Modify the zoo.cfg configuration file as follows

tickTime=2000

# ZooKeeper data directory
dataDir=/home/hadoop/app/zookeeper/data

# Client connection port
clientPort=2181

initLimit=10

syncLimit=5

# Server addresses: port 2888 is used for communication between the nodes in the cluster, and port 3888 is used during leader election
server.1=server1:2888:3888
server.2=server2:2888:3888
server.3=server3:2888:3888
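
In addition to zoo.cfg, each node needs a myid file in the dataDir directory whose content is the N from that node's server.N line; without it the node will not join the ensemble. For example, on server1:

# Write this node's id (use 1 on server1, 2 on server2, 3 on server3)
$mkdir -p /home/hadoop/app/zookeeper/data
$echo 1 > /home/hadoop/app/zookeeper/data/myid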

2. Copy the same zookeeper-3.4.9 installation to server2 and server3, changing the myid file on each node as described above

3. Configure the ZooKeeper environment variables on each server and start the nodes one by one to complete the zk cluster deployment, for example:
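
A minimal sketch of the environment variables, assuming the installation lives at /home/hadoop/app/zookeeper (append to ~/.bashrc or /etc/profile on each server):

# Assumed install path; adjust to your actual location
export ZOOKEEPER_HOME=/home/hadoop/app/zookeeper
export PATH=$PATH:$ZOOKEEPER_HOME/bin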

4 Deploy Kafka cluster

4.1 Install and configure

The version installed here is kafka_2.12-1.1.0.tgz (Kafka 1.1.0, built for Scala 2.12).

Note: Install on server1 first, and then copy the installation to server2 and server3

(1) Extract

$tar -zxvf kafka_2.12-1.1.0.tgz -C /home/app

(2) Rename

$mv kafka_2.12-1.1.0 kafka

(3) Configure the environment variables of Kafka, for example:
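
A sketch, assuming the /home/app/kafka path from step (2) (append to ~/.bashrc or /etc/profile):

# Assumed install path from the previous steps
export KAFKA_HOME=/home/app/kafka
export PATH=$PATH:$KAFKA_HOME/bin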

(4) Modify the following configuration items in the Kafka configuration file config/server.properties

  • Modify the broker id, which must be unique within the cluster

        broker.id=1

  • Modify the log storage directory, where Kafka keeps the message data for each partition

        log.dirs=/home/app/kafka/log-data

  • Modify the Zookeeper connection address. Kafka ships with a bundled Zookeeper, but here we point it at our own zk cluster:

        zookeeper.connect=server1:2181,server2:2181,server3:2181

(5) Copy the kafka package deployed on server1 to server2 and server3, for example:
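
A sketch using scp, assuming the same /home/app layout on every server:

$scp -r /home/app/kafka server2:/home/app/
$scp -r /home/app/kafka server3:/home/app/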

(6) Modify the server.properties configuration file of kafka in server2

        broker.id=2

(7) Modify the server.properties configuration file of kafka in server3

        broker.id=3

5 Start the cluster

5.1 Start the Zookeeper cluster first

Run the following command on server1, server2, and server3 respectively

$zkServer.sh start
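
After all three nodes are up, you can verify each node's role (the ensemble should elect one leader, with the other two nodes as followers):

$zkServer.sh status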

Note: You can also start the Zookeeper cluster through a script, provided that passwordless SSH login has been configured between the servers. The script content is as follows

#!/bin/bash
if (( $# != 1 )); then
   echo "Usage: zk.sh {start|stop}"
   exit 1
fi

cuser=$(whoami)

for i in server1 server2 server3
do
   echo "---------- $i ---------------"
   ssh "$cuser@$i" "cd /home/app/zookeeper; ./bin/zkServer.sh $1"
done

5.2 Start the Kafka cluster

Run the following command on server1, server2, and server3 respectively

$kafka-server-start.sh -daemon /home/app/kafka/config/server.properties

Note: You can also start the Kafka cluster through a script. The script content is as follows

#!/bin/bash
cuser=$(whoami)

for i in server1 server2 server3
do
   echo "---------- $i --------------"
   ssh "$cuser@$i" "/home/app/kafka/bin/kafka-server-start.sh -daemon /home/app/kafka/config/server.properties"
   echo "start complete!"
done

5.3 View cluster startup status

Use the jps command to check the running processes. If server1, server2, and server3 each show both the Kafka and QuorumPeerMain processes, the cluster has started successfully.

$jps
5506 Kafka
5733 Jps
5212 QuorumPeerMain
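
As a further check, you can create and describe a test topic; with Kafka 1.1.0 the topic tool still addresses ZooKeeper directly. The Java examples below use this test topic.

# Create a topic with 3 partitions, replicated across all three brokers
$kafka-topics.sh --create --zookeeper server1:2181,server2:2181,server3:2181 --replication-factor 3 --partitions 3 --topic test

# Show partition leaders and replica assignments
$kafka-topics.sh --describe --zookeeper server1:2181,server2:2181,server3:2181 --topic test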

6 Kafka Java API usage

6.1 The producer sends a message

import java.util.Properties;

import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

public class ProducerClient {

    private Producer<String, String> producer;

    @Before
    public void init() {
        Properties props = new Properties();
        /**
         * Broker address list. It does not need to contain every broker in the
         * cluster; the producer discovers the remaining brokers from the ones
         * listed here. Listing two is recommended, so that a single broker
         * going down does not prevent the initial connection.
         */
        props.put("bootstrap.servers", "server1:9092,server2:9092");
        /**
         * Serializers for the key and value. Kafka ships serializer classes
         * for the common Java object types.
         */
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        producer = new KafkaProducer<>(props);
    }

    @Test
    public void send() throws Exception {
        // No key is specified here, so the messages sent are distributed
        // evenly across all available partitions of the topic
        ProducerRecord<String, String> record = new ProducerRecord<>("test",
                "hello world");
        // Asynchronous send: the callback fires once the broker has responded
        producer.send(record, new Callback() {
            @Override
            public void onCompletion(RecordMetadata metadata, Exception exception) {
                System.out.println("Message send completed!");
            }
        });
    }

    @After
    public void close() {
        producer.close();
    }
}

Note: There are three ways to send messages: synchronous, asynchronous, and fire-and-forget (send without caring about the result).

Synchronous sending: send returns a Future<RecordMetadata>; block on the Future's get method to wait for the send result, e.g. producer.send(record).get().

Asynchronous sending: pass a callback function when calling send; Kafka invokes it once the broker has acknowledged the message (on failure the exception argument is non-null).

Fire-and-forget: call send and do not inspect the result at all.

6.2 Consumers subscribe and consume messages

import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

public class ConsumerClient {

    private Consumer<String, String> consumer;

    @Before
    public void init() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "server1:9092,server2:9092");
        // Consumer group id
        props.put("group.id", "g1");
        // Deserializers for the key and value
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumer = new KafkaConsumer<>(props);
    }

    @Test
    public void consume() {
        // Subscribe to the "test" topic
        consumer.subscribe(Collections.singletonList("test"));
        while (true) {
            // Poll for new records, waiting up to 100 ms
            ConsumerRecords<String, String> records = consumer.poll(100);
            for (ConsumerRecord<String, String> record : records) {
                String value = record.value();
                System.out.println("Received message: " + value);
            }
        }
    }

    @After
    public void close() {
        consumer.close();
    }
}
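
To drive this consumer by hand, you can publish a few messages with the console producer that ships with Kafka (broker list as assumed above):

$kafka-console-producer.sh --broker-list server1:9092,server2:9092 --topic test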

7 Summary

This article has introduced the distributed cluster deployment of Kafka, along with the cluster deployment of ZooKeeper, the third-party component that Kafka depends on. Finally, the Kafka Java API was used to demonstrate sample code for producers sending messages and consumers consuming them. For other uses of Kafka, please refer to the official website: http://kafka.apache.org/

 
