[Repost] Kafka introduction

Reprinted from https://www.cnblogs.com/hei12138/p/7805475.html

  1.  Kafka introduction

1.1. Main functions

According to the official website, Apache Kafka® is a distributed streaming platform with three main capabilities:

  1: It lets you publish and subscribe to streams of records. This works much like a message queue, which is why kafka is often classified as a message-queue framework.

  2: It lets you store streams of records in a fault-tolerant way. Kafka persists message streams to files on disk.

  3: It lets you process streams of records as they occur. Messages can be processed as soon as they are published.

1.2. Usage scenarios

1: Building real-time streaming data pipelines that reliably get data between systems or applications. This is the message-queue function: a reliable pipeline for moving real-time data between systems or applications.

2: Building real-time streaming applications that transform or react to the streams of data. This is the stream-processing function: programs that transform or react to data streams.

1.3. Details

Kafka is currently used mainly as a distributed publish-subscribe messaging system. The following briefly introduces kafka's basic mechanisms.

  1.3.1 Message Transmission Process

    The producer sends messages to the kafka cluster. Before sending, each message is assigned a category, that is, a topic. In the figure, two producers send messages under topic1 while another sends messages under topic2.

    A topic classifies messages: by assigning a topic to each message, consumers need only pay attention to the topics they care about.

    Consumer is the consumer. The consumer continuously pulls messages from the cluster by establishing a long connection with the kafka cluster, and then can process these messages.

    As the figure shows, the number of producers and the number of consumers for the same topic do not need to match.

  1.3.2 Kafka server message storage strategy

    Any discussion of kafka storage has to mention partitions. When creating a topic, you can specify the number of partitions: more partitions allow higher throughput, but they also consume more resources and can increase unavailability. After kafka receives a message from a producer, it stores the message in one of the topic's partitions according to its balancing strategy.

  Within each partition, messages are stored sequentially, with the latest received message being consumed last.
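  For example, the partition count is fixed when the topic is created. Using the command-line tool introduced in section 2.5, a three-partition topic could be created like this (the topic name test3 is illustrative):

    bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 3 --topic test3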

  1.3.3 Interaction with producers

    When a producer sends a message to the kafka cluster, it can route the message to a specific partition by specifying that partition explicitly.

    It can also distribute messages across partitions by specifying a balancing strategy.

    If neither is specified, the default balancing strategy is used, which spreads messages across partitions essentially at random. These three cases are sketched below.
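    As a rough java sketch (assuming a String-keyed producer like the one built later in section 3.2, and a topic with several partitions), the three cases map onto the ProducerRecord constructors:

    // explicit partition: this record always lands in partition 0
    producer.send(new ProducerRecord<String, String>("topic-test", 0, "key", "value"));
    // key but no explicit partition: the partitioner hashes the key to pick a partition
    producer.send(new ProducerRecord<String, String>("topic-test", "key", "value"));
    // no key, no partition: the default strategy balances records across partitions
    producer.send(new ProducerRecord<String, String>("topic-test", "value"));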

  1.3.4 Interaction with consumers

    When consumers consume messages, kafka records the current consumption position with an offset.

    In kafka's design, multiple different groups can consume the same topic at the same time. As shown in the figure, two different groups consume simultaneously; each group's consumption position is tracked by its own offset, and the groups do not interfere with each other.

    Within a single group, the number of consumers should not exceed the number of partitions, because each partition can be bound to at most one consumer in the group: a consumer may consume several partitions, but a partition can be consumed by only one consumer in the group.

    Therefore, if a group has more consumers than partitions, the extra consumers will not receive any messages. A small sketch of the group mechanics follows.
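    A rough sketch (group names here are illustrative; bootstrap.servers and the deserializers are set as in section 3.3): two consumers that share a group.id split the topic's partitions between them, while a consumer with a different group.id independently receives every message.

    Properties props = new Properties();
    // ...bootstrap.servers and deserializer settings as in section 3.3...
    props.put("group.id", "group-A"); // a second consumer with "group-A" splits the partitions with this one
    // props.put("group.id", "group-B"); // a consumer in "group-B" would get its own full copy of the stream
    KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);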

  2.  Kafka installation and usage

2.1. Download

  You can download the latest kafka release from the official site at http://kafka.apache.org/downloads; choose the binary .tgz package. Depending on your network, you may need a proxy to reach it. The version used here is 0.11.0.1, the latest release at the time of writing.

2.2. Installation

  Kafka is written in scala and runs on the JVM. Although it can also be used on windows, kafka almost always runs on linux servers, so we use linux for this walkthrough as well.

  First make sure a JDK is installed on your machine, since kafka needs a java runtime. Older kafka versions also required a separately installed zookeeper; the current distribution ships with a built-in zookeeper, so we can use that directly.

  "Installation" is an overstatement: for the simplest trial we only need to extract the archive to any directory. Here we extract the kafka tarball into the /home directory.

2.3. Configuration

  The kafka directory contains a config folder that holds the configuration files.

  consumer.properties: consumer configuration, used by the console consumer started in section 2.5; the defaults are fine here.

  producer.properties: producer configuration, used by the console producer started in section 2.5; the defaults are fine here.

  server.properties: kafka server configuration, used to configure the kafka broker. Only a few basic settings are introduced here:

    1. broker.id declares the unique ID of this kafka server within the cluster. It must be configured as an integer, and every kafka server in the cluster must have a distinct id; the default is fine here.

    2. listeners declares the host and port this kafka server listens on. If your clients run on the same machine (or a local VM), this can be left unset and the localhost address is used by default; if clients connect from a remote machine, it must be configured, for example:

          listeners=PLAINTEXT://192.168.180.128:9092, and make sure the server's port 9092 can be accessed.

    3. zookeeper.connect declares the address of the zookeeper that kafka connects to. Since this walkthrough uses the zookeeper bundled with recent kafka versions, the default configuration can be used:

          zookeeper.connect=localhost:2181
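  Putting the three settings together, a minimal server.properties for a single remote broker might look like this (a sketch; the IP address is the example value used above):

          broker.id=0
          listeners=PLAINTEXT://192.168.180.128:9092
          zookeeper.connect=localhost:2181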
2.4. Run

    1. Start zookeeper

cd into the kafka decompression directory and enter

bin/zookeeper-server-start.sh config/zookeeper.properties

After zookeeper starts successfully, you will see its startup log printed in the terminal.

    2. Start kafka

cd into the kafka decompression directory and enter

bin/kafka-server-start.sh config/server.properties

After kafka starts successfully, you will see its startup log printed in the terminal.

2.5. The first message

   2.5.1 Create a topic

    Kafka manages data of the same type through topics; using one topic per type of data makes processing more convenient.

    Open a terminal in the kafka decompression directory and enter

    bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test

    This creates a topic named test with a single partition and a single replica.

     After creating the topic, you can enter

        bin/kafka-topics.sh --list --zookeeper localhost:2181

to view the topics that have been created
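You can also inspect a topic's partition and replica assignment with the same tool:

        bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test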

  2.5.2 Create a message consumer

   Open a terminal in the kafka decompression directory and enter

    bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning

   You can create a consumer for consuming topic test

     After the consumer is created, nothing is printed yet, because no data has been sent.

     Don't worry, and don't close this terminal. Open a new terminal; next we create our first message producer.

  2.5.3 Create a message producer

    Open a new terminal in the kafka decompression directory and enter

    bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test

    After execution you enter an input prompt; type a message and press Enter to send it.

After sending the message, go back to the consumer terminal; you will see the message you just sent printed there.
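For instance (an illustrative exchange; the console producer shows a > prompt), typing in the producer terminal:

    > hello kafka

should cause hello kafka to appear in the consumer terminal.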

  3.  Using a java program

    Following the previous section, we now try to use kafka from a java program.

    3.1 Create Topic

import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.ExecutionException;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.CreateTopicsResult;
import org.apache.kafka.clients.admin.NewTopic;

public static void main(String[] args) {
    // Create the topic via the AdminClient API
    Properties props = new Properties();
    props.put("bootstrap.servers", "192.168.180.128:9092");
    AdminClient adminClient = AdminClient.create(props);
    List<NewTopic> topics = new ArrayList<NewTopic>();
    NewTopic newTopic = new NewTopic("topic-test", 1, (short) 1);
    topics.add(newTopic);
    CreateTopicsResult result = adminClient.createTopics(topics);
    try {
        result.all().get(); // block until the topic creation completes
    } catch (InterruptedException e) {
        e.printStackTrace();
    } catch (ExecutionException e) {
        e.printStackTrace();
    }
    adminClient.close(); // release the client's resources
}

  We use the AdminClient API to manage the kafka server's configuration. Here the NewTopic(String name, int numPartitions, short replicationFactor) constructor creates a topic named "topic-test" with 1 partition and a replication factor of 1.
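  To verify the creation from code, the same AdminClient can also list the broker's topics; a small sketch (reusing the adminClient from above, before it is closed):

    // Prints the set of topic names known to the broker; it should now contain "topic-test"
    Set<String> names = adminClient.listTopics().names().get(); // get() may throw InterruptedException/ExecutionException
    System.out.println(names);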

3.2 Producer: sending messages

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "192.168.180.128:9092");
    props.put("acks", "all");             // wait for all in-sync replicas to acknowledge
    props.put("retries", 0);              // do not retry failed sends
    props.put("batch.size", 16384);       // batch size in bytes per partition
    props.put("linger.ms", 1);            // wait up to 1 ms to fill a batch
    props.put("buffer.memory", 33554432); // total memory available for buffering records
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

    Producer<String, String> producer = new KafkaProducer<String, String>(props);
    for (int i = 0; i < 100; i++)
        producer.send(new ProducerRecord<String, String>("topic-test", Integer.toString(i), Integer.toString(i)));

    producer.close();
}

After the producer sends messages, you can watch them arrive with the console consumer from section 2.5, or with the java consumer program described next. If you also want delivery confirmation in the producer itself, see the sketch below.
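A sketch of a confirmed send: send() returns a Future of the record metadata, and get() blocks until the broker acknowledges the write (with acks=all as configured above); get() may throw checked exceptions, so wrap it in try/catch in real code.

    RecordMetadata meta = producer.send(
            new ProducerRecord<String, String>("topic-test", "key", "value")).get();
    System.out.printf("stored at partition=%d, offset=%d%n", meta.partition(), meta.offset());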

3.3 Consumer consumes messages

import java.util.Arrays;
import java.util.Collection;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "192.168.12.65:9092");
    props.put("group.id", "test");
    props.put("enable.auto.commit", "true");      // commit offsets automatically...
    props.put("auto.commit.interval.ms", "1000"); // ...once per second
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    final KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
    consumer.subscribe(Arrays.asList("topic-test"), new ConsumerRebalanceListener() {
        public void onPartitionsRevoked(Collection<TopicPartition> collection) {
        }
        public void onPartitionsAssigned(Collection<TopicPartition> collection) {
            // Rewind to the beginning of every newly assigned partition
            consumer.seekToBeginning(collection);
        }
    });
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(100);
        for (ConsumerRecord<String, String> record : records)
            System.out.printf("offset = %d, key = %s, value = %s%n", record.offset(), record.key(), record.value());
    }
}

Here we use the Consumer API to create an ordinary java consumer that listens on the topic "topic-test". Whenever a producer sends a message to the kafka server, our consumer receives it.
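If the automatic commit configured above is too coarse, offsets can also be committed by hand; a sketch assuming enable.auto.commit is set to "false" (process() is a hypothetical handler):

    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(100);
        for (ConsumerRecord<String, String> record : records)
            process(record); // hypothetical application logic
        consumer.commitSync(); // commit offsets only after the whole batch is handled
    }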

  4.  Using spring-kafka

Spring-kafka is a spring sub-project currently in incubation. It uses spring's features to make kafka easier to work with.

4.1 Basic configuration information

As with other spring projects, configuration comes first. Here we use java configuration to set up our kafka consumer and producer.

Add the pom dependencies


<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>0.11.0.1</version>
</dependency>
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-streams</artifactId>
    <version>0.11.0.1</version>
</dependency>
<dependency>
    <groupId>org.springframework.kafka</groupId>
    <artifactId>spring-kafka</artifactId>
    <version>1.3.0.RELEASE</version>
</dependency>

Create the configuration class

We create a new class named KafkaConfig in the main directory

@Configuration
@EnableKafka
public class KafkaConfig {

}

Configure the topics

Add the following configuration to the KafkaConfig class

//topic config start
@Bean
public KafkaAdmin admin() {
    Map<String, Object> configs = new HashMap<String, Object>();
    configs.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "192.168.180.128:9092");
    return new KafkaAdmin(configs);
}

@Bean
public NewTopic topic1() {
    return new NewTopic("foo", 10, (short) 2);
}
//topic config end

Configure the producer Factory and Template

//producer config start
@Bean
public ProducerFactory<Integer, String> producerFactory() {
    return new DefaultKafkaProducerFactory<Integer, String>(producerConfigs());
}

@Bean
public Map<String, Object> producerConfigs() {
    Map<String, Object> props = new HashMap<String, Object>();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "192.168.180.128:9092");
    props.put("acks", "all");
    props.put("retries", 0);
    props.put("batch.size", 16384);
    props.put("linger.ms", 1);
    props.put("buffer.memory", 33554432);
    props.put("key.serializer", "org.apache.kafka.common.serialization.IntegerSerializer");
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    return props;
}

@Bean
public KafkaTemplate<Integer, String> kafkaTemplate() {
    return new KafkaTemplate<Integer, String>(producerFactory());
}
//producer config end

Configure the ConsumerFactory

//consumer config start
@Bean
public ConcurrentKafkaListenerContainerFactory<Integer, String> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<Integer, String> factory = new ConcurrentKafkaListenerContainerFactory<Integer, String>();
    factory.setConsumerFactory(consumerFactory());
    return factory;
}

@Bean
public ConsumerFactory<Integer,String> consumerFactory(){
    return new DefaultKafkaConsumerFactory<Integer, String>(consumerConfigs());
}


@Bean
public Map<String,Object> consumerConfigs(){
    HashMap<String, Object> props = new HashMap<String, Object>();
    props.put("bootstrap.servers", "192.168.180.128:9092");
    props.put("group.id", "test");
    props.put("enable.auto.commit", "true");
    props.put("auto.commit.interval.ms", "1000");
    props.put("key.deserializer", "org.apache.kafka.common.serialization.IntegerDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    return props;
}

//consumer config end

4.2 Create a message producer

import java.util.concurrent.ExecutionException;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.support.SendResult;
import org.springframework.util.concurrent.ListenableFuture;
import org.springframework.util.concurrent.ListenableFutureCallback;

// Use spring-kafka's template to send a message; to send several messages, just loop
public static void main(String[] args) throws ExecutionException, InterruptedException {
    AnnotationConfigApplicationContext ctx = new AnnotationConfigApplicationContext(KafkaConfig.class);
    KafkaTemplate<Integer, String> kafkaTemplate = (KafkaTemplate<Integer, String>) ctx.getBean("kafkaTemplate");
    String data = "this is a test message";
    ListenableFuture<SendResult<Integer, String>> send = kafkaTemplate.send("topic-test", 1, data);
    send.addCallback(new ListenableFutureCallback<SendResult<Integer, String>>() {
        public void onFailure(Throwable throwable) {
            // handle a failed send here
        }

        public void onSuccess(SendResult<Integer, String> integerStringSendResult) {
            // handle a successful send here
        }
    });
}

4.3 Create a message consumer

We first create a class for message listening. When the topic named "topic-test" receives a message, our listen method will be called.

import java.util.concurrent.CountDownLatch;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.kafka.annotation.KafkaListener;

public class SimpleConsumerListener {
    private final static Logger logger = LoggerFactory.getLogger(SimpleConsumerListener.class);
    private final CountDownLatch latch1 = new CountDownLatch(1);

    @KafkaListener(id = "foo", topics = "topic-test")
    public void listen(String data) { // String matches the StringDeserializer configured above
        // do something with the received message here
        logger.info("received: {}", data);
        this.latch1.countDown();
    }
}

     We also need to register this class as a Bean in KafkaConfig:

@Bean
public SimpleConsumerListener simpleConsumerListener(){
    return new SimpleConsumerListener();
}

By default, spring-kafka creates one thread for each listener method to pull messages from the kafka server; see the concurrency sketch below.
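If one thread per listener is not enough, the container factory from section 4.1 can run several consumer threads per listener; a sketch (remember that threads beyond the topic's partition count will sit idle):

    factory.setConcurrency(3); // up to 3 consumer threads for each @KafkaListener method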
