[Kafka] Getting Started with Kafka

One, Kafka Overview

[1] Kafka Introduction

Kafka is a distributed message queue based on the publish/subscribe model, mainly used for real-time processing in the big data field.

[2] Message Queues

1) Message queue usage scenarios


2) Message queue modes

1) Point-to-point mode (one-to-one; consumers actively pull data, and the message is cleared once it is received)
A message producer sends messages to a queue; a consumer then takes messages out of the queue and consumes them. Once a message is consumed, it is no longer stored in the queue, so a consumer cannot consume a message that has already been consumed. A queue supports multiple consumers, but each message can be consumed by only one consumer.
2) Publish/subscribe mode (one-to-many; after data is produced, it is pushed to all subscribers)
A message producer (publisher) publishes messages to a topic, and multiple message consumers (subscribers) consume them. Unlike the point-to-point model, a message published to a topic is consumed by all of its subscribers.
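The two modes above can be contrasted with a small in-process sketch. This is plain Python standard library, not Kafka itself; the class names are made up for the illustration:

```python
from collections import deque

class PointToPointQueue:
    """One-to-one: each message is delivered to exactly one consumer,
    then removed from the queue."""
    def __init__(self):
        self._messages = deque()

    def send(self, msg):
        self._messages.append(msg)

    def receive(self):
        # Pulling a message removes it; no other consumer can see it again.
        return self._messages.popleft() if self._messages else None

class PubSubTopic:
    """One-to-many: every subscriber gets its own copy of each message."""
    def __init__(self):
        self._subscribers = []

    def subscribe(self):
        inbox = deque()
        self._subscribers.append(inbox)
        return inbox

    def publish(self, msg):
        for inbox in self._subscribers:
            inbox.append(msg)

# Point-to-point: only one of two competing consumers gets the message.
q = PointToPointQueue()
q.send("order-1")
print(q.receive())   # first consumer gets "order-1"
print(q.receive())   # second consumer gets nothing (None)

# Publish/subscribe: every subscriber gets the message.
topic = PubSubTopic()
a, b = topic.subscribe(), topic.subscribe()
topic.publish("order-1")
print(a.popleft(), b.popleft())   # both subscribers see "order-1"
```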

3) Benefits of message queues

1) Decoupling
Allows you to extend or modify the processing on either side independently, as long as both sides adhere to the same interface contract.

2) Recoverability
When one component of the system fails, it does not bring down the whole system. A message queue reduces coupling between processes, so even if a message-processing process hangs, the messages in the queue can still be processed after the system recovers.

3) Buffering
Helps control and optimize the speed of data flowing through the system, resolving mismatches between the speed at which messages are produced and the speed at which they are processed.

4) Flexibility & peak-handling capacity
Applications must keep working when traffic spikes sharply, even though such bursts are uncommon. Committing standby resources sized to handle peak load as the norm would be an enormous waste. A message queue lets key components withstand sudden load without collapsing under burst requests.

5) Asynchronous communication
In many cases users neither want nor need to process a message immediately. A message queue provides an asynchronous processing mechanism: users can put messages on the queue without processing them right away, enqueue as many messages as needed, and process them when required.
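The buffering and asynchronous-communication points above can be sketched with a standard-library queue and a worker thread. This is a toy illustration, not Kafka code; the names are made up for the example:

```python
import queue
import threading
import time

buffer = queue.Queue()   # the "message queue" between producer and consumer
processed = []

def worker():
    """Slow consumer: drains the buffer at its own pace."""
    while True:
        msg = buffer.get()
        if msg is None:          # sentinel value: stop the worker
            break
        time.sleep(0.01)         # simulate slow message processing
        processed.append(msg)

t = threading.Thread(target=worker)
t.start()

# The fast producer enqueues immediately and is never blocked
# by the slow consumer -- that is the asynchronous decoupling.
for i in range(5):
    buffer.put(f"msg-{i}")

buffer.put(None)                 # tell the worker to finish
t.join()
print(processed)                 # all five messages, in FIFO order
```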

[3] Kafka architecture


1) Producer: the message producer, i.e. the client that sends messages to the Kafka broker
2) Consumer: the message consumer, i.e. the client that fetches messages from the Kafka broker
3) Consumer Group (CG): a group made up of multiple consumers. Each consumer in a group consumes data from different partitions; a partition can be consumed by only one consumer within a group, and consumer groups do not affect each other. Every consumer belongs to a consumer group, so a consumer group is the logical subscriber
4) Broker: one Kafka server is one broker. A cluster consists of multiple brokers, and one broker can hold multiple topics
5) Topic: can be understood as a queue; both producers and consumers face a topic
6) Partition: for scalability, a very large topic can be distributed across multiple brokers (i.e. servers); a topic can be divided into multiple partitions, each of which is an ordered queue
7) Replica: to guarantee that partition data on a node is not lost when a cluster node fails, and that Kafka can keep working, Kafka provides a replication mechanism; each partition of a topic has several replicas, one leader and several followers
8) Leader: the "master" among a partition's replicas; producers send data to the leader, and consumers consume data from the leader
9) Follower: the "slave" among a partition's replicas; it synchronizes data from the leader in real time. When the leader fails, one follower becomes the new leader
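To illustrate how a record's key relates to partitions: Kafka's default partitioner hashes the key modulo the number of partitions, so the same key always lands in the same partition (and ordering is guaranteed only within a partition). The sketch below uses `crc32` as a stand-in for Kafka's real hash function (murmur2 in the Java client), so the actual partition numbers will differ from a real cluster:

```python
import zlib

NUM_PARTITIONS = 3   # assumed partition count for the example topic

def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # Stand-in for Kafka's partitioner: hash the key bytes, then take
    # the result modulo the partition count.
    return zlib.crc32(key.encode("utf-8")) % num_partitions

# The same key always maps to the same partition.
for key in ["user-1", "user-2", "user-1"]:
    print(key, "-> partition", partition_for(key))
```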

Two, Kafka Installation

[1] Download the installation package

http://kafka.apache.org/downloads.html

[2] Extract the installation package

tar -zxf kafka_2.11-0.11.0.0.tgz -C /opt/modules/
mv kafka_2.11-0.11.0.0/ kafka

[3] Create the logs and data directories

mkdir /opt/modules/kafka/logs
mkdir /opt/modules/kafka/data

[4] Modify the configuration file

vim /opt/modules/kafka/config/server.properties

#Globally unique broker ID; must not be duplicated
broker.id=0
#Enable the delete-topic feature
delete.topic.enable=true
#Number of threads that handle network requests
num.network.threads=3
#Number of threads that handle disk IO
num.io.threads=8
#Send-socket buffer size
socket.send.buffer.bytes=102400
#Receive-socket buffer size
socket.receive.buffer.bytes=102400
#Maximum size of a socket request
socket.request.max.bytes=104857600
#Path where Kafka stores its data logs
log.dirs=/opt/modules/kafka/data
#Number of partitions per topic on this broker
num.partitions=1
#Number of threads used to recover and clean up data under log.dirs
num.recovery.threads.per.data.dir=1
#Maximum time a segment file is retained; it is deleted after this
log.retention.hours=168
#Zookeeper cluster connection addresses
zookeeper.connect=node01:2181,node02:2181,node03:2181

[5] Distribute the installation directory

scp -r /opt/modules/kafka user@host:/opt/modules

Note: after distribution, change broker.id in server.properties on each host so that every broker has a unique ID.

Three, Kafka Basic Operations

1) List all topics on the server
bin/kafka-topics.sh --zookeeper node01:2181 --list

2) Create a topic
bin/kafka-topics.sh --zookeeper node01:2181 --create --replication-factor 3 --partitions 1 --topic first
Option descriptions:
--topic  defines the topic name
--replication-factor  defines the number of replicas
--partitions  defines the number of partitions

3) Delete a topic
bin/kafka-topics.sh --zookeeper node01:2181 --delete --topic first
Note: delete.topic.enable=true must be set in server.properties; otherwise the topic is only marked for deletion (or the brokers must be restarted)

4) Produce messages
bin/kafka-console-producer.sh --broker-list node01:9092 --topic first

5) Consume messages
bin/kafka-console-consumer.sh --bootstrap-server node01:9092 --from-beginning --topic first
Option description:
--from-beginning: reads out all existing data in the topic first from the beginning; include this option only if the use case requires it

6) View the details of a topic
bin/kafka-topics.sh --zookeeper node01:2181 --describe --topic first

Four, Kafka production process analysis

To be continued ...

Origin blog.csdn.net/qq_43733123/article/details/104906413