Big Data: Kafka Key Concepts

1. What do Kafka's ISR and AR represent?

Kafka requires that all replicas in the ISR complete synchronization before a write is considered successfully committed.

AR: all replicas of a partition.

1. AR

    AR (Assigned Replicas) is the list Kafka maintains of all replicas of a partition. AR is divided into ISR and OSR.

    AR = ISR + OSR.

    The AR, ISR, OSR, LEO and HW information is all stored in ZooKeeper.

1. ISR

    ISR (In-Sync Replicas) is the set of replicas that stay synchronized with the leader. Data is considered successfully committed only after every replica in the ISR has finished synchronizing it, and only committed data can be accessed by the outside world.

    While synchronization is still in progress, the newly written data cannot be accessed by the outside world; this is implemented by the LEO-HW mechanism.

2. OSR

    Whether a copy of data synchronization within the OSR leader, does not affect the data submitted, follower in the OSR try to synchronize leader, version data may be left behind.

    Initially all the copies are in the ISR, in the process kafka work, if a copy of sync speed slower than replica.lag.time.max.ms specified threshold, were kicked out of the ISR into OSR, if the follow-up speed recovery can return to the ISR.
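    To see these sets in practice, the topic describe command lists each partition's leader, replica set (AR) and current ISR. A minimal example, reusing the ZooKeeper address and topic name shown later in this article; the output line is illustrative and abridged:

    bin/kafka-topics.sh --zookeeper localhost:2181/kafka --describe --topic topic-config

    Topic: topic-config  Partition: 0  Leader: 1  Replicas: 1,2,3  Isr: 1,2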

3. LEO

    LogEndOffset: the offset of the latest data in the partition. As soon as data is written to the leader, the LEO immediately advances to the newest position. It is, in effect, the marker of the latest written data.

4. HW

    HighWatermark: data is considered committed only after it has been synchronized to all replicas in the ISR, and the HW is then updated to that position. Only data before the HW can be accessed by consumers, which guarantees that data whose synchronization is not yet complete is never visible to consumers. The HW serves as the marker of data that all replicas have synchronized.

    After the leader goes down, a new leader is chosen only from the ISR list. No matter which ISR replica is chosen as the new leader, it is guaranteed to hold all data before the HW, so after the leader switch consumers can still read the data committed before the HW.

    In short, the LEO marks the position of the latest written data, while the HW marks the data whose synchronization is complete; only data before the HW is accessible to the outside world.
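    The relationship between the two marks can be summed up as HW = min(LEO over the ISR). The toy sketch below is not Kafka's implementation, only an illustration of that arithmetic: with ISR LEOs of 10, 8 and 7, the HW is 7, so consumers can read offsets 0 through 6.

    import java.util.Arrays;
    import java.util.List;

    public class HighWatermarkSketch {
        // Toy illustration only: the high watermark is the minimum LEO across the ISR replicas.
        static long highWatermark(List<Long> isrLeos) {
            return isrLeos.stream().mapToLong(Long::longValue).min().orElse(0L);
        }

        public static void main(String[] args) {
            // Leader LEO = 10, follower LEOs = 8 and 7  ->  HW = 7,
            // so offsets 0..6 are visible to consumers.
            System.out.println(highWatermark(Arrays.asList(10L, 8L, 7L)));   // prints 7
        }
    }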

5. HW truncation mechanism

    If the leader goes down and a new leader is elected, the new leader cannot guarantee that it holds all of the previous leader's data, only the data up to the previous HW. In that case all followers must truncate their data to the HW position and then synchronize from the new leader, which guarantees data consistency.

    When the downed leader recovers and finds that its data is inconsistent with the new leader's, it truncates its own data back to the HW position it recorded before going down and then synchronizes from the new leader. For example, if the old leader had written up to LEO 10 but the HW was 8 when it crashed, on recovery it truncates back to offset 8 and re-fetches everything after that. The recovered leader then behaves like any other follower, synchronizing data to keep the partition consistent.

Kafka storage mechanism

    Kafka stores data by topic. A topic contains partitions, a partition can have multiple replicas, and each partition is further divided into several segments.

    A so-called partition is actually a folder created under Kafka's storage directory; the folder name is the topic name plus the partition number, numbered starting from zero.

1. segment

    A so-called segment is actually a group of files produced under the partition folder.

    A partition is divided into a number of equally sized segments. On the one hand this splits the partition's data across multiple files and avoids any single oversized file; on the other hand it allows historical data to be deleted segment by segment, which improves efficiency.

    A segment consists of one .log file and one .index file.

1. .log

    The .log file is the data file, used to store the segment's data.

2. .index

    The .index file is the index file; it holds index information for the corresponding .log file.

    By looking up the .index file, the start position within the .log file of any offset stored in the current segment can be found. Each log record has a fixed format containing the offset, the record length, the key length and other fields, so the end position of the current record can be determined from this fixed format and the data read out.

3. Naming rules

    The two files are named as follows:

    The first segment of a partition starts from 0. Each subsequent segment file is named after its base offset, i.e. the offset following the last message of the previous segment. The value is a 64-bit number rendered as 20 digits, left-padded with zeros.
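    For example, the files of a partition with three segments might look like this (the base offsets are illustrative):

    00000000000000000000.index
    00000000000000000000.log
    00000000000000170410.index
    00000000000000170410.log
    00000000000000239430.index
    00000000000000239430.log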

2. Reading data

    Reading starts from a specified offset in a given partition. First, the offset is compared against the base offsets in the names of all segments of the partition to determine which segment holds the data. Then that segment's index file is consulted to determine the start position of the offset within the data file. Finally, the data file is read from that position, and the fixed record format is used to determine where the record ends, yielding one complete record.
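    To make the lookup path concrete, here is a rough sketch of the search-by-offset idea described above. It is not broker code, only an illustration: step 1 binary-searches the base offsets taken from the segment file names; steps 2 and 3 (index lookup and log scan) are indicated in the comments.

    import java.util.Map;
    import java.util.NavigableMap;
    import java.util.TreeMap;

    public class SegmentLookupSketch {
        // Base offset of each segment (the 20-digit file name) -> segment file name.
        // The String values stand in for real .log/.index file handles.
        private final NavigableMap<Long, String> segmentsByBaseOffset = new TreeMap<>();

        void addSegment(long baseOffset, String name) {
            segmentsByBaseOffset.put(baseOffset, name);
        }

        // Step 1: the segment containing `offset` is the one with the greatest base offset <= offset.
        String findSegment(long offset) {
            Map.Entry<Long, String> entry = segmentsByBaseOffset.floorEntry(offset);
            return entry == null ? null : entry.getValue();
        }

        public static void main(String[] args) {
            SegmentLookupSketch log = new SegmentLookupSketch();
            log.addSegment(0L, "00000000000000000000.log");
            log.addSegment(170410L, "00000000000000170410.log");
            log.addSegment(239430L, "00000000000000239430.log");

            // Offset 170417 falls in the segment whose base offset is 170410.
            // Step 2 (not shown): look up the relative offset 170417 - 170410 in
            // 00000000000000170410.index to obtain a byte position in the .log file.
            // Step 3 (not shown): scan the .log file from that position, using the
            // fixed-size record header to step from record to record.
            System.out.println(log.findSegment(170417L));   // prints 00000000000000170410.log
        }
    }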

 

2. Producer reliability levels

    The explanation above covers reliability inside the Kafka cluster. But when a producer sends data to the Kafka cluster, the data travels over the network, which is not reliable; data may be lost because of network latency, flapping connections and so on.

    Kafka therefore offers producers three reliability levels, using different strategies to provide different reliability guarantees.

    In essence, this setting configures when the leader responds to the client with a success message after receiving data.

    It is set through the request.required.acks parameter:

    1: The producer sends data to the leader; the leader returns a success message as soon as it has received the data, and only after receiving that message does the producer consider the send successful. If no success message is received, the producer considers the send failed and automatically retransmits the data.

    If the leader goes down, data may be lost.

    0: The producer keeps sending data to the leader without requiring any success feedback from the leader.

    This mode has the highest efficiency and the lowest reliability. Data may be lost during transmission, and data may also be lost when the leader goes down.

    -1: The producer sends data to the leader; after receiving the data, the leader waits until all replicas in the ISR list have finished synchronizing it before returning a success message to the producer. If no success message is received, the producer considers the send failed and automatically retransmits the data.

    This mode is highly reliable, but if the ISR list contains only the leader, data may still be lost when the leader goes down.

    This case can be handled with min.insync.replicas, which specifies the minimum number of replicas the ISR must contain. The default value is 1; it needs to be set to 2 or greater.

    With that setting, if a producer sends data to the leader but the ISR contains only the leader itself, an exception is returned indicating that the write failed. The data cannot be written in that situation, which guarantees it is definitely not lost.

    Although data is then never lost, duplicates may be produced. For example, the producer sends data to the leader, and the leader goes down when it has synchronized the data to only half of the ISR followers. A new leader is elected that may already hold this partially committed data, while the producer, having received a failure message, retransmits it; the new leader then receives the same data twice.
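    A minimal producer sketch for the strictest level, using the Java client. The broker address and topic name are illustrative; with the modern Java producer the property is called acks, and acks=-1 is equivalent to acks=all. For the min.insync.replicas=2 protection described above, that property must also be set on the broker or on the topic.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class ReliableProducerSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");            // assumed broker address
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());
            props.put("acks", "-1");     // wait for every replica in the ISR before acknowledging
            props.put("retries", "3");   // retransmit automatically when no success message arrives

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("topic-config", "key", "value"),
                        (metadata, exception) -> {
                            if (exception != null) {
                                // e.g. NotEnoughReplicasException when the ISR is smaller
                                // than min.insync.replicas: the write is rejected, not lost.
                                exception.printStackTrace();
                            }
                        });
            }
        }
    }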

3. Leader election

    When the leader goes down, a follower from the ISR is chosen to become the new leader. But what if all the replicas in the ISR are down?

    The following configuration addresses this problem:

    unclean.leader.election.enable=false

    Strategy 1: wait until a replica from the ISR list comes back alive and choose it as the leader to continue working.

    unclean.leader.election.enable=true

    Strategy 2: choose any surviving replica to become the leader and continue working, even if it is not a follower from the ISR.

    Strategy 1 guarantees reliability but has low availability: Kafka can only recover after the leader that went down last comes back alive.

    Strategy 2 has high availability: work can continue as long as any replica survives, but reliability is not guaranteed and data inconsistency may occur.

4. Kafka delivery guarantees

    At most once: messages may be lost, but are never redelivered.

    At least once: messages are never lost, but may be redelivered.

    Exactly once: each message is delivered once and only once.

    Kafka guarantees at most At least once: it can guarantee that messages are not lost, but they may be duplicated. To resolve duplicates, a unique identifier and a deduplication mechanism must be introduced. Kafka provides a GUID as a unique identifier, but it does not provide a built-in deduplication mechanism; developers must deduplicate themselves according to their business rules.
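    A minimal sketch of the usual at-least-once consumption pattern with the Java client, assuming an illustrative broker address, group id and topic: auto-commit is disabled, records are processed first, and the offset is committed only afterwards, so a crash between the two steps leads to redelivery rather than loss.

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class AtLeastOnceConsumerSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");    // assumed broker address
            props.put("group.id", "demo-group");                 // illustrative group id
            props.put("key.deserializer", StringDeserializer.class.getName());
            props.put("value.deserializer", StringDeserializer.class.getName());
            props.put("enable.auto.commit", "false");            // commit manually, only after processing

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("topic-config"));
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        // Business processing first; deduplication by business key would go here.
                        System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
                    }
                    consumer.commitSync();                        // ... then commit the offsets
                }
            }
        }
    }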

 

 

1. What do Kafka's ISR and AR represent?

    ISR: the set of replicas that keep in sync with the leader

    AR: all replicas of a partition

2. What do Kafka's HW and LEO represent?

    LEO: the offset of the last message in each replica

    HW: the smallest LEO among all replicas of a partition

3. How is message ordering reflected in Kafka?

    Each message in each partition has an offset, so ordering can only be guaranteed within a partition.

4. Do you know Kafka's partitioner, serializer and interceptors? What is the processing order between them?

    Interceptor -> serializer -> partitioner (a configuration sketch follows)
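    A minimal sketch of where each of the three plugs into the producer configuration; the interceptor and partitioner class names are hypothetical placeholders, and the broker address is illustrative:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class ProducerPluginConfigSketch {
        public static Properties build() {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // assumed address
            // 1) Interceptors run first, before serialization.
            props.put(ProducerConfig.INTERCEPTOR_CLASSES_CONFIG, "com.example.MyProducerInterceptor");   // hypothetical class
            // 2) The key/value serializers then turn objects into bytes.
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            // 3) Finally the partitioner decides which partition each record goes to.
            props.put(ProducerConfig.PARTITIONER_CLASS_CONFIG, "com.example.MyPartitioner");   // hypothetical class
            return props;
        }
    }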

5. What does the overall structure of the Kafka producer client look like? How many threads does it use, and what are they?

    Two threads: the main thread, which passes records through the interceptors, serializer and partitioner into a record accumulator (buffer), and the Sender (I/O) thread, which pulls batches from the accumulator and sends them to the brokers.

6. "The number of consumer spending in the group if the topic of partition over, then there will be less consumer spending data," this sentence is correct?

    correct

7. Consumers submitted when submitting the consumption displacement or offset current consumption is offset to the latest news of + 1?

    offset+1

8. Under what circumstances does duplicate consumption occur?

    Consuming first and committing the offset afterwards: if the consumer fails after processing but before committing, the same data is consumed again.

9. What scenarios cause messages to be missed by consumers?

    Committing the offset first and consuming afterwards: if the consumer fails after committing but before processing, that data is never consumed and is effectively lost to the consumer.

10. When you use kafka-topics.sh to create (or delete) a topic, what logic does Kafka execute behind the scenes?

    1) A new topic node is created under the /brokers/topics node in ZooKeeper, e.g. /brokers/topics/first

    2) The Controller's listener is triggered

    3) The Kafka Controller carries out the actual topic creation work and updates the metadata cache

11. Can the number of partitions of a topic be increased? If so, how? If not, why?

Yes, it can be increased:

bin/kafka-topics.sh --zookeeper localhost:2181/kafka --alter --topic topic-config --partitions 3

12. Can the number of partitions of a topic be decreased? If so, how? If not, why?

    No, it cannot: the data of the removed partitions would be difficult to handle.

13. Does Kafka have internal topics? If so, what are they and what are they used for?

    __consumer_offsets, which stores consumer offsets

14. What is the concept of partition assignment in Kafka?

    A topic has multiple partitions and a consumer group has multiple consumers, so partitions need to be assigned to the consumers (round-robin or range); a configuration sketch follows.
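    A minimal sketch showing how the assignment strategy is chosen on the consumer side (the broker address and group id are illustrative); RangeAssignor is the default, and RoundRobinAssignor spreads partitions more evenly across consumers:

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.RoundRobinAssignor;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class AssignorConfigSketch {
        public static Properties build() {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // assumed address
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");                // illustrative group id
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            // Switch from the default range assignment to round-robin assignment.
            props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG, RoundRobinAssignor.class.getName());
            return props;
        }
    }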

15. Outline Kafka's log directory structure.

    Each partition corresponds to a folder named like topic-0, topic-1, which internally contains the .log and .index files.

16. If I specify an offset, how does Kafka find the corresponding message?

    It locates the segment by comparing the offset against the base offsets in the segment file names, looks the offset up in that segment's .index file to get the position in the .log file, and then reads the .log file from that position.

17. What is the role of the Kafka Controller?

    It is responsible for managing broker online/offline status in the cluster, partition replica assignment for all topics, leader election, and so on.

18. Where are elections needed in Kafka? What election strategies do they use?

    Partition leader (chosen from the ISR), controller (first come, first served)

19. What is a failed replica? How is it handled?

    A replica that cannot keep in sync with the leader in time is kicked out of the ISR; after it catches up with the leader it rejoins the ISR.

20. Which design choices give Kafka such high performance?

    Partitioning, sequential disk writes, zero-copy

All server.properties configuration parameters are listed and explained below:

 

 

Parameter

Description

broker.id =0

Unique identifier of each broker in the cluster; must be a non-negative integer. When the server's IP address changes, broker.id does not change, and consumption of messages is not affected

log.dirs=/data/kafka-logs

Directory where Kafka stores its data. Multiple directories can be given, separated by commas, e.g. /data/kafka-logs-1,/data/kafka-logs-2

port =9092

broker server service port

message.max.bytes =6525000

It represents the maximum size of the message body, in bytes

num.network.threads =4

Maximum number of threads the broker uses to process messages; generally does not need to be changed

num.io.threads =8

Number of threads the broker uses for disk I/O; the value should be greater than the number of disks

background.threads =4

Number of threads for background tasks, for example deleting expired message files; normally does not need to be changed

queued.max.requests =500

Maximum number of requests allowed to wait in the I/O thread queue; if the number of waiting requests exceeds this value, the broker stops accepting external messages. This acts as a self-protection mechanism.

host.name

Broker host address. If set, the broker binds to this address; if not set, it binds to all interfaces and publishes one of them to ZooKeeper. Generally not set

socket.send.buffer.bytes=100*1024

Socket send buffer, the SO_SNDBUFF socket tuning parameter

socket.receive.buffer.bytes =100*1024

Socket receive buffer, the SO_RCVBUFF socket tuning parameter

socket.request.max.bytes =100*1024*1024

Maximum size of a socket request, to prevent the server from running out of memory. message.max.bytes must be smaller than socket.request.max.bytes. Will be overridden by the parameter specified when the topic is created

log.segment.bytes =1024*1024*1024

A topic partition is stored as a set of segment files; this controls the size of each segment. Will be overridden by the parameter specified when the topic is created

log.roll.hours =24*7

Even if a log segment has not reached the size set by log.segment.bytes, a new segment is forced to be created after this amount of time. Will be overridden by the parameter specified when the topic is created

log.cleanup.policy = delete

Log cleanup policy; the options are delete and compact. It mainly governs how expired data, or data beyond the log size limit, is handled. Will be overridden by the parameter specified when the topic is created

log.retention.minutes=3days

Maximum time data is stored. Data older than this is processed according to the policy set in log.cleanup.policy; in other words, it determines how long consumers have to consume the data.

Deletion is performed as soon as either log.retention.bytes or log.retention.minutes reaches its limit. Will be overridden by the parameter specified when the topic is created

log.retention.bytes=-1

Maximum file size of each partition of a topic; a topic's size limit = number of partitions * log.retention.bytes. -1 means no size limit. Deletion is performed as soon as either log.retention.bytes or log.retention.minutes reaches its limit. Will be overridden by the parameter specified when the topic is created

log.retention.check.interval.ms=5minutes

Interval at which file sizes are checked to see whether they should be handled according to the policy set in log.cleanup.policy

log.cleaner.enable=false

Whether log compaction is enabled

log.cleaner.threads = 2

Number of threads running log compaction

log.cleaner.io.max.bytes.per.second=None

Maximum I/O throughput for log compaction (bytes per second)

log.cleaner.dedupe.buffer.size=500*1024*1024

Cache space used for deduplication during log compaction; the larger the better, space permitting

log.cleaner.io.buffer.size=512*1024

I/O block size used during log cleaning; generally does not need to be changed

log.cleaner.io.buffer.load.factor =0.9

Expansion factor of the hash table used in log cleaning; generally does not need to be changed

log.cleaner.backoff.ms =15000

Interval at which to check whether log cleaning should be triggered

log.cleaner.min.cleanable.ratio=0.5

Controls the frequency of log cleaning; a larger value means more efficient cleaning but also some wasted space. Will be overridden by the parameter specified when the topic is created

log.cleaner.delete.retention.ms =1day

Maximum retention time for compacted logs, which is also the maximum time a client has to consume those messages. Whereas log.retention.minutes controls uncompacted data, this parameter controls compacted data. Will be overridden by the parameter specified when the topic is created

log.index.size.max.bytes =10*1024*1024

Size limit of a segment's index file. Will be overridden by the parameter specified when the topic is created

log.index.interval.bytes =4096

When a fetch operation is performed, a certain amount of space around the nearest offset is scanned. The larger this is set, the faster the scan, but the more memory it uses; generally this parameter does not need to be touched

log.flush.interval.messages=None

Number of messages accumulated before the log file is synced to disk. Disk I/O is a slow operation, but it is also a necessary means of ensuring data reliability, so setting this parameter is a trade-off between data reliability and performance. If the value is too large, each fsync takes a long time (I/O blocking); if it is too small, fsync happens very often, which adds some latency to overall client requests. If the physical server fails, messages that have not been fsynced are lost.

log.flush.scheduler.interval.ms =3000

Interval at which to check whether the log needs to be flushed to disk

log.flush.interval.ms = None

Controlling the timing of disk writes only by message count is not sufficient. This parameter controls the time interval between fsyncs: if the message count has not reached the threshold but the time since the last disk sync has reached this interval, a flush is also triggered.

log.delete.delay.ms =60000

How long a file is retained after it has been removed from the index; generally does not need to be changed

log.flush.offset.checkpoint.interval.ms =60000

Records the point in time of the last flush to disk, to aid data recovery; generally does not need to be changed

auto.create.topics.enable =true

Whether topics may be created automatically; if false, topics must be created with the command-line tools

default.replication.factor =1

Default replication factor for automatically created topics

num.partitions =1

Default number of partitions for each topic; overridden by the value specified when the topic is created

 

 

The following are the leader and replica configuration parameters in Kafka:

 

controller.socket.timeout.ms =30000

Socket timeout for communication between the partition leader and replicas

controller.message.queue.size=10

Message queue size for data synchronization between the partition leader and replicas

replica.lag.time.max.ms =10000

Maximum time to wait for a replica to respond to the partition leader. If a replica exceeds this time, it is removed from the ISR (in-sync replicas), considered dead, and no longer managed as in sync

replica.lag.max.messages =4000

If a follower falls too far behind the leader, that follower (i.e. the partition replica) is considered to have failed.

## Usually, when a follower communicates with the leader, network latency or dropped connections cause the replica's message synchronization to lag.

## If it lags by too many messages, the leader considers the follower's network latency too high or its message throughput too limited, and removes that replica from the ISR.

## In environments with few brokers or insufficient network capacity, raising this value is recommended.

replica.socket.timeout.ms=30*1000

Socket timeout between follower and leader

replica.socket.receive.buffer.bytes=64*1024

Socket buffer size used when replicating from the leader

replica.fetch.max.bytes =1024*1024

Maximum amount of data a replica fetches at a time

replica.fetch.wait.max.ms =500

Maximum wait time for communication between replicas and the leader; failures are retried

replica.fetch.min.bytes =1

Minimum data size of a fetch; if the leader has less unsynchronized data than this, the fetch blocks until the condition is met

num.replica.fetchers=1

Number of threads used for replication from the leader; increasing this value increases follower I/O

replica.high.watermark.checkpoint.interval.ms =5000

How often each replica checks whether to persist its high watermark to disk

controlled.shutdown.enable =false

Whether the controller is allowed to shut down brokers. If set to true, all leaders on this broker are closed and transferred to other brokers

controlled.shutdown.max.retries =3

Number of attempts for a controlled shutdown

controlled.shutdown.retry.backoff.ms =5000

Interval between shutdown attempts

leader.imbalance.per.broker.percentage =10

Allowed leader imbalance ratio; if this value is exceeded, partitions are rebalanced

leader.imbalance.check.interval.seconds =300

Interval at which to check whether leaders are imbalanced

offset.metadata.max.bytes

Maximum space for the offset metadata retained by clients

ZooKeeper configuration parameters in Kafka

 

zookeeper.connect = localhost:2181

Address of the ZooKeeper cluster; multiple addresses can be given, separated by commas: hostname1:port1,hostname2:port2,hostname3:port3

zookeeper.session.timeout.ms=6000

Maximum ZooKeeper session timeout, i.e. the heartbeat interval. If there is no response within this time, the broker is considered dead. Should not be set too large

zookeeper.connection.timeout.ms =6000

ZooKeeper connection timeout

zookeeper.sync.time.ms =2000

Synchronization time between the leader and followers in the ZooKeeper cluster

 
