Kafka glossary and explanation of principles

One. Kafka glossary

1. topic (subject)

A topic corresponds to a queue in a conventional message queue (MQ) system. When a producer sends a message, it must specify which topic the message goes to. In a large application, topics are separated by function (an order topic, a login topic, an amount topic, and so on).

2. partition

A topic can have multiple partitions. Messages sent to a topic are load-balanced across its partitions, for example by hash(key) % partition_num as described in section two, so that they are distributed evenly over the different partitions.

The number of partitions is generally set to match the number of nodes in the Kafka cluster (i.e. the number of brokers).

3. partition replica (copy of a partition)

A partition replica is a copy of a partition's data, kept to prevent data loss. A partition and its replicas are not placed on the same broker. For high availability, the number of replicas is kept consistent with the number of partitions (the replication factor cannot exceed the number of brokers).
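The rule that the replicas of one partition live on different brokers can be illustrated with a small sketch. This is a simplified placement scheme written for this article, not Kafka's actual assignment algorithm; the class name `ReplicaPlacementSketch` and the method `brokersFor` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration only; this is NOT Kafka's real replica assignment.
final class ReplicaPlacementSketch {

    // Returns the broker ids (0-based) that hold the replicas of one partition.
    // Replica i of partition p is placed on broker (p + i) % brokerCount, so
    // the replicas of a single partition never share a broker.
    static List<Integer> brokersFor(int partition, int replicationFactor, int brokerCount) {
        if (replicationFactor > brokerCount) {
            throw new IllegalArgumentException("replication factor cannot exceed broker count");
        }
        List<Integer> brokers = new ArrayList<>();
        for (int i = 0; i < replicationFactor; i++) {
            brokers.add((partition + i) % brokerCount);
        }
        return brokers;
    }

    public static void main(String[] args) {
        // 3 partitions, 2 replicas each, 3 brokers
        for (int p = 0; p < 3; p++) {
            System.out.println("partition " + p + " -> brokers " + brokersFor(p, 2, 3));
        }
    }
}
```

Because each replica of a partition lands on a distinct broker, losing one broker still leaves a copy of every partition on another broker.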

4. broker

A Kafka node. Each Kafka node is a broker, and multiple brokers can form a cluster. The broker id is generally taken from the last 3 digits of the node's IP address.

5. segment

Physically, a partition is divided into multiple segments; each segment stores message data.
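As a rough illustration of how a partition split into segments can be searched, the sketch below keeps segments in a sorted map keyed by the first offset each segment holds (its base offset) and finds the segment for a given offset with a floor lookup. Kafka names its segment log files after the base offset, zero-padded to 20 digits; the class `SegmentLookupSketch` and its methods are hypothetical and only mimic that layout in memory.

```java
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration of segment lookup; it does not read real Kafka log files.
final class SegmentLookupSketch {

    // base offset of a segment -> name of the segment file
    private final TreeMap<Long, String> segments = new TreeMap<>();

    // Registers a segment whose first message has the given offset.
    void addSegment(long baseOffset) {
        segments.put(baseOffset, String.format("%020d.log", baseOffset));
    }

    // Finds the segment that contains the offset: the one with the greatest
    // base offset that is less than or equal to the requested offset.
    String segmentFor(long offset) {
        Map.Entry<Long, String> entry = segments.floorEntry(offset);
        if (entry == null) {
            throw new IllegalArgumentException("offset " + offset + " is before the first segment");
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        SegmentLookupSketch log = new SegmentLookupSketch();
        log.addSegment(0L);
        log.addSegment(1000L);
        log.addSegment(2000L);
        System.out.println(log.segmentFor(1500L));
    }
}
```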

6. producer

Produces messages and sends them to a topic.

7. consumer

Subscribes to the specified topics and consumes the messages on them.

8. consumer group

A consumer group is composed of one or more consumers.

Two. Terms and principles explained

1. partition

A Kafka message is a key-value pair. A message may also carry only a topic and a value; when no key is given, the key defaults to null. In most cases a key is assigned, and it serves two purposes:

   1. metadata information

   2. partitioning: the key acts as a routing key, so that the same batch of data is written to the same partition

A message is a ProducerRecord object. It must contain the two parameters topic and value; partition and key are optional.

All messages with the same key are assigned to the same partition.

When the key is null, the default partitioner is used. It chooses one of the topic's partitions for the ProducerRecord without hashing anything, so that the data is distributed as evenly as possible over the topic and data skew is avoided.

If a key is explicitly specified, the partitioner hashes the key and takes the result modulo the number of partitions to determine which partition of the topic stores the message.
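The routing rules above can be sketched in plain Java. This is a minimal illustration, not Kafka's partitioner: Kafka hashes the serialized key with murmur2, while this sketch substitutes `Arrays.hashCode` on the UTF-8 bytes, and for null keys it simply rotates through the partitions. The class `KeyPartitionSketch` and the method `partitionFor` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Illustration of key-based routing; the hash function is a stand-in for murmur2.
final class KeyPartitionSketch {

    // Used only for records without a key.
    private final AtomicInteger counter = new AtomicInteger();

    int partitionFor(String key, int numPartitions) {
        if (key == null) {
            // No key: rotate through the partitions to spread records evenly.
            return (counter.getAndIncrement() & 0x7fffffff) % numPartitions;
        }
        // Key present: hash the key bytes, force the hash to be non-negative,
        // then take it modulo the number of partitions.
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        return (hash & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        KeyPartitionSketch sketch = new KeyPartitionSketch();
        // The same key always maps to the same partition.
        System.out.println(sketch.partitionFor("appointKey", 3));
        System.out.println(sketch.partitionFor("appointKey", 3));
        // Records without a key are spread over the partitions.
        System.out.println(sketch.partitionFor(null, 3));
        System.out.println(sketch.partitionFor(null, 3));
    }
}
```

Note that the result depends on the partition count: if partitions are added to the topic, the same key can map to a different partition than before.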

Let's run a test: which partition does the data go to when the message has a key, and when it has no key?

    When the message has a key

package com.zpb.kafka;

import java.util.Properties;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * 
 * @des     Test Kafka partition assignment
 * @author  zhao
 * @date    2019-06-27 00:17:55
 *
 */
public class PartitionExample {
    
    private static final Logger LOG = LoggerFactory.getLogger(PartitionExample.class);
    
    public static void main(String[] args) throws InterruptedException, ExecutionException {
        
        Properties properties = initProp();
        KafkaProducer<String, String> producer = new KafkaProducer<>(properties);
        // Record with an explicit key
        ProducerRecord<String, String> record = new ProducerRecord<>("test_partition", "appointKey", "hello");
        Future<RecordMetadata> future = producer.send(record);
        // Block until the broker acknowledges, then log the partition that was used
        RecordMetadata recordMetadata = future.get();
        LOG.info(">>>>>>>>>>>>>>>>>> {}", recordMetadata.partition());
        
        // Second record with the same key
        record = new ProducerRecord<>("test_partition", "appointKey", "world");
        future = producer.send(record);
        recordMetadata = future.get();
        LOG.info(">>>>>>>>>>>>>>>>>> {}", recordMetadata.partition());
         
        producer.flush();
        producer.close();
        System.out.println("====================================");
    }
    
    // Producer configuration: broker addresses and String serializers for key and value
    private static Properties initProp() {
        Properties prop = new Properties();
        prop.put("bootstrap.servers", "192.168.199.11:9092,192.168.199.12:9092,192.168.199.13:9092");
        prop.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        prop.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        
        return prop;
    }
}

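// The log for this run shows the two records on partitions 1 and 0. With the default partitioner, two records carrying the same key ("appointKey") hash to the same partition as long as the topic's partition count is unchanged, so differing partitions are normally seen only for records without a key.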

22:21:06.231 [main] INFO com.zpb.kafka.PartitionExample - >>>>>>>>>>>>>>>>>> 1

22:21:06.258 [main] INFO com.zpb.kafka.PartitionExample - >>>>>>>>>>>>>>>>>> 0

    When the message has no key

package com.zpb.kafka;

import java.util.Properties;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * 
 * @des     Test Kafka partition assignment
 * @author  zhao
 * @date    2019-06-27 00:17:55
 *
 */
public class PartitionExample {
    
    private static final Logger LOG = LoggerFactory.getLogger(PartitionExample.class);
    
    public static void main(String[] args) throws InterruptedException, ExecutionException {
        
        Properties properties = initProp();
        KafkaProducer<String, String> producer = new KafkaProducer<>(properties);
        // Record without a key: only topic and value are given
        ProducerRecord<String, String> record = new ProducerRecord<>("test_partition", "hello");
        Future<RecordMetadata> future = producer.send(record);
        // Block until the broker acknowledges, then log the partition that was used
        RecordMetadata recordMetadata = future.get();
        LOG.info(">>>>>>>>>>>>>>>>>> {}", recordMetadata.partition());
        
        // Second record, again without a key
        record = new ProducerRecord<>("test_partition", "world");
        future = producer.send(record);
        recordMetadata = future.get();
        LOG.info(">>>>>>>>>>>>>>>>>> {}", recordMetadata.partition());
         
        producer.flush();
        producer.close();
        System.out.println("====================================");
    }
    
    // Producer configuration: broker addresses and String serializers for key and value
    private static Properties initProp() {
        Properties prop = new Properties();
        prop.put("bootstrap.servers", "192.168.199.11:9092,192.168.199.12:9092,192.168.199.13:9092");
        prop.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        prop.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        
        return prop;
    }
}

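// The log for this run shows both records on partition 2. Without a key this is not guaranteed: the default partitioner may pick a different partition for each record or batch.

22:29:29.963 [main] INFO com.zpb.kafka.PartitionExample - >>>>>>>>>>>>>>>>>> 2

22:29:29.969 [main] INFO com.zpb.kafka.PartitionExample - >>>>>>>>>>>>>>>>>> 2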

Conclusions from the test above:
  When a key, or a group of keys, is mapped to a partition, all partitions of the topic take part in the calculation of the mapping, not only the partitions that are currently available. With multiple partitions, a partition that has gone down is still included in the calculation. This means that if a write is routed to the partition that is down, the send will fail.
  Within a consumer group, a partition is read by only one consumer client. Multiple consumers in the same group cannot read the same partition.
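The consumer-group rule (within one group, each partition is read by exactly one consumer) can be sketched as a simple round-robin assignment. This is a simplified illustration and not one of Kafka's built-in assignors; the class `GroupAssignmentSketch`, the method `assign`, and the consumer names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified round-robin assignment of partitions to the consumers of one group.
final class GroupAssignmentSketch {

    // Every partition gets exactly one owner within the group. A consumer may
    // own several partitions, or none if there are more consumers than partitions.
    static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        for (int partition = 0; partition < numPartitions; partition++) {
            String owner = consumers.get(partition % consumers.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Two consumers share three partitions.
        System.out.println(assign(Arrays.asList("consumer-1", "consumer-2"), 3));
        // Four consumers, three partitions: one consumer stays idle.
        System.out.println(assign(Arrays.asList("c1", "c2", "c3", "c4"), 3));
    }
}
```

The second call shows why adding more consumers than partitions to a group does not increase parallelism: the extra consumer receives no partition.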

 


Origin www.cnblogs.com/MrRightZhao/p/11094707.html