Kafka Partition分发策略

版权声明:作者:jiankunking 出处:http://blog.csdn.net/jiankunking 本文版权归作者和CSDN共有,欢迎转载,但未经作者同意必须保留此段声明,且在文章页面明显位置给出原文连接。 https://blog.csdn.net/xunzaosiyecao/article/details/84584528

今天突然想起一个问题,当producer往kafka写数据的时候,如果不指定Patition,也没有指定Key的话,那么它是怎么做的负载均衡?

通过查看kafka源码,发现Kafka Java客户端有默认的partition分配机制。

实现如下:

 /**
     * Compute the partition for the given record.
     *
     * @param topic The topic name
     * @param key The key to partition on (or null if no key)
     * @param keyBytes serialized key to partition on (or null if no key)
     * @param value The value to partition on or null
     * @param valueBytes serialized value to partition on or null
     * @param cluster The current cluster metadata
     */
    public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
        List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
        int numPartitions = partitions.size();
        if (keyBytes == null) {
            int nextValue = nextValue(topic);
            List<PartitionInfo> availablePartitions = cluster.availablePartitionsForTopic(topic);
            if (availablePartitions.size() > 0) {
                int part = Utils.toPositive(nextValue) % availablePartitions.size();
                return availablePartitions.get(part).partition();
            } else {
                // no partitions are available, give a non-available partition
                return Utils.toPositive(nextValue) % numPartitions;
            }
        } else {
            // hash the keyBytes to choose a partition
            return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
        }
    }

    private int nextValue(String topic) {
        AtomicInteger counter = topicCounterMap.get(topic);
        if (null == counter) {
            counter = new AtomicInteger(ThreadLocalRandom.current().nextInt());
            AtomicInteger currentCounter = topicCounterMap.putIfAbsent(topic, counter);
            if (currentCounter != null) {
                counter = currentCounter;
            }
        }
        return counter.getAndIncrement();
    }

从源码可以看出,首先获取topic的所有Patition,如果客户端不指定Patition,也没有指定Key的话,使用自增长的数字取余数的方式实现指定的Partition。这样Kafka将平均的向Partition中生产数据。

如果想要控制发送的partition,则有两种方式,一种是指定partition,另一种就是根据Key自己写算法。继承Partitioner接口,实现其partition方法。并且配置启动参数

props.put("partitioner.class","TestPartitioner")

源码地址:https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/producer/internals/DefaultPartitioner.java
个人微信公众号:
这里写图片描述

作者:jiankunking 出处:http://blog.csdn.net/jiankunking

猜你喜欢

转载自blog.csdn.net/xunzaosiyecao/article/details/84584528