1 Introduction
The previous post walked through the producer's core doSend() method. Right after the interceptors, the first step of a send is to obtain the cluster metadata:
clusterAndWaitTime = waitOnMetadata(record.topic(), record.partition(), maxBlockTimeMs);
Since the doSend() flow itself is long, this post focuses on Kafka's cluster metadata (Metadata): what it is for, which fields it holds, its core methods, and how the update flow is implemented. The book 《Apache Kafka源码剖析》 covers this in sections 2.2.2 (Metadata) and 2.4.4 (MetadataUpdater).
2 The metadata's purpose and fields
As mentioned in the earlier post on Kafka fundamentals, every topic has multiple partitions, and the replicas of a partition may be spread across different brokers in the cluster. From the producer's perspective, both the number of partitions and the placement of replicas change dynamically. Moreover, when sending a message the producer usually only specifies the topic name, not a partition number. To meet these requirements the producer needs to know the address and port of the broker hosting the leader replica of the target partition, so that it can open a connection and deliver the message to Kafka. For this reason the Kafka producer maintains cluster metadata (Metadata), which contains the information of every partition of every topic: leader, leader_epoch, controller_epoch, isr, replicas, and so on.
The related classes are:
Node (org.apache.kafka.common): represents one node in the cluster. Its fields:
/**
 * Information about a Kafka node (a node in the cluster)
*/
public class Node {
private final int id;
private final String idString;
private final String host;
private final int port;
private final String rack;//rack information
TopicPartition: represents a single partition of a given topic.
/**
* A topic name and partition number
*/
public final class TopicPartition implements Serializable {
private static final long serialVersionUID = -613627415771699627L;
private int hash = 0;
private final int partition;//the partition number of this partition within the topic
private final String topic;//the topic name
PartitionInfo: detailed information about a single partition.
public class PartitionInfo {
private final String topic;
private final int partition;
//the node hosting the leader replica
private final Node leader;
//the nodes hosting all replicas
private final Node[] replicas;
//the nodes hosting the replicas in the ISR (in-sync replica) set
private final Node[] inSyncReplicas;
//the nodes hosting offline replicas
private final Node[] offlineReplicas;
Combining these three classes gives a complete picture of the cluster metadata the Kafka producer needs; it is stored in org.apache.kafka.common.Cluster.
/**
* A representation of a subset of the nodes, topics, and partitions in the Kafka cluster.
 * All fields are private final and only query methods are exposed, so the object is immutable and therefore thread-safe.
*/
public final class Cluster {
private final boolean isBootstrapConfigured;
private final List<Node> nodes;
private final Set<String> unauthorizedTopics;
private final Set<String> invalidTopics;
private final Set<String> internalTopics;
private final Node controller;
//map from TopicPartition to PartitionInfo
private final Map<TopicPartition, PartitionInfo> partitionsByTopicPartition;
//map from topic name to the list of PartitionInfo for that topic
private final Map<String, List<PartitionInfo>> partitionsByTopic;
//map from topic name to the list of PartitionInfo for partitions that currently have a leader
private final Map<String, List<PartitionInfo>> availablePartitionsByTopic;
//map from node id to the list of PartitionInfo hosted on that node
private final Map<Integer, List<PartitionInfo>> partitionsByNode;
private final Map<Integer, Node> nodesById;
private final ClusterResource clusterResource;
Cluster mainly provides a variety of query methods for looking up cluster metadata; note again that it is thread-safe.
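For illustration, here is a minimal sketch of how caller code might query a Cluster snapshot. The query methods used (partitionCountForTopic, availablePartitionsForTopic, leaderFor) come from the Kafka clients API; the class name and the printed output are purely illustrative.
// A minimal sketch of querying a Cluster snapshot on the producer side.
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.Node;
import org.apache.kafka.common.PartitionInfo;
import org.apache.kafka.common.TopicPartition;
import java.util.List;

public class ClusterQueryExample {
    public static void describeTopic(Cluster cluster, String topic) {
        // how many partitions does this topic have (null if unknown)?
        Integer count = cluster.partitionCountForTopic(topic);
        System.out.println("partitions of " + topic + ": " + count);
        // partitions that currently have a leader and are therefore writable
        List<PartitionInfo> available = cluster.availablePartitionsForTopic(topic);
        for (PartitionInfo info : available) {
            Node leader = cluster.leaderFor(new TopicPartition(topic, info.partition()));
            System.out.println("partition " + info.partition() + " -> leader " + leader);
        }
    }
}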
Metadata wraps the Cluster object and the update listeners, and also records the last update time of the cluster data, the version number, whether an update is needed, and so on.
Its fields are:
public class Metadata implements Closeable {
private static final Logger log = LoggerFactory.getLogger(Metadata.class);
public static final long TOPIC_EXPIRY_MS = 5 * 60 * 1000;
private static final long TOPIC_EXPIRY_NEEDS_UPDATE = -1L;
//minimum time between two metadata updates (prevents refreshing too frequently)
private final long refreshBackoffMs;
//metadata expiry interval; defaults to 5 minutes
private final long metadataExpireMs;
//version number, incremented on every successful update; used to tell whether an update has completed
private int version;
//timestamp of the last metadata update attempt (including failed ones)
private long lastRefreshMs;
//timestamp of the last successful metadata update
private long lastSuccessfulRefreshMs;
//authentication failure, if one occurred
private AuthenticationException authenticationException;
//the current Kafka cluster metadata
private Cluster cluster;
//flag marking that a forced cluster update is needed
private boolean needUpdate;
/* Topics with their expiry times */
private final Map<String, Long> topics;
//listeners notified when the metadata is updated
private final List<Listener> listeners;
//ClusterResourceListeners invoked when a metadata update is received
private final ClusterResourceListeners clusterResourceListeners;
//whether metadata for all topics is required
private boolean needMetadataForAllTopics;
//whether a topic should be auto-created when it does not exist
private final boolean allowAutoTopicCreation;
//defaults to true; the producer periodically removes expired topics
private final boolean topicExpiryEnabled;
//whether this Metadata instance has been closed
private boolean isClosed;
Now look at the Metadata methods called by the main thread: requestUpdate() and awaitUpdate().
/**
* Request an update of the current cluster metadata info, return the current version before the update
*/
public synchronized int requestUpdate() {
//true means the cluster metadata must be force-updated
this.needUpdate = true;
//return the current version of the cluster metadata
return this.version;
}
/**
* Wait for metadata update until the current version is larger than the last version we know of
*/
public synchronized void awaitUpdate(final int lastVersion, final long maxWaitMs) throws InterruptedException {
if (maxWaitMs < 0)
throw new IllegalArgumentException("Max time to wait for metadata updates should not be < 0 milliseconds");
long begin = System.currentTimeMillis();
long remainingWaitMs = maxWaitMs;
//compare versions: loop until the metadata update succeeds (version is incremented) or the instance is closed
while ((this.version <= lastVersion) && !isClosed()) {
//pick up any authentication error raised while updating
AuthenticationException ex = getAndClearAuthenticationException();
if (ex != null)
throw ex;
if (remainingWaitMs != 0)
//the wait() call shows that the main thread and the Sender thread synchronize via wait/notify; the actual metadata update is carried out by the Sender thread
wait(remainingWaitMs);
long elapsed = System.currentTimeMillis() - begin;
if (elapsed >= maxWaitMs)//timeout
throw new TimeoutException("Failed to update metadata after " + maxWaitMs + " ms.");
remainingWaitMs = maxWaitMs - elapsed;
}
if (isClosed())
throw new KafkaException("Requested metadata update after close");
}
requestUpdate() simply sets the needUpdate flag so that the metadata will be force-updated the next time the Sender thread runs.
awaitUpdate() ensures consistency by comparing version numbers, similar to optimistic locking: the main thread stays blocked until the Sender thread has successfully updated the metadata. Note that the Metadata fields are read by the main thread and written by the Sender thread; the two coordinate through the wait/notify mechanism, and the synchronized methods keep Metadata thread-safe.
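To make the synchronization pattern concrete, here is a self-contained sketch (not Kafka's code, just the same idea boiled down): one thread requests an update and waits on the version number, while another thread applies the update, bumps the version, and wakes all waiters.
// Simplified sketch of the wait/notify + version pattern used by Metadata.
public class VersionedSync {
    private int version = 0;
    private boolean needUpdate = false;

    // caller thread: ask for an update and return the version it must wait to pass
    public synchronized int requestUpdate() {
        needUpdate = true;
        return version;
    }

    // caller thread: block until version > lastVersion or the timeout expires
    public synchronized void awaitUpdate(int lastVersion, long maxWaitMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + maxWaitMs;
        while (version <= lastVersion) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0)
                throw new RuntimeException("timed out waiting for update");
            wait(remaining);
        }
    }

    // updater thread (the Sender in the real client): apply the update and wake waiters
    public synchronized void update() {
        needUpdate = false;
        version += 1;
        notifyAll();
    }
}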
Note that this post only covers the producer in the clients module. The brokers in the core module also maintain a MetadataCache, which is read and updated through KafkaApis; I have not read that code yet, so it is out of scope here and will be covered in a later post.
3 The update request flow
Back to the doSend() call from the beginning of this post; the code lives in KafkaProducer.
3.1 Sending the request
/**
* Wait for cluster metadata including partitions for the given topic to be available.
* @param topic The topic we want metadata for
* @param partition A specific partition expected to exist in metadata, or null if there's no preference
* @param maxWaitMs The maximum time in ms for waiting on the metadata
* @return The cluster containing topic metadata and the amount of time we waited in ms
* @throws KafkaException for all Kafka-related exceptions, including the case where this method is called after producer close
*/
private ClusterAndWaitTime waitOnMetadata(String topic, Integer partition, long maxWaitMs) throws InterruptedException {
// add topic to metadata topic list if it is not there already and reset expiry
Cluster cluster = metadata.fetch();
//check whether the topic is invalid; throw if it is
if (cluster.invalidTopics().contains(topic))
throw new InvalidTopicException(topic);
// if the topic is not yet in the metadata, add it to the metadata topic set
metadata.add(topic);
//look up the topic's partition count from the cached cluster
Integer partitionsCount = cluster.partitionCountForTopic(topic);
// Return cached metadata if we have it, and if the record's partition is either undefined
// or within the known partition range (if the cached metadata already covers it, return a ClusterAndWaitTime immediately)
if (partitionsCount != null && (partition == null || partition < partitionsCount))
return new ClusterAndWaitTime(cluster, 0);
long begin = time.milliseconds();
// the remaining wait time, initialized to the maximum
long remainingWaitMs = maxWaitMs;
long elapsed;
// Issue metadata requests until we have metadata for the topic or maxWaitTimeMs is exceeded.
// In case we already have cached metadata for the topic, but the requested partition is greater
// than expected, issue an update request only once. This is necessary in case the metadata
// is stale and the number of partitions for this topic has increased in the meantime.
do {
log.trace("Requesting metadata update for topic {}.", topic);
metadata.add(topic);
//request a metadata update; this returns the current version before the update
int version = metadata.requestUpdate();
// wake up the Sender thread
sender.wakeup();
try {// wait for the metadata update until the current version is larger than the version we know of
metadata.awaitUpdate(version, remainingWaitMs);
} catch (TimeoutException ex) {
// Rethrow with original maxWaitMs to prevent logging exception with remainingWaitMs
throw new TimeoutException("Failed to update metadata after " + maxWaitMs + " ms.");
}
// once the metadata has been updated, fetch the cluster snapshot again
cluster = metadata.fetch();
//compute the elapsed time
elapsed = time.milliseconds() - begin;
// if the maximum wait time has been exceeded, throw a metadata update timeout
if (elapsed >= maxWaitMs)
throw new TimeoutException("Failed to update metadata after " + maxWaitMs + " ms.");
// throw if the cluster's unauthorized topic set contains this topic
if (cluster.unauthorizedTopics().contains(topic))
throw new TopicAuthorizationException(topic);
// throw if the cluster's invalid topic set contains this topic
if (cluster.invalidTopics().contains(topic))
throw new InvalidTopicException(topic);
remainingWaitMs = maxWaitMs - elapsed;
// fetch the topic's partition count again
partitionsCount = cluster.partitionCountForTopic(topic);
} while (partitionsCount == null);// keep looping until the topic's partition count is known
if (partition != null && partition >= partitionsCount) {
throw new KafkaException(
String.format("Invalid partition given with record: %d is not in the range [0...%d).", partition, partitionsCount));
}
// return the cluster snapshot together with the time we waited
return new ClusterAndWaitTime(cluster, elapsed);
}
The main steps are:
1) Check whether the metadata already contains the requested topic; if not, add the topic to the topics set so that the next update fetches this topic's metadata from the server.
2) Try to get the partition details of the topic. If that fails, request a metadata update; as long as the metadata has not been refreshed, the method keeps spinning in the do ... while loop, and throws an exception on timeout.
Inside the loop:
metadata.requestUpdate() sets the metadata's needUpdate flag to true (forcing an update) and returns the current version, which is later used to tell whether the update has completed; sender.wakeup() wakes up the Sender thread, which in turn drives the NetworkClient to refresh the cached metadata; metadata.awaitUpdate(version, remainingWaitMs) then waits for the Sender thread to finish the update.
Inside Metadata.awaitUpdate(), the calling thread blocks in the while loop via wait() until the metadata is updated successfully or the call times out. A minimal caller-side sketch that triggers this path is shown below.
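As a caller-side illustration, the following minimal sketch triggers exactly this path: the first send() to a topic goes through waitOnMetadata() and may block until the Sender thread has fetched metadata for that topic. The broker address, topic name, and max.block.ms value are placeholders.
// Minimal producer sketch: the first send() blocks in waitOnMetadata() up to max.block.ms.
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;

public class FirstSendExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("max.block.ms", "10000"); // upper bound for the waitOnMetadata() blocking

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // the first send for "demo-topic" forces a metadata update and may block;
            // later sends hit the cached Cluster and return quickly
            producer.send(new ProducerRecord<>("demo-topic", "key", "value"));
        }
    }
}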
3.2 Updating the metadata
So how does the metadata actually get updated? As mentioned above, sender.wakeup() wakes up the Sender thread, which indirectly drives the NetworkClient. Looking at the Sender source, its run() method ends up calling KafkaClient.poll(), and the concrete implementation of that interface is NetworkClient. The source is as follows:
public List<ClientResponse> poll(long timeout, long now) {
if (!abortedSends.isEmpty()) {
// If there are aborted sends because of unsupported version exceptions or disconnects,
// handle them immediately without waiting for Selector#poll.
List<ClientResponse> responses = new ArrayList<>();
handleAbortedSends(responses);
completeResponses(responses);
return responses;
}
// decide whether the metadata needs to be updated
long metadataTimeout = metadataUpdater.maybeUpdate(now);
try {//perform the actual network I/O via the Selector's poll
this.selector.poll(Utils.min(timeout, metadataTimeout, defaultRequestTimeoutMs));
} catch (IOException e) {
log.error("Unexpected error during I/O", e);
}
// process completed actions
long updatedNow = this.time.milliseconds();
List<ClientResponse> responses = new ArrayList<>();
handleCompletedSends(responses, updatedNow);
//handle any completed receives (responses)
handleCompletedReceives(responses, updatedNow);
handleDisconnections(responses, updatedNow);
handleConnections();
handleInitiateApiVersionRequests(updatedNow);
handleTimedOutRequests(responses, updatedNow);
//invoke callback
completeResponses(responses);
return responses;
}
The key steps of this method are:
- metadataUpdater.maybeUpdate(now): decide whether the metadata needs to be updated; if so, first establish a connection to a broker and then send the metadata update request;
- handle the responses from the server side; here we mainly care about handleCompletedReceives(responses, updatedNow), which processes the metadata result returned by the server.
3.3 MetadataUpdater
Before diving into metadataUpdater.maybeUpdate(), let's first look at what MetadataUpdater is.
/**
* The interface used by `NetworkClient` to request cluster metadata info to be updated and to retrieve the cluster nodes
* from such metadata. This is an internal class.
 * <p> This interface is used by NetworkClient; its implementations are DefaultMetadataUpdater (an inner class of NetworkClient) and ManualMetadataUpdater (essentially a no-op).
* This class is not thread-safe!
*/
public interface MetadataUpdater extends Closeable {
List<Node> fetchNodes();
boolean isUpdateDue(long now);
long maybeUpdate(long now);
...
}
MetadataUpdater is an interface that helps NetworkClient update the metadata. It has two implementations: ManualMetadataUpdater (a no-op) and DefaultMetadataUpdater (the default, an inner class of NetworkClient).
class DefaultMetadataUpdater implements MetadataUpdater {
/* the current cluster metadata */
private final Metadata metadata;
/* true iff there is a metadata request that has been sent and for which we have not yet received a response */
//true iff a metadata request has been sent and we have not yet received its response
private boolean metadataFetchInProgress;
Now let's look at the maybeUpdate() method:
@Override
//core method: decide whether the current metadata needs to be updated
public long maybeUpdate(long now) {
// should we update our metadata?
// time until the next metadata update (a forced update is due immediately; otherwise it is based on the metadata expiry time)
long timeToNextMetadataUpdate = metadata.timeToNextUpdate(now);
// if a metadata request is already in flight, wait defaultRequestTimeoutMs (30s by default)
long waitForMetadataFetch = this.metadataFetchInProgress ? defaultRequestTimeoutMs : 0;
//time remaining before the next metadata request may be sent
long metadataTimeout = Math.max(timeToNextMetadataUpdate, waitForMetadataFetch);
if (metadataTimeout > 0) {// not time yet; return how long to wait
return metadataTimeout;
}
// Beware that the behavior of this method and the computation of timeouts for poll() are
// highly dependent on the behavior of leastLoadedNode.
// pick the least-loaded node; returns null if no node is available
Node node = leastLoadedNode(now);
if (node == null) {
log.debug("Give up sending metadata request since no node is available");
return reconnectBackoffMs;
}
//build and queue the MetadataRequest; it is actually sent on the next poll()
return maybeUpdate(now, node);
}
/**
* Add a metadata request to the list of sends if we can make one
*/
private long maybeUpdate(long now, Node node) {
String nodeConnectionId = node.idString();
//check whether we are allowed to send a request to this node
if (canSendRequest(nodeConnectionId, now)) {
// about to send, so mark metadataFetchInProgress as true
this.metadataFetchInProgress = true;
MetadataRequest.Builder metadataRequest; // build the metadata request
if (metadata.needMetadataForAllTopics())// request metadata for all topics
metadataRequest = MetadataRequest.Builder.allTopics();
else//only request the topics recorded in the metadata (added via metadata.add())
metadataRequest = new MetadataRequest.Builder(new ArrayList<>(metadata.topics()),
metadata.allowAutoTopicCreation());
log.debug("Sending metadata request {} to node {}", metadataRequest, node);
//send the MetadataRequest
sendInternalMetadataRequest(metadataRequest, nodeConnectionId, now);
return defaultRequestTimeoutMs;
}
// If there's any connection establishment underway, wait until it completes. This prevents
// the client from unnecessarily connecting to additional nodes while a previous connection
// attempt has not been completed.
if (isAnyNodeConnecting()) {// if the client is still connecting to some node, just wait
// Strictly the timeout we should return here is "connect timeout", but as we don't
// have such application level configuration, using reconnect backoff instead.
return reconnectBackoffMs;
}
// if there is no connection to this node yet, initiate one
if (connectionStates.canConnect(nodeConnectionId, now)) {
// we don't have a connection to this node right now, make one
log.debug("Initialize connection to node {} for sending metadata request", node);
initiateConnect(node, now);
return reconnectBackoffMs;
}
// connected, but can't send more OR connecting
// In either case, we just need to wait for a network event to let us know the selected
// connection might be usable again.
return Long.MAX_VALUE;
}
If an update is needed, a MetadataRequest is sent. Before sending, metadataFetchInProgress is set to true, and the least-loaded node is chosen as the target. Load is judged by the number of unacknowledged requests in a node's InFlightRequests queue: the more in-flight requests, the heavier the load. From there the request follows the same path as any ordinary request. A simplified sketch of this node selection idea is shown below.
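Below is a simplified sketch of the least-loaded-node idea rather than the real leastLoadedNode() implementation; the in-flight-count map merely stands in for the InFlightRequests bookkeeping and is purely illustrative.
// Sketch: pick the candidate node with the fewest unacknowledged in-flight requests.
import org.apache.kafka.common.Node;
import java.util.List;
import java.util.Map;

public class LeastLoadedNodeSketch {
    public static Node leastLoadedNode(List<Node> candidates, Map<Node, Integer> inFlightCounts) {
        Node best = null;
        int bestLoad = Integer.MAX_VALUE;
        for (Node node : candidates) {
            int load = inFlightCounts.getOrDefault(node, 0);
            if (load < bestLoad) { // fewer in-flight requests == lower load
                bestLoad = load;
                best = node;
            }
        }
        return best; // null if there are no candidates at all
    }
}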
Now look at the conditions checked in the code:
- if a request can be sent to the node, send it right away;
- if a connection to some node is still being established, just return and wait;
- if there is no connection to this node yet, initiate a connection to the broker.
As a result:
- the first time the Sender thread calls poll(), it initiates the connection to the node;
- the second time the Sender thread calls poll(), it sends the Metadata request;
- the third time the Sender thread calls poll(), it receives the MetadataResponse and updates the metadata.
Only after these three poll() calls by the Sender thread does the requested metadata actually get updated; at that point the producer thread stops blocking and starts sending messages.
After the NetworkClient receives the MetadataResponse, it first calls handleCompletedReceives(), shown below:
private void handleCompletedReceives(List<ClientResponse> responses, long now) {
for (NetworkReceive receive : this.selector.completedReceives()) {//iterate over completed receives
String source = receive.source();
//fetch the corresponding in-flight request from the queue
InFlightRequest req = inFlightRequests.completeNext(source);
//parse the response payload
Struct responseStruct = parseStructMaybeUpdateThrottleTimeMetrics(receive.payload(), req.header,
throttleTimeSensor, now);
if (log.isTraceEnabled()) {
log.trace("Completed receive from node {} for {} with correlation id {}, received {}", req.destination,
req.header.apiKey(), req.header.correlationId(), responseStruct);
}
// If the received response includes a throttle delay, throttle the connection.
AbstractResponse body = AbstractResponse.parseResponse(req.header.apiKey(), responseStruct);
maybeThrottle(body, req.header.apiVersion(), req.destination, now);
//is this a MetadataResponse?
if (req.isInternalRequest && body instanceof MetadataResponse)
metadataUpdater.handleCompletedMetadataResponse(req.header, now, (MetadataResponse) body);
else if (req.isInternalRequest && body instanceof ApiVersionsResponse)//ApiVersionsResponse
handleApiVersionsResponse(responses, req, now, (ApiVersionsResponse) body);
else //other responses
responses.add(req.completed(body, now));
}
}
@Override
// handle the server's response to the Metadata request
public void handleCompletedMetadataResponse(RequestHeader requestHeader, long now, MetadataResponse response) {
this.metadataFetchInProgress = false;//clear the in-progress flag
//build the Cluster object from the response
Cluster cluster = response.cluster();
// If any partition has leader with missing listeners, log a few for diagnosing broker configuration
// issues. This could be a transient issue if listeners were added dynamically to brokers.
List<TopicPartition> missingListenerPartitions = response.topicMetadata().stream().flatMap(topicMetadata ->
topicMetadata.partitionMetadata().stream()
.filter(partitionMetadata -> partitionMetadata.error() == Errors.LISTENER_NOT_FOUND)
.map(partitionMetadata -> new TopicPartition(topicMetadata.topic(), partitionMetadata.partition())))
.collect(Collectors.toList());
if (!missingListenerPartitions.isEmpty()) {
int count = missingListenerPartitions.size();
log.warn("{} partitions have leader brokers without a matching listener, including {}",
count, missingListenerPartitions.subList(0, Math.min(10, count)));
}
// check if any topics metadata failed to get updated
Map<String, Errors> errors = response.errors();
if (!errors.isEmpty())
log.warn("Error while fetching metadata with correlation id {} : {}", requestHeader.correlationId(), errors);
// don't update the cluster if there are no valid nodes...the topic we want may still be in the process of being
// created which means we will get errors and no nodes until it exists
if (cluster.nodes().size() > 0) {//update the metadata
this.metadata.update(cluster, response.unavailableTopics(), now);
} else {//the metadata update failed; only lastRefreshMs is refreshed
log.trace("Ignoring empty metadata response with correlation id {}.", requestHeader.correlationId());
this.metadata.failedUpdate(now, null);
}
}
This is where the metadata actually gets updated:
/**
* Updates the cluster metadata. If topic expiry is enabled, expiry time
* is set for topics if required and expired topics are removed from the metadata.
*
* @param newCluster the cluster containing metadata for topics with valid metadata
* @param unavailableTopics topics which are non-existent or have one or more partitions whose
* leader is not known
* @param now current time in milliseconds
*/
public synchronized void update(Cluster newCluster, Set<String> unavailableTopics, long now) {
Objects.requireNonNull(newCluster, "cluster should not be null");
if (isClosed())
throw new IllegalStateException("Update requested after metadata close");
this.needUpdate = false;
this.lastRefreshMs = now;
this.lastSuccessfulRefreshMs = now;
this.version += 1;
if (topicExpiryEnabled) {
// Handle expiry of topics from the metadata refresh set.
for (Iterator<Map.Entry<String, Long>> it = topics.entrySet().iterator(); it.hasNext(); ) {
Map.Entry<String, Long> entry = it.next();
long expireMs = entry.getValue();
if (expireMs == TOPIC_EXPIRY_NEEDS_UPDATE)
entry.setValue(now + TOPIC_EXPIRY_MS);
else if (expireMs <= now) {
it.remove();
log.debug("Removing unused topic {} from the metadata list, expiryMs {} now {}", entry.getKey(), expireMs, now);
}
}
}
for (Listener listener: listeners)//if anyone is listening for metadata updates, notify them
listener.onMetadataUpdate(newCluster, unavailableTopics);
String previousClusterId = cluster.clusterResource().clusterId();
if (this.needMetadataForAllTopics) {
// the listener may change the interested topics, which could cause another metadata refresh.
// If we have already fetched all topics, however, another fetch should be unnecessary.
this.needUpdate = false;
this.cluster = getClusterForCurrentTopics(newCluster);
} else {
this.cluster = newCluster;
}
// The bootstrap cluster is guaranteed not to have any useful information
if (!newCluster.isBootstrapConfigured()) {
String newClusterId = newCluster.clusterResource().clusterId();
if (newClusterId == null ? previousClusterId != null : !newClusterId.equals(previousClusterId))
log.info("Cluster ID: {}", newClusterId);
clusterResourceListeners.onUpdate(newCluster.clusterResource());
}
notifyAll(); //wake up all blocked producer threads
log.debug("Updated cluster metadata version {} to {}", this.version, this.cluster);
}
As described earlier, the producer thread wait()s for the metadata update; once update() completes here, notifyAll() wakes it up.
4 Metadata update strategy
Metadata is updated in the following two situations:
- a forced update on the KafkaProducer's first send, followed by periodic updates, driven by the lastRefreshMs and lastSuccessfulRefreshMs fields of Metadata;
- a forced update on invalidation: Metadata.requestUpdate() sets needUpdate to true to force an update.
Every time NetworkClient.poll() runs, both conditions are checked, and the update is triggered as soon as either one is met. A sketch of the "time until the next update" calculation is shown below.
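Based on the fields described earlier, the following sketch shows roughly how the time until the next update can be computed; it mirrors the idea behind Metadata.timeToNextUpdate() but is only an illustration, not the exact source.
// Sketch: a forced update is due immediately, a periodic update after metadataExpireMs,
// and both are throttled by refreshBackoffMs since the last attempt.
public final class NextUpdateSketch {
    static long timeToNextUpdate(long nowMs,
                                 boolean needUpdate,
                                 long lastSuccessfulRefreshMs,
                                 long lastRefreshMs,
                                 long metadataExpireMs,
                                 long refreshBackoffMs) {
        // time until the periodic expiry kicks in (0 if a forced update was requested)
        long timeToExpire = needUpdate
                ? 0
                : Math.max(lastSuccessfulRefreshMs + metadataExpireMs - nowMs, 0);
        // never refresh more often than the backoff allows
        long timeToBackoff = Math.max(lastRefreshMs + refreshBackoffMs - nowMs, 0);
        return Math.max(timeToExpire, timeToBackoff);
    }
}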
So when is the Metadata considered invalid? Quite a few places in the code mark it as stale; in general, a forced update is requested whenever something abnormal happens:
- when initiateConnect() is called to set up a connection;
- when handleDisconnections() in poll() handles a dropped connection, which triggers a forced update;
- when handleTimedOutRequests() in poll() handles timed-out requests;
- when sending a message and the leader of the target partition cannot be found;
- when handling a produce response (handleProduceResponse) that carries a metadata-related error, for example no metadata for the topic-partition or the client lacking permission to fetch its metadata.
References:
《Apache Kafka源码剖析》, Chapter 2
https://www.jianshu.com/p/bb7c332eac25