FlinkKafkaProducer extiende TwoPhaseCommitSinkFunction implementa CheckpointFunction, CheckpointListener
TwoPhaseCommitSinkFunction implementa initializeState en
CheckpointFunction y notifyCheckpointComplete en snapshotState CheckpointListener
1. Iniciar la transacción: initializeState llama a FlinkKafkaProducer.beginTransaction () para iniciar la transacción e inicializar KafkaProducer, kafkaProducer inicializará acumularRecord y el hilo del remitente
2. Invoque el método, llame a kafkaProducer.send para escribir los datos procesados en kafka
3. Pre-commit transacción kafka: cada Cuando se activa el segundo punto de control, llame a snapshotState, llame a FlinkKafkaProducer.preCommit y luego llame al método kafkaProduer.flush, los datos restantes en el RecordAccumulator que no se han escrito en el corredor kafka se escribirán usando sender.
Nota: El mensaje se envió a kafka en los pasos 2, 3, porque beginTransaction ha iniciado el hilo del remitente y los datos en acumularRecord se enviarán a kafka. Si el nivel de aislamiento de kafkaconsumer es read_uncommitted (predeterminado), se puede leer y escritos Los datos conducen a una lectura sucia. Configúrelo en read_committed para leer los datos enviados, pero habrá una demora. El tiempo de demora es el intervalo del punto de control.
4. Envíe la transacción kafka: cuando se complete el punto de control, llame a notifyCheckpointComplete para llame a kafkaProducer.commitTransaction para enviar la transacción de kafka
TwoPhaseCommitSinkFunction
1, inicialización del punto de control initializeState
@Override
public void initializeState(FunctionInitializationContext context) throws Exception {
// when we are restoring state with pendingCommitTransactions, we don't really know whether the
// transactions were already committed, or whether there was a failure between
// completing the checkpoint on the master, and notifying the writer here.
// (the common case is actually that is was already committed, the window
// between the commit on the master and the notification here is very small)
// it is possible to not have any transactions at all if there was a failure before
// the first completed checkpoint, or in case of a scale-out event, where some of the
// new task do not have and transactions assigned to check)
// we can have more than one transaction to check in case of a scale-in event, or
// for the reasons discussed in the 'notifyCheckpointComplete()' method.
state = context.getOperatorStateStore().getListState(stateDescriptor);
boolean recoveredUserContext = false;
// 遇到故障重启时
if (context.isRestored()) {
LOG.info("{} - restoring state", name());
for (State<TXN, CONTEXT> operatorState : state.get()) {
userContext = operatorState.getContext();
List<TransactionHolder<TXN>> recoveredTransactions = operatorState.getPendingCommitTransactions();
List<TXN> handledTransactions = new ArrayList<>(recoveredTransactions.size() + 1);
for (TransactionHolder<TXN> recoveredTransaction : recoveredTransactions) {
// If this fails to succeed eventually, there is actually data loss
recoverAndCommitInternal(recoveredTransaction);
handledTransactions.add(recoveredTransaction.handle);
LOG.info("{} committed recovered transaction {}", name(), recoveredTransaction);
}
{
TXN transaction = operatorState.getPendingTransaction().handle;
recoverAndAbort(transaction);
handledTransactions.add(transaction);
LOG.info("{} aborted recovered transaction {}", name(), operatorState.getPendingTransaction());
}
if (userContext.isPresent()) {
finishRecoveringContext(handledTransactions);
recoveredUserContext = true;
}
}
}
// if in restore we didn't get any userContext or we are initializing from scratch
if (!recoveredUserContext) {
LOG.info("{} - no state to restore", name());
userContext = initializeUserContext();
}
this.pendingCommitTransactions.clear();
currentTransactionHolder = beginTransactionInternal();
LOG.debug("{} - started new transaction '{}'", name(), currentTransactionHolder);
}
beginTransactionInternal->
private TransactionHolder<TXN> beginTransactionInternal() throws Exception {
return new TransactionHolder<>(beginTransaction(), clock.millis());
}
// 开启事务
FlinkKafkaProducer.beginTransaction() 见下文
2. SnapshotState ejecuta una transacción kafka previamente comprometida en cada punto de control.
public void snapshotState(FunctionSnapshotContext context) throws Exception {
// this is like the pre-commit of a 2-phase-commit transaction
// we are ready to commit and remember the transaction
checkState(currentTransactionHolder != null, "bug: no transaction object when performing state snapshot");
long checkpointId = context.getCheckpointId();
LOG.debug("{} - checkpoint {} triggered, flushing transaction '{}'", name(), context.getCheckpointId(), currentTransactionHolder);
// 执行预提交
preCommit(currentTransactionHolder.handle);
pendingCommitTransactions.put(checkpointId, currentTransactionHolder);
LOG.debug("{} - stored pending transactions {}", name(), pendingCommitTransactions);
currentTransactionHolder = beginTransactionInternal();
LOG.debug("{} - started new transaction '{}'", name(), currentTransactionHolder);
state.clear();
state.add(new State<>(
this.currentTransactionHolder,
new ArrayList<>(pendingCommitTransactions.values()),
userContext));
}
preCommit -> FlinkKafkaProducer.preCommit
3. Envíe la transacción de kafka cuando se complete el punto de control notifyCheckpointComplete
public final void notifyCheckpointComplete(long checkpointId) throws Exception {
// the following scenarios are possible here
//
// (1) there is exactly one transaction from the latest checkpoint that
// was triggered and completed. That should be the common case.
// Simply commit that transaction in that case.
//
// (2) there are multiple pending transactions because one previous
// checkpoint was skipped. That is a rare case, but can happen
// for example when:
//
// - the master cannot persist the metadata of the last
// checkpoint (temporary outage in the storage system) but
// could persist a successive checkpoint (the one notified here)
//
// - other tasks could not persist their status during
// the previous checkpoint, but did not trigger a failure because they
// could hold onto their state and could successfully persist it in
// a successive checkpoint (the one notified here)
//
// In both cases, the prior checkpoint never reach a committed state, but
// this checkpoint is always expected to subsume the prior one and cover all
// changes since the last successful one. As a consequence, we need to commit
// all pending transactions.
//
// (3) Multiple transactions are pending, but the checkpoint complete notification
// relates not to the latest. That is possible, because notification messages
// can be delayed (in an extreme case till arrive after a succeeding checkpoint
// was triggered) and because there can be concurrent overlapping checkpoints
// (a new one is started before the previous fully finished).
//
// ==> There should never be a case where we have no pending transaction here
//
Iterator<Map.Entry<Long, TransactionHolder<TXN>>> pendingTransactionIterator = pendingCommitTransactions.entrySet().iterator();
Throwable firstError = null;
while (pendingTransactionIterator.hasNext()) {
Map.Entry<Long, TransactionHolder<TXN>> entry = pendingTransactionIterator.next();
Long pendingTransactionCheckpointId = entry.getKey();
TransactionHolder<TXN> pendingTransaction = entry.getValue();
if (pendingTransactionCheckpointId > checkpointId) {
continue;
}
LOG.info("{} - checkpoint {} complete, committing transaction {} from checkpoint {}",
name(), checkpointId, pendingTransaction, pendingTransactionCheckpointId);
logWarningIfTimeoutAlmostReached(pendingTransaction);
try {
// 提交kafka事务
commit(pendingTransaction.handle);
} catch (Throwable t) {
if (firstError == null) {
firstError = t;
}
}
LOG.debug("{} - committed checkpoint transaction {}", name(), pendingTransaction);
pendingTransactionIterator.remove();
}
if (firstError != null) {
throw new FlinkRuntimeException("Committing one of transactions failed, logging first encountered failure",
firstError);
}
}
commit ->FlinkKafkaProducer.commit
Productor FlinkKafka
1 、 beginTransaction
protected FlinkKafkaProducer.KafkaTransactionState beginTransaction() throws FlinkKafkaException {
switch (semantic) {
case EXACTLY_ONCE:
// 创建KafkaProducer
FlinkKafkaInternalProducer<byte[], byte[]> producer = createTransactionalProducer();
producer.beginTransaction();
return new FlinkKafkaProducer.KafkaTransactionState(producer.getTransactionalId(), producer);
case AT_LEAST_ONCE:
case NONE:
// Do not create new producer on each beginTransaction() if it is not necessary
final FlinkKafkaProducer.KafkaTransactionState currentTransaction = currentTransaction();
if (currentTransaction != null && currentTransaction.producer != null) {
return new FlinkKafkaProducer.KafkaTransactionState(currentTransaction.producer);
}
return new FlinkKafkaProducer.KafkaTransactionState(initNonTransactionalProducer(true));
default:
throw new UnsupportedOperationException("Not implemented semantic");
}
}
--> createTransactionalProducer -> initTransactionalProducer -> initProducer -> createProducer -> new FlinkKafkaInternalProducer
-> new KafkaProducer -> KafkaProducer
--> KafkaProducer // 以下为kafka源码
// 创建RecordAccumulator
this.accumulator = new RecordAccumulator
// 创建sender对象并启动
this.sender = newSender(logContext, kafkaClient, this.metadata);
String ioThreadName = NETWORK_THREAD_PREFIX + " | " + clientId;
this.ioThread = new KafkaThread(ioThreadName, this.sender, true);
this.ioThread.start();
2 、 invocar
public void invoke(FlinkKafkaProducer.KafkaTransactionState transaction, IN next, Context context) throws FlinkKafkaException {
// 此步就已经将消息发送到kafka,因为beginTransaction 以启动sender线程, 如果kafkaconsumer的isolation.level为
// read_uncommitted(默认),就能读到写入的数据导致脏读, 将其设置为read_committed 才能读到提交的数据,但会有延时,
// 延时时间为checkpoint间隔时间
transaction.producer.send(record, callback);
}
3 、 compromiso previo
protected void preCommit(FlinkKafkaProducer.KafkaTransactionState transaction) throws FlinkKafkaException {
switch (semantic) {
case EXACTLY_ONCE:
case AT_LEAST_ONCE:
// 刷新数据,将RecordAccumulator 中未写入完kafka broker中的数据使用sender写入完
flush(transaction);
break;
case NONE:
break;
default:
throw new UnsupportedOperationException("Not implemented semantic");
}
checkErroneous();
}
--> flush -> transaction.producer.flush() -> kafkaProducer.flush()->
/**
* Invoking this method makes all buffered records immediately available to send (even if <code>linger.ms</code> is
* greater than 0) and blocks on the completion of the requests associated with these records.
*/
public void flush() {
log.trace("Flushing accumulated records in producer.");
this.accumulator.beginFlush();
this.sender.wakeup();
try {
this.accumulator.awaitFlushCompletion();
} catch (InterruptedException e) {
throw new InterruptException("Flush interrupted.", e);
}
}
4 、 comprometerse
protected void commit(FlinkKafkaProducer.KafkaTransactionState transaction) {
if (transaction.isTransactional()) {
try {
transaction.producer.commitTransaction();
} finally {
recycleTransactionalProducer(transaction.producer);
}
}
}
commitTransaction-> kafkaProducer.commitTransaction()