How to ensure Kafka transaction synchronization between separate producer and consumer applications, with regard to failure points?

Lazaro R. :

I am still a bit new to Spring Kafka and Kafka in general, and my question is rather brief. I have a consumer-only application that reads from Kafka continually, processes the messages, and acknowledges them manually using an acknowledging listener. I depend on an upstream producer-only application that is in charge of sending messages to the Kafka topics I consume.

We have recently implemented transactions across the producer and the consumer, but I wanted to understand more about the failure points and how to handle transactions that are rolled back so they are not lost. I have read that, when using transactions, it is best to configure an AfterRollbackProcessor on the Kafka container factory instead of a SeekToCurrentErrorHandler, along with stateful retry set to true.

The reason I am using transactions is to achieve the exactly-once Kafka semantics in their newer release, because we deal with a lot of database persistence and cannot afford duplicate transactions due to DB constraints. I was also wondering whether my @KafkaListener has to be annotated with @Transactional; I had read a post stating that this should not be the case, but other posts suggest it might be, which is why I am unsure.

I have seen many questions about a combined producer-and-consumer application, but I have not seen one about separate applications with those separate roles (even if it might be the same thing at the end of the day). In a nutshell, I just wanted to know what the best practices are when incorporating transactions with Kafka, and how to handle failures in that case.

Gary Russell :

Kafka transactions are an unnecessary overhead for consumer-only applications. Transactions are only useful when producing records.

"The reason I am using transactions is to achieve the exactly-once Kafka semantics in their newer release, because we deal with a lot of database persistence and cannot afford duplicate transactions due to DB constraints."

There is no guarantee of "exactly once" when other technologies are involved. Exactly-once semantics only apply to

read->process->write

scenarios where read and write are Kafka. This is a common misunderstanding.

Also, even with Kafka-only read/process/write, the "exactly once" semantics apply to the whole shebang only; i.e., the offset of the read is only committed if the write is successful.

The process step gets at-least-once semantics, so you need de-duplication logic whenever you write elsewhere in the process step, regardless of whether there is a Kafka write step and, if there is one, whether you are using transactions to get exactly-once there.

For cases where you are reading from Kafka and writing to a DB, with no write to Kafka, @Transactional on the listener is the right approach (with de-dup logic to avoid duplicates).
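
For illustration, a minimal sketch of that consumer-only case, assuming a Spring Data style repository (OrderRepository, OrderEntity, and the topic name are hypothetical):

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;
import org.springframework.transaction.annotation.Transactional;

@Component
public class OrderPersistingListener {

    private final OrderRepository repository; // hypothetical Spring Data repository

    public OrderPersistingListener(OrderRepository repository) {
        this.repository = repository;
    }

    @Transactional // DB transaction only; it rolls back if the listener throws
    @KafkaListener(topics = "orders", groupId = "order-persister")
    public void listen(ConsumerRecord<String, String> record) {
        // de-dup: skip a record that was already persisted (e.g. after a redelivery)
        if (repository.existsByTopicAndPartitionAndOffset(
                record.topic(), record.partition(), record.offset())) {
            return;
        }
        repository.save(OrderEntity.from(record)); // hypothetical mapping helper
    }
}
```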

For cases where you want exactly once Kafka semantics (read/process/write) but also write to a DB in the process step, you can use a ChainedKafkaTransactionManager in the listener container so the DB transaction is synchronized with the Kafka transaction (but there is still a small window for cases where the DB commit succeeds, but the Kafka transaction fails). So you still need de-dup logic, even then. In that case, you don't want @Transactional on the listener.
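
A rough sketch of that wiring, assuming Kafka and JPA transaction manager beans already exist (bean names and generics are illustrative; the chained manager starts the Kafka transaction first and commits it last, which is where the small window comes from):

```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.transaction.ChainedKafkaTransactionManager;
import org.springframework.kafka.transaction.KafkaTransactionManager;
import org.springframework.orm.jpa.JpaTransactionManager;

@Configuration
public class ContainerConfig {

    @Bean
    public ChainedKafkaTransactionManager<Object, Object> chainedTm(
            KafkaTransactionManager<Object, Object> kafkaTm, JpaTransactionManager jpaTm) {
        // started in the order given, committed in reverse: DB commits first, Kafka last
        return new ChainedKafkaTransactionManager<>(kafkaTm, jpaTm);
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<Object, Object> kafkaListenerContainerFactory(
            ConsumerFactory<Object, Object> consumerFactory,
            ChainedKafkaTransactionManager<Object, Object> chainedTm) {
        ConcurrentKafkaListenerContainerFactory<Object, Object> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory);
        // the container runs each listener call inside the chained transaction
        factory.getContainerProperties().setTransactionManager(chainedTm);
        return factory;
    }
}
```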

EDIT

Producer-only is a bit different; let's say you want to publish 10 records in a transaction: you want them all to be in (committed) or all out (rolled back). You must use transactions then.
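
For example, a sketch of that producer-only case using KafkaTemplate.executeInTransaction (the topic name and types are assumptions, and the ProducerFactory must be configured with a transactionIdPrefix for this to work):

```java
import java.util.List;
import org.springframework.kafka.core.KafkaTemplate;

public class BatchPublisher {

    private final KafkaTemplate<String, String> template;

    public BatchPublisher(KafkaTemplate<String, String> template) {
        this.template = template;
    }

    public void publishBatch(List<String> payloads) {
        // everything sent in the callback is committed together, or rolled back together
        template.executeInTransaction(ops -> {
            for (String payload : payloads) {
                ops.send("orders", payload);
            }
            return null; // an exception thrown here aborts the whole batch
        });
    }
}
```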

Consumers of records produced in transactions should have isolation.level=read_committed so they don't see uncommitted writes (default is read_uncommitted).
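
A minimal consumer-factory sketch showing that property (broker address and group id are placeholders):

```java
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;

@Configuration
public class ConsumerConfigExample {

    @Bean
    public ConsumerFactory<String, String> consumerFactory() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "order-persister");         // placeholder
        // only deliver records from committed transactions
        props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");
        return new DefaultKafkaConsumerFactory<>(
                props, new StringDeserializer(), new StringDeserializer());
    }
}
```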

If you are only publishing single records at a time and no other transactional resource is involved, there is little point in using transactions.

If, though, you are reading from a DB or JMS, etc., and writing to Kafka, you would probably want to synchronize the DB and Kafka transactions; but, again, the probability of duplicates is still non-zero, and how you deal with that depends on the order in which you commit the transactions.

Typically, de-duplication is application-dependent; often some key in the application data is used so, for example, a SQL INSERT statement is made conditional on that key not already existing in the DB.

Kafka provides a convenient unique key for each record, with the combination of topic/partition/offset. You could store those in the DB along with the data to prevent duplicates.
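
One way to sketch that, using a conditional INSERT via JdbcTemplate (the processed_orders table and its columns are hypothetical, and a unique constraint on topic/partition/offset is assumed):

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.jdbc.core.JdbcTemplate;

public class DeDupWriter {

    private final JdbcTemplate jdbc;

    public DeDupWriter(JdbcTemplate jdbc) {
        this.jdbc = jdbc;
    }

    // Inserts only if this topic/partition/offset has not been seen before.
    public boolean insertIfAbsent(ConsumerRecord<String, String> record) {
        int rows = jdbc.update(
                "INSERT INTO processed_orders (topic, kafka_partition, kafka_offset, payload) "
                        + "SELECT ?, ?, ?, ? WHERE NOT EXISTS ("
                        + "  SELECT 1 FROM processed_orders"
                        + "  WHERE topic = ? AND kafka_partition = ? AND kafka_offset = ?)",
                record.topic(), record.partition(), record.offset(), record.value(),
                record.topic(), record.partition(), record.offset());
        return rows == 1; // false means the record was already processed
    }
}
```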

EDIT2

SeekToCurrentErrorHandler (STCEH) is usually used when NOT using transactions; when the listener throws an exception, the error handler resets the offsets so the record is refetched on the next poll. After some number of attempts, we give up and call a "recoverer", such as the DeadLetterPublishingRecoverer, which writes the failed record to another topic.
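
A sketch of that setup (the retry count and delay are arbitrary, and it assumes a KafkaTemplate bean for the dead-letter publisher):

```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.listener.DeadLetterPublishingRecoverer;
import org.springframework.kafka.listener.SeekToCurrentErrorHandler;
import org.springframework.util.backoff.FixedBackOff;

@Configuration
public class ErrorHandlingConfig {

    @Bean
    public SeekToCurrentErrorHandler errorHandler(KafkaTemplate<Object, Object> template) {
        // initial attempt plus 2 retries, then publish the failed record to <topic>.DLT
        return new SeekToCurrentErrorHandler(
                new DeadLetterPublishingRecoverer(template), new FixedBackOff(1000L, 2L));
    }
}
```

The handler is then set on the container factory with factory.setErrorHandler(...).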

It can still be used with transactions, however.

The error handler is called within the scope of the transaction (before the rollback) so, if it throws an exception (which it does unless the recoverer "consumes" the failure), the transaction will still roll back. If the recovery is successful, the transaction will commit.

The AfterRollbackProcessor (ARP) was added before recovery capability was added to the STCEH. It essentially does exactly the same as the STCEH, but it runs outside of the scope of the transaction (after the rollback).

Configuring both won't hurt anything because there will be nothing for the ARP to do if the STCEH has already performed the seeks.

I still prefer to use the ARP with transactions and STCEH without - if only to get appropriate log categories for log messages. There may be other reasons that I can't think of right now.

Note that, now that retry and back-off are supported in both the STCEH and the ARP, there is no need to configure listener-level stateful retry at all. Stateless retry might still be useful if you want in-memory retries that avoid round trips to the broker to re-fetch the same record(s).
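
So, with transactions, a sketch of the ARP configuration might look like this (again the back-off values are arbitrary, and the KafkaTemplate used for dead-lettering should itself be transactional):

```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.listener.DeadLetterPublishingRecoverer;
import org.springframework.kafka.listener.DefaultAfterRollbackProcessor;
import org.springframework.util.backoff.FixedBackOff;

@Configuration
public class RollbackConfig {

    @Bean
    public DefaultAfterRollbackProcessor<Object, Object> afterRollbackProcessor(
            KafkaTemplate<Object, Object> template) {
        // runs after the transaction rolls back; re-seeks, retries with back-off,
        // then dead-letters the record once the attempts are exhausted
        return new DefaultAfterRollbackProcessor<>(
                new DeadLetterPublishingRecoverer(template), new FixedBackOff(1000L, 2L));
    }
}
```

It is then set on the container factory with factory.setAfterRollbackProcessor(...).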
