Handling common problems with Kafka

1. How to prevent message loss

At the producer level, Kafka provides the acks confirmation mechanism.

Setting acks to -1 (equivalent to all) means the leader sends the ack only after all in-sync replicas have synchronized the message. As long as at least one of the leader and its replicas survives, the message is not lost.
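As a minimal sketch, these are the settings the text describes, written as a config dict in the style of the kafka-python client (the client library and broker address are assumptions, not from the article):

```python
# Producer settings for not losing messages: wait for all in-sync replicas
# and retry transient send failures. Values are illustrative.
def reliable_producer_config(bootstrap_servers):
    return {
        "bootstrap_servers": bootstrap_servers,
        "acks": "all",   # same as -1: leader acks only after all ISRs have the message
        "retries": 3,    # retry on transient failures such as network jitter
    }

config = reliable_producer_config("localhost:9092")
```

Note that retries are exactly what can cause the duplicate-message problem discussed in the next section.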

At the consumer level:

Change automatic offset commits to manual commits, so that an offset is committed only after its message has actually been processed.
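A sketch of the corresponding consumer settings, again in the style of kafka-python (an assumed client; the group ID is illustrative). In the poll loop, the application would call the consumer's commit method only after processing succeeds:

```python
# Consumer settings for manual offset commits: auto-commit is disabled,
# so an unprocessed message is redelivered rather than silently lost.
def manual_commit_consumer_config(bootstrap_servers, group_id):
    return {
        "bootstrap_servers": bootstrap_servers,
        "group_id": group_id,
        "enable_auto_commit": False,  # commit manually, after processing
    }

cfg = manual_commit_consumer_config("localhost:9092", "order-group")
```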

2. How to prevent repeated consumption

In the message-loss solution above, the producer may fail to receive an ack because of network jitter even though the broker actually received the message. The producer then retries, so the broker ends up storing the same message more than once, and the consumer consumes it repeatedly.

How to deal with it:

  • The producer disables retries: this reintroduces message loss (not recommended).
  • The consumer makes consumption idempotent.

Idempotence means that performing the same operation multiple times yields the same result. For REST requests: GET (idempotent), POST (non-idempotent), PUT (idempotent), DELETE (idempotent).

Solutions:

1. Create a composite (joint) primary key in the database, so the same key cannot create multiple records

Suppose we have an e-commerce platform whose order system needs to process user orders. In this business scenario, we can use a composite primary key to avoid repeated consumption.

Assume that the order data in the order system is stored in a database table. The table structure contains the following fields: order ID, user ID, product ID, order status, etc.

The order system sends order data to other systems for processing through message queues, such as inventory systems and logistics systems. When the order system sends an order message to the inventory system, the message may fail to be sent due to network jitter or other reasons. At this time, the order system will retry.

However, due to certain reasons (such as network delay, retry mechanism design, etc.), the retry process may cause the same order message to be repeatedly sent to the inventory system. Without a way to prevent double consumption, the inventory system may process the same order multiple times, leading to inventory errors or other problems.

To solve this problem, we can create a composite primary key on the order table, consisting of the order ID, user ID and product ID. Then, when the order system receives a new order, it first checks whether a record with the same composite primary key already exists in the database.

If there are duplicate records, the order system can determine that the order message has already been processed and choose to skip the processing of the duplicate message. If there are no duplicate records, the order data is inserted into the database and a message is sent to the inventory system for processing.

By using a composite primary key, we ensure that duplicate consumption is prevented in the order system. Even when the order system retries, the inventory system processes only the first copy of the order message it receives, avoiding the problems caused by repeated consumption.
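The dedup logic above can be sketched with an in-memory SQLite table standing in for the real database; the table and column names (`inventory_orders`, `order_id`, etc.) are illustrative, not taken from any actual system:

```python
import sqlite3

# In-memory table with a composite primary key on (order_id, user_id, product_id).
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE inventory_orders (
        order_id   TEXT,
        user_id    TEXT,
        product_id TEXT,
        status     TEXT,
        PRIMARY KEY (order_id, user_id, product_id)  -- composite primary key
    )
""")

def handle_order_message(order_id, user_id, product_id):
    """Insert the order; a duplicate message violates the primary-key
    constraint and is skipped instead of being processed twice."""
    try:
        db.execute(
            "INSERT INTO inventory_orders VALUES (?, ?, ?, 'NEW')",
            (order_id, user_id, product_id),
        )
        db.commit()
        return "processed"
    except sqlite3.IntegrityError:
        return "duplicate-skipped"

first = handle_order_message("o1", "u1", "p1")
second = handle_order_message("o1", "u1", "p1")  # retried (duplicate) message
```

The key point is that the database, not the consumer's memory, enforces uniqueness, so the check survives consumer restarts.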

2. Use a distributed lock, with the business ID as the lock key, to ensure that only one record can be created successfully

Suppose we have an online event registration system through which users can sign up for various activities. In this business scenario, we can use a distributed lock to ensure that the same user can successfully register for the same activity only once.

Assume that the registration records in the event registration system are stored in a database table. The table structure contains the following fields: registration ID, user ID, activity ID, registration status, etc.

When a user tries to register for an event, the system needs to do the following:

  1. Check whether the user has registered for the event.
  2. If the user has already registered for the event, the corresponding prompt will be returned to prevent the user from registering again.
  3. If the user has not registered for the event, the registration information will be inserted into the database and the registration process will be completed.

In this scenario, we use a distributed lock keyed on the business ID, here the user ID plus the activity ID. When a user tries to register for an activity, the system first tries to acquire that lock.

If the lock is acquired, no registration for this user and activity exists yet; the system performs the registration while holding the lock.

If the lock cannot be acquired, a registration for the same user and activity is already in progress or has completed, so the system returns a prompt to prevent the user from registering again.
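A runnable sketch of this pattern. In production the lock store would typically be something like Redis (SET with the NX option and an expiry); here a dict guarded by a mutex simulates it, and the key format is illustrative:

```python
import threading

# Simulated distributed lock store: in production this would be Redis
# (SET key value NX EX ttl) or similar; a dict stands in so the sketch runs.
_locks = {}
_mutex = threading.Lock()

def try_acquire(lock_key):
    """Atomically create the lock entry; fail if it already exists."""
    with _mutex:
        if lock_key in _locks:
            return False
        _locks[lock_key] = True
        return True

def register(user_id, activity_id):
    lock_key = f"signup:{user_id}:{activity_id}"  # business ID as the lock key
    if not try_acquire(lock_key):
        return "already-registered"
    # ... insert the registration record into the database here ...
    return "registered"

first = register("u1", "a1")
second = register("u1", "a1")  # duplicate attempt by the same user
```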

3. How to achieve sequential consumption of messages

  • Producer: to guarantee that messages arrive in order and are not lost, use synchronous sending and set acks to a value other than 0.
  • Consumer: the topic must have only one partition, and the consumer group must contain only one consumer.
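To see why synchronous sending preserves order, here is a toy sketch in which the "broker" is just a Python list: each send blocks until the previous message's ack (its offset) has returned, so messages land in the single partition in production order. A real producer would block per message, e.g. `send(...).get()` in kafka-python, against a one-partition topic:

```python
# Toy single-partition log: synchronous sends append one at a time,
# so log order matches send order.
partition_log = []

def sync_send(message):
    partition_log.append(message)    # broker appends the message...
    return len(partition_log) - 1    # ...and the "ack" carries its offset

offsets = [sync_send(m) for m in ["order-created", "order-paid", "order-shipped"]]
```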

Sequential consumption is not widely used with Kafka because it sacrifices performance; RocketMQ, by contrast, provides features designed specifically for this scenario.

4. How to solve the message backlog problem


4.1 The emergence of message backlog problem


The consumers' consumption speed is far lower than the producers' production speed, so a large amount of unconsumed data accumulates in Kafka. As the backlog grows, consumer seek performance gets worse and worse, the performance of the whole Kafka service deteriorates, access from other dependent services slows down, and the result can be a service avalanche.

4.2 Solution to message backlog


  • In the consumer, use multithreading to make full use of the machine's performance when consuming messages.
  • Improve the performance of business-level consumption through architecture design.
  • Create more consumers and deploy them on additional machines so they consume together, raising the overall consumption speed (within a consumer group, each partition is consumed by only one consumer, so this helps only up to the partition count).
  • Have one consumer poll messages and, without processing them, forward them to a newly created topic with many partitions, each partition consumed by its own consumer; those consumers then work through the backlog together (not commonly used).
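The first option above, multithreaded consumption, can be sketched with a thread pool: the poll loop hands each polled message to worker threads instead of processing it inline. Offsets would be committed only after a batch completes; that part is omitted here, and the processing function is a stand-in:

```python
from concurrent.futures import ThreadPoolExecutor

def process(msg):
    """Stand-in for real business handling of one message."""
    return msg.upper()

def drain(batch, workers=4):
    """Process a polled batch in parallel; map preserves input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process, batch))

results = drain(["m1", "m2", "m3"])
```

Note that parallel processing within a partition gives up ordering guarantees, which is acceptable only if the business does not depend on message order.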

5. How to implement the effect of a delay queue

5.1 Application scenarios

After the order is created, if there is no payment for more than 30 minutes, the order needs to be canceled. This scenario can be realized through the delay queue.

5.2 Specific plans

  1. Create the corresponding topic in Kafka.
  2. A consumer polls messages from this topic.
  3. For each message, the consumer checks whether more than 30 minutes have passed since the message's creation time (provided the order is still unpaid).
  4. If yes: update the order status to canceled in the database.
  5. If no: record the offset of the current message and stop consuming subsequent messages; after waiting 1 minute, seek back to that offset in Kafka, pull the message and its successors again, repeat the check, and loop.
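The per-message decision in steps 3–5 can be sketched as follows. Creation times and the clock are simulated values; a real consumer would record the offset, pause, and seek back before polling again:

```python
import datetime as dt

TIMEOUT = dt.timedelta(minutes=30)

def handle(order, now):
    """Return the action for one polled delay-queue message."""
    if order["paid"]:
        return "skip"               # already paid: nothing to cancel
    if now - order["created_at"] >= TIMEOUT:
        return "cancel"             # 30 minutes elapsed: cancel the order
    return "wait-and-retry"         # too early: pause, seek back, retry later

now = dt.datetime(2024, 1, 1, 12, 0)
old_order = {"paid": False, "created_at": now - dt.timedelta(minutes=31)}
new_order = {"paid": False, "created_at": now - dt.timedelta(minutes=5)}
paid_order = {"paid": True, "created_at": now - dt.timedelta(minutes=40)}
```

Because messages in a partition are in creation order, once one message is too young to cancel, all later ones are too, which is why the consumer can safely stop and wait.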
 

Origin blog.csdn.net/txh1873749380/article/details/134891883