1.1 Repeated consumption of messages
This is a common interview question. Whenever messages are consumed, you have to consider whether a message might be consumed more than once. Can duplicate consumption be avoided? If it cannot, will the system still behave correctly after a message is consumed twice? The question of repeated consumption is essentially asking how to guarantee idempotency when using a message queue.
1.2 Idempotency
Idempotency (idempotent, idempotence) is a concept from mathematics and computer science, commonly found in abstract algebra.
In programming, an idempotent operation is one where executing it any number of times has the same effect as executing it once. An idempotent function, or idempotent method, can be called repeatedly with the same parameters and produce the same result. Repeating the call does not change the system state beyond what the first call did, so there is no need to worry that repeated execution will cause further changes. For example, a "setTrue()" function is idempotent: no matter how many times it is executed, the result is the same. More complex operations can be made idempotent by using unique transaction numbers (serial numbers).
In simple terms, idempotency means that when the same data or the same request is processed multiple times, the corresponding data ends up the same as if it had been processed once, and no errors occur.
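To make the definition concrete, here is a minimal sketch in Python. The `Account` class and the names `set_true`, `deposit` and `deposit_once` are hypothetical, made up for illustration; they contrast an idempotent call, a non-idempotent call, and a call made idempotent with a unique transaction number.

```python
class Account:
    """Hypothetical example class, for illustration only."""

    def __init__(self):
        self.active = False
        self.balance = 0
        self.applied = set()  # transaction numbers that were already applied

    def set_true(self):
        # Idempotent: after the first call, further calls change nothing.
        self.active = True

    def deposit(self, amount):
        # Not idempotent: every call changes the balance again.
        self.balance += amount

    def deposit_once(self, txn_id, amount):
        # Idempotent per transaction number: a repeated txn_id is ignored.
        if txn_id in self.applied:
            return False
        self.applied.add(txn_id)
        self.balance += amount
        return True


account = Account()
for _ in range(3):  # simulate the same request arriving three times
    account.set_true()
    account.deposit(10)
    account.deposit_once("txn-1", 100)

print(account.active)   # True, same as after a single call
print(account.balance)  # 130: deposit ran three times (30), deposit_once only once (100)
```

The plain `deposit` is the kind of operation that causes trouble when a message is delivered twice; `deposit_once` shows how a serial number turns it into a safe one.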
1.3 Repeated consumption scenarios
First of all, message queues such as rabbitmq, rocketmq and kafka can all run into the problem of repeated consumption. This is because avoiding duplicates is usually not guaranteed by the mq itself, but has to be handled by the consumer.
Take kafka as an example to illustrate the problem of repeated consumption. kafka has a concept called offset: every message written to it is assigned an offset, a serial number that identifies it. After consuming data, the consumer periodically commits the offset of the messages it has consumed, which means "I have consumed up to here". If the consumer restarts, it continues consuming from the last committed offset.
But there are always exceptions. If the consumer has consumed some data and then crashes before committing the offset of those messages, it will receive the same messages again after restarting.
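The scenario above can be reproduced with a small simulation. This is only a sketch in plain Python, not the real kafka client API: the list `log` stands in for a partition, and `run_consumer`, `commit_every` and `crash_after` are hypothetical names chosen for this example.

```python
def run_consumer(log, committed, processed, commit_every=3, crash_after=None):
    """Consume from the committed offset and return the offset committed so far.

    Offsets are committed only every `commit_every` messages. If `crash_after`
    is set, the consumer "dies" after handling that many messages, without
    committing whatever it consumed since the last commit.
    """
    position = committed
    handled = 0
    while position < len(log):
        processed.append(log[position])  # the message is consumed here
        position += 1
        handled += 1
        if crash_after is not None and handled == crash_after:
            return committed  # crash: progress since the last commit is lost
        if handled % commit_every == 0:
            committed = position  # periodic offset commit
    return position  # clean shutdown commits the final position


log = ["m0", "m1", "m2", "m3", "m4", "m5"]
processed = []

# First run: commits offset 3 after m2, then crashes after m4.
committed = run_consumer(log, 0, processed, crash_after=5)
print(committed)  # 3

# Restart: consumption resumes from offset 3, so m3 and m4 arrive again.
committed = run_consumer(log, committed, processed)
print(processed)  # ['m0', 'm1', 'm2', 'm3', 'm4', 'm3', 'm4', 'm5']
```

m3 and m4 were consumed before the crash but their offset was never committed, so they are delivered a second time after the restart.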
1.4 Guarantee idempotency (repeated consumption)
How to make message handling idempotent depends on the type of business. Here are a few ideas for reference:
1. Maintain a set in memory. Whenever a message is obtained from the message queue, first check whether it is already in the set. If it is, the message has already been consumed and is discarded directly; if it is not, add it to the set after consuming it.
2. If the consumer writes to a database, query the database by the unique key first. If the record does not exist, write it; if it does, update it or discard the message directly.
3. If the consumer writes to redis, there is no problem: every write is a set, which is naturally idempotent.
4. When the producer sends a message, add a globally unique id to each message, and save that id in redis once the message has been consumed. When consuming, first check redis for the id; if it is already there, do not consume the message again.
5. Set a unique key on the database table to prevent duplicate rows from being inserted. A repeated insert then only reports an error and does not insert duplicate data.
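Some of the ideas above can be combined in one consumer. The following is a minimal sketch, not a production design: it uses Python's built-in sqlite3 module in place of a real database, a plain `set` in place of the in-memory set or the redis lookup, and the table name `orders`, the message fields `id` and `amount`, and the function `handle` are all assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The primary key on msg_id is the unique key that rejects duplicate inserts.
conn.execute("CREATE TABLE orders (msg_id TEXT PRIMARY KEY, amount INTEGER)")

seen_ids = set()  # stands in for the in-memory set or a redis lookup


def handle(message):
    """Apply a message at most once; return True if it was applied."""
    msg_id = message["id"]
    if msg_id in seen_ids:
        return False  # fast path: this id was already consumed
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO orders (msg_id, amount) VALUES (?, ?)",
                (msg_id, message["amount"]),
            )
    except sqlite3.IntegrityError:
        # The unique key caught a duplicate that the set did not know about.
        seen_ids.add(msg_id)
        return False
    seen_ids.add(msg_id)
    return True


messages = [
    {"id": "a", "amount": 10},
    {"id": "b", "amount": 20},
    {"id": "a", "amount": 10},  # redelivered duplicate
]
results = [handle(m) for m in messages]
print(results)  # [True, True, False]

seen_ids.clear()  # simulate a restart that wipes the in-memory set
after_restart = handle(messages[0])
print(after_restart)  # False: the unique key still rejects the duplicate

row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(row_count)  # 2
```

The in-memory set alone is lost on restart, which is why the sketch keeps the unique key as a second line of defence; a shared store such as redis would play the same role across restarts and across multiple consumers.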