How do you handle message delay and expiration in a message queue? What do you do when the queue is full? How do you resolve a backlog of millions of messages that has lasted for several hours?

When this question is asked, the essential scenario behind it is that your consumer side has a problem: it has stopped consuming, or it is consuming extremely slowly. Then things get ugly. Maybe the disks of your message queue cluster are nearly full and nothing is consuming, so what do you do? Or the whole queue has been backlogged for several hours, what then? Or the backlog has lasted so long that, for example, RabbitMQ's configured message expiration (TTL) kicked in and the messages are simply gone, what do you do?

 

This kind of incident is actually fairly common in production. It doesn't happen often, but when it does it's a big deal. A typical cause: the consumer writes to MySQL after each message, MySQL goes down, and the consumer just hangs there doing nothing. Or something else goes wrong on the consumer side and the consumption rate drops to a crawl.

Let's work through this one scenario at a time. Assume the consumer side has failed and a huge number of messages has piled up in MQ. Now it's an incident, and everyone is panicking.

 

(1) A large number of messages has been backlogged in MQ for several hours and still isn't resolved

 

Tens of millions of messages backlogged in MQ for seven or eight hours, from after 4 pm until late at night, 10 or 11 pm.

 

This is a scenario we actually ran into, a real production incident. One option is simply to fix the consumer's problem, let it recover its consumption rate, and then sit around for several hours waiting for it to catch up. That's obviously not what you want to say in an interview.

 

A single consumer can handle 1,000 messages per second, so three consumers handle 3,000 per second, which is 180,000 per minute, or more than 10 million per hour.

 

So if the backlog is in the millions to tens of millions of messages, even after the consumers are fixed it will still take roughly an hour to drain.

 

Generally, at this point, the only option is an emergency temporary scale-out. The concrete steps and approach are as follows:

 

1) First fix the consumer's problem so the consumption rate can recover, then stop all existing consumers.

2) Create a new topic with 10 times the original number of partitions, and temporarily set up 10 or 20 times the original number of queues.

3) Then write a temporary consumer program whose only job is to redistribute data: deploy it to consume the backlogged messages, skip any time-consuming processing, and simply round-robin each message into the 10x number of temporary queues (see the sketch after this list).

4) Next, temporarily requisition 10 times the usual number of machines to deploy consumers, with each batch of consumers draining one of the temporary queues.

5) This approach temporarily expands both queue resources and consumer resources by 10x, so the backlog is consumed at 10 times the normal speed.

6) Once the backlogged data has been quickly consumed, revert to the original deployment architecture and use the original consumer machines to consume messages again.
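A rough sketch of step 3's temporary "distributor", written here against Kafka's Java client for concreteness. The topic names (orders, orders_temp), the broker address, and the String serializers are all assumptions for illustration; the point is that the forwarder does no business processing at all and just fans the backlog out across the temporary topic's 10x partitions.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Temporary distributor: reads the backlogged topic and forwards each message
// untouched into a temporary topic that has 10x the partitions, so 10x as many
// consumers can drain it in parallel.
public class BacklogForwarder {
    public static void main(String[] args) {
        Properties cProps = new Properties();
        cProps.put("bootstrap.servers", "localhost:9092");
        cProps.put("group.id", "backlog-forwarder");
        cProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        cProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        Properties pProps = new Properties();
        pProps.put("bootstrap.servers", "localhost:9092");
        pProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        pProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(cProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(pProps)) {
            consumer.subscribe(Collections.singletonList("orders")); // backlogged topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> r : records) {
                    // No business logic here: a null key lets the producer spread
                    // records across the temporary topic's 10x partitions.
                    producer.send(new ProducerRecord<>("orders_temp", null, r.value()));
                }
                producer.flush();       // make sure the batch has been handed off
                consumer.commitSync();  // before committing offsets on the old topic
            }
        }
    }
}
```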

 

 

(2) Now assume we run into the second pit

 

Suppose you are using RabbitMQ. RabbitMQ lets you set an expiration time, the TTL: if a message sits backlogged in a queue longer than that, RabbitMQ cleans it up and the data is gone. That is the second pit. Here the data does not pile up in MQ at all; instead a large amount of data is simply lost.
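For reference, this is how such a TTL is typically set with the RabbitMQ Java client, via the queue's x-message-ttl argument. The queue name and the 60-second value here are just placeholders.

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import java.util.HashMap;
import java.util.Map;

public class TtlQueueDeclare {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        try (Connection conn = factory.newConnection();
             Channel channel = conn.createChannel()) {
            Map<String, Object> queueArgs = new HashMap<>();
            // Any message that sits in this queue longer than 60 seconds
            // is discarded (or dead-lettered, if a DLX is configured).
            queueArgs.put("x-message-ttl", 60000);
            channel.queueDeclare("orders", true, false, false, queueArgs);
        }
    }
}
```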

 

In this case the fix is not to add consumers to chew through a backlog, because there is essentially no backlog; a large number of messages was simply lost. The approach we can take is a batch re-import. We have handled a similar production scenario before: when the backlog was huge, we let the data be discarded, and then after the peak had passed, say after everyone had stayed up over coffee until midnight and the users were asleep,

 

we started writing a program: a temporary script that queries the lost batch of data bit by bit and pours it back into MQ, making up for the data that was lost during the day. That is really all you can do.

 

Suppose 10,000 orders were backlogged in MQ and not processed, and 1,000 of them were lost. You can only write a program to manually query those 1,000 orders and re-publish them to MQ to make them up.
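A minimal sketch of such a re-import script, assuming the lost orders can still be identified in MySQL; the JDBC URL, table and column names, status value, and queue name are all hypothetical, and the incident time window is passed in as arguments.

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import java.nio.charset.StandardCharsets;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// One-off re-import: query the orders that never got processed during the
// incident window and publish them back onto the queue so the normal
// consumers can pick them up again.
public class LostOrderReimport {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        try (Connection mq = factory.newConnection();
             Channel channel = mq.createChannel();
             java.sql.Connection db = DriverManager.getConnection(
                     "jdbc:mysql://localhost:3306/shop", "user", "password");
             PreparedStatement ps = db.prepareStatement(
                     "SELECT order_json FROM orders " +
                     "WHERE created_at BETWEEN ? AND ? AND status = 'UNPROCESSED'")) {
            ps.setString(1, args[0]); // incident window start, e.g. "... 16:00:00"
            ps.setString(2, args[1]); // incident window end
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    byte[] body = rs.getString("order_json").getBytes(StandardCharsets.UTF_8);
                    // Publish via the default exchange straight to the "orders" queue.
                    channel.basicPublish("", "orders", null, body);
                }
            }
        }
    }
}
```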

 

(3) Now let's assume the third pit

 

What if the messages have been piling up in MQ for so long that MQ is about to fill up completely? Is there any other way out? No; the first plan was simply executed too slowly. You write a temporary program that hooks in as a consumer and discards every message it receives, keeping none of them, just to drain the queue as fast as possible. Then fall back to the second plan and re-import the data at night.
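A throwaway sketch of that drain-and-discard consumer against the RabbitMQ Java client (the queue name is assumed): it auto-acks everything and does nothing with the body, so the broker can reclaim disk space as quickly as possible; the lost data is made up later with the re-import script from scenario (2).

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.DeliverCallback;

// Emergency drain: consume with autoAck=true and drop every message on the
// floor so the broker can free disk space as fast as possible.
public class EmergencyDrain {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        Connection conn = factory.newConnection();
        Channel channel = conn.createChannel();
        DeliverCallback discard = (consumerTag, delivery) -> {
            // Intentionally empty: the message is already acked (autoAck=true),
            // so it is gone. Re-import it later from the source of truth.
        };
        channel.basicConsume("orders", true /* autoAck */, discard, consumerTag -> { });
    }
}
```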
