Go RocketMQ: Considerations for ConsumeFromTimestamp

consumer

Using RocketMQ in Go

The official example initializes the consumer like this:

c, _ := rocketmq.NewPushConsumer(
   consumer.WithGroupName("testGroup"),
   consumer.WithNsResovler(primitive.NewPassthroughResolver([]string{"127.0.0.1:9876"})),
)

From the RocketMQ source code and some books, we know that the consumer can also be told where to start consuming.

Both the official source code and the Go client's source code define three modes:

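const (
   // The first time a new subscription group starts, it consumes from the
   // end of the queue. On later starts it continues from the last
   // consumption progress.
   ConsumeFromLastOffset ConsumeFromWhere = iota
   // The first time a new subscription group starts, it consumes from the
   // beginning of the queue. On later starts it continues from the last
   // consumption progress.
   ConsumeFromFirstOffset
   // The first time a new subscription group starts, it consumes from the
   // specified point in time. On later starts it continues from the last
   // consumption progress. For the time setting, see the
   // DefaultMQPushConsumer.consumeTimestamp parameter.
   ConsumeFromTimestamp
)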

To summarize briefly:

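  • ConsumeFromLastOffset: a new subscription group starts from the end of the queue, so it only sees messages produced after it first starts; later starts continue from the last consumption progress
  • ConsumeFromFirstOffset: a new subscription group starts from the beginning of the queue, so messages already handled under another group will be consumed again; later starts continue from the last consumption progress
  • ConsumeFromTimestamp: a new subscription group starts from a given timestamp; later starts continue from the last consumption progress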
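
The three modes share one rule: saved progress always wins, and the mode only decides where a brand-new group begins. The sketch below models that rule with plain Go and no RocketMQ dependency. The queueState type, its fields, and startOffset are hypothetical names made up for illustration; this is a simplified model of the behavior described above, not the client's actual implementation.

```go
package main

import "fmt"

// ConsumeFromWhere is a local stand-in for the client's type of the same name.
type ConsumeFromWhere int

const (
	ConsumeFromLastOffset ConsumeFromWhere = iota
	ConsumeFromFirstOffset
	ConsumeFromTimestamp
)

// queueState is a hypothetical snapshot of one message queue.
type queueState struct {
	minOffset       int64 // oldest offset still stored
	maxOffset       int64 // offset the next new message will get
	offsetAtTime    int64 // first offset at or after the configured timestamp
	committedOffset int64 // progress saved for the group, -1 if none
}

// startOffset returns where a consumer begins under this simplified model.
// Saved progress takes priority; the mode matters only for a new group.
func startOffset(mode ConsumeFromWhere, q queueState) int64 {
	if q.committedOffset >= 0 {
		return q.committedOffset
	}
	switch mode {
	case ConsumeFromFirstOffset:
		return q.minOffset
	case ConsumeFromTimestamp:
		return q.offsetAtTime
	default: // ConsumeFromLastOffset
		return q.maxOffset
	}
}

func main() {
	fresh := queueState{minOffset: 0, maxOffset: 16, offsetAtTime: 11, committedOffset: -1}
	restarted := fresh
	restarted.committedOffset = 14

	fmt.Println(startOffset(ConsumeFromTimestamp, fresh))       // 11
	fmt.Println(startOffset(ConsumeFromTimestamp, restarted))   // 14
	fmt.Println(startOffset(ConsumeFromFirstOffset, restarted)) // 14
}
```

Once progress has been saved, all three modes return the same offset, which is why changing the mode on an existing group has no visible effect.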

In Go, use consumer.WithConsumeFromWhere to select the mode:

c, err := rocketmq.NewPushConsumer(
   consumer.WithGroupName(groupName),
   consumer.WithConsumeFromWhere(consumer.ConsumeFromTimestamp),// default fromLastOffset
   consumer.WithNsResovler(primitive.NewPassthroughResolver(nameServers)))

If you use MQ to broadcast and push messages in an IM system, which mode should you use?
My first instinct was the third one, ConsumeFromTimestamp. The reasons, and the catch, follow below.

Why can I still consume previously produced messages in ConsumeFromTimestamp mode?

When writing the test code, I wrote the producer first and produced several messages with it. Other work then got in the way. A few days later I finished the consumer, started it, and found that it printed those old messages.

I use MQ to solve communication and broadcasting between different gateways. (Why not use a route_server to forward messages? Because it would be a single point of failure, and you would need something like haproxy for primary/backup failover.) For example, when an app user sends a message to a web user and the two users are logged in on different gateways, the message has to be routed or broadcast between the gateways.

So at startup I want to consume only messages produced after the current time. Otherwise strange problems appear, such as a message sent an hour ago arriving an hour late.

Investigate ConsumeFromTimestamp

After switching to this mode, I found that I still received messages the producer had sent earlier. So I went back to the documentation and found the cause:

The first time a new subscription group starts, it consumes from the specified point in time;
on later starts it continues from the last consumption progress.

So the timestamp only takes effect on the very first start. I use cluster mode, where consumption progress is saved on the broker (you can see it in rocketmq-console); in broadcast mode it is saved locally. Either way, every later restart continues from the saved progress. The problem therefore has to be solved in the application layer.

Solution

  1. Append a timestamp to the group name on every start, so that each start is a new subscription group. This makes later maintenance very troublesome and is not recommended.
  2. Set the offset manually. I did not find a function for this in the Go client; Kafka supports this approach.
  3. Filter by timestamp in the application layer.

I use the third option. First, look at the timestamps of a message:

[root@localhost bin]# ./mqadmin queryMsgById -i 0A006BDA00002A9F0000000000000B71 -n 10.0.107.218:9876
RocketMQLog:WARN No appenders could be found for logger (io.netty.util.internal.PlatformDependent0).
RocketMQLog:WARN Please initialize the logger system properly.
OffsetID:            0A006BDA00002A9F0000000000000B71
Topic:               cim_msg_push
Tags:                [null]
Keys:                [null]
Queue ID:            0
Queue Offset:        11
CommitLog Offset:    2929
Reconsume Times:     0
Born Timestamp:      2020-07-01 10:47:01,169
Store Timestamp:     2020-06-17 05:09:02,072
Born Host:           10.0.106.117:60041
Store Host:          10.0.107.218:10911
System Flag:         0
Properties:          {UNIQ_KEY=0A006A7508CD0000000002505c880001}
Message Body Path:   /tmp/rocketmq/msgbodys/0A006A7508CD0000000002505c880001

MessageTrack [consumerGroup=group1, trackType=NOT_ONLINE, exceptionDesc=CODE:206 DESC:the consumer group[group1] not online]

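Why are the born timestamp and the store timestamp different? The born timestamp is taken from the producer host's clock and the store timestamp from the broker host's clock, so a gap of about two weeks like this one most likely means the two machines' clocks are not synchronized.

At startup, record startTimeStamp. Then, in the subscription callback, process a message only if its born timestamp is not earlier than startTimeStamp.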

func NewMsgConsumer() *MsgConsumer {
   return &MsgConsumer{
      pushMsgChan: make(chan *cim.CIMPushMsg),
      // Start time in milliseconds, the same unit as BornTimestamp.
      startTimeStamp: time.Now().UnixNano() / int64(time.Millisecond),
   }
}


func (m *MsgConsumer) onMsgPush(ctx context.Context, msgs ...*primitive.MessageExt) (result consumer.ConsumeResult, err error) {
   for i := range msgs {
      logger.Sugar.Infof("subscribe callback: %v", msgs[i])

      // Skip any message born before this consumer started.
      if msgs[i].BornTimestamp < m.startTimeStamp {
         logger.Sugar.Warnf("expired msg,dorp it: %v", msgs[i])
         continue
      }

      msg := &cim.CIMPushMsg{}
      err = proto.Unmarshal(msgs[i].Body, msg)
      if err != nil {
         // Log the decode error and move on to the next message.
         logger.Sugar.Info(err)
      } else {
         m.pushMsgChan <- msg
      }
   }
   // Always acknowledge, so dropped messages are not redelivered.
   return consumer.ConsumeSuccess, nil
}

Output:

2020-07-01 14:01:20.971	[INFO]	[mq/consumer.go:140]	subscribe callback: [Message=[topic=cim_msg_push, body�10.0.107.117:8000 (2�wx_2020�"$16757bf1-c9e5-4705-ba04-3b2bca925f27(����08hello mq, Flag=0, properties=map[CONSUME_START_TIME:1593583280971 MAX_OFFSET:16 MIN_OFFSET:0 UNIQ_KEY:0A006A753A940000000003022ff80001], TransactionId=], MsgId=0A006A753A940000000003022ff80001, OffsetMsgId=0A006BDA00002A9F0000000000000F49,QueueId=0, StoreSize=246, QueueOffset=15, SysFlag=0, BornTimestamp=1593583275496, BornHost=10.0.106.117:59062, StoreTimestamp=1592352409555, StoreHost=10.0.107.218:10911, CommitLogOffset=3913, BodyCRC=2070446043, ReconsumeTimes=0, PreparedTransactionOffset=0]
2020-07-01 14:01:20.971	[WARN]	[mq/consumer.go:144]	expired msg,dorp it: [Message=[topic=cim_msg_push, body�10.0.107.117:8000 (2�wx_2020�"$16757bf1-c9e5-4705-ba04-3b2bca925f27(����08hello mq, Flag=0, properties=map[CONSUME_START_TIME:1593583280971 MAX_OFFSET:16 MIN_OFFSET:0 UNIQ_KEY:0A006A753A940000000003022ff80001], TransactionId=], MsgId=0A006A753A940000000003022ff80001, OffsetMsgId=0A006BDA00002A9F0000000000000F49,QueueId=0, StoreSize=246, QueueOffset=15, SysFlag=0, BornTimestamp=1593583275496, BornHost=10.0.106.117:59062, StoreTimestamp=1592352409555, StoreHost=10.0.107.218:10911, CommitLogOffset=3913, BodyCRC=2070446043, ReconsumeTimes=0, PreparedTransactionOffset=0]

Summary

In ConsumeFromTimestamp mode:

  • The timestamp takes effect only the first time a new subscription group starts: messages older than the specified time are skipped. If the process later stops or crashes while new messages are produced, the next start consumes the messages produced during the downtime.
  • After that, the behavior is the same as ConsumeFromLastOffset. Because consumption progress is saved on the broker (in cluster mode), a restart continues from the position reached before the process last exited.

Origin blog.csdn.net/xmcy001122/article/details/107062441