RocketMQ Best Practices

1 Producer

1.1 Notes on Sending Messages

1 Use of Tags

An application can usually use a single Topic, and message subtypes can be identified with tags. Tags can be set freely by the application. Only when the producer sets tags on the messages it sends can the consumer subscribe by tag and have the broker filter messages for it: message.setTags("TagA").
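A minimal sketch of tag-based filtering, assuming an Apache RocketMQ 4.x client (imports omitted; the group names, topic, and Name Server address are placeholders):

    // Producer side: mark the message subtype with a tag
    DefaultMQProducer producer = new DefaultMQProducer("example_producer_group");
    producer.setNamesrvAddr("192.168.0.1:9876");
    producer.start();
    Message msg = new Message("TopicTest", "TagA",
            "Hello RocketMQ".getBytes(StandardCharsets.UTF_8));
    producer.send(msg);

    // Consumer side: subscribe only to the tags this consumer cares about
    DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("example_consumer_group");
    consumer.setNamesrvAddr("192.168.0.1:9876");
    consumer.subscribe("TopicTest", "TagA || TagB");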

2 Use of Keys

A business-level unique identifier for each message should be set in the keys field, to make it easier to locate message-loss problems later. The server creates a hash index for each message, so the application can query the message content by topic and key and see who consumed it. Since it is a hash index, make the key as unique as possible to avoid potential hash collisions.

   // Order Id
   String orderId = "20034568923546";
   message.setKeys(orderId);

3 Log Printing

Whether a message is sent successfully or fails, print the message log, and be sure to include the SendResult and the key field. As long as the send method does not throw an exception, the send succeeded. A successful send covers several states, which are defined in SendResult. Each state is described below (see the status-check sketch after this list):

  • SEND_OK

The message was sent successfully. Note that a successful send does not mean the message is reliable yet. To make sure no messages are lost, you should also enable a synchronous Master or synchronous disk flush, i.e. SYNC_MASTER or SYNC_FLUSH.

  • FLUSH_DISK_TIMEOUT

The message was sent successfully, but flushing it to disk on the server timed out. The message has already entered the server's queue (memory), so it will only be lost if the server goes down. The flush mode and the synchronous-flush timeout can be set in the message store configuration: if the Broker is configured for synchronous flushing, i.e. FlushDiskType = SYNC_FLUSH (the default is asynchronous flushing), and the Broker does not finish flushing within the synchronous-flush timeout (5 s by default), this status is returned - flush timed out.

  • FLUSH_SLAVE_TIMEOUT

The message was sent successfully, but synchronizing it to the Slave timed out. The message has already entered the server's queue, so it will only be lost if the server goes down. If the Broker's role is synchronous Master, i.e. SYNC_MASTER (the default is asynchronous Master, ASYNC_MASTER), and the Slave does not finish synchronizing with the Master within the timeout (5 s by default), this status is returned - synchronizing data to the Slave timed out.

  • SLAVE_NOT_AVAILABLE

The message was sent successfully, but no Slave was available at the time. If the Broker's role is synchronous Master, i.e. SYNC_MASTER (the default is asynchronous Master, ASYNC_MASTER), but no Slave Broker is configured, this status is returned - no Slave server available.
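A minimal sketch of logging and checking the send status, assuming an Apache RocketMQ 4.x client (the logger and message construction are omitted):

    SendResult sendResult = producer.send(msg);
    // Always log the SendResult together with the business key
    log.info("SEND_RESULT: keys={}, result={}", msg.getKeys(), sendResult);
    if (sendResult.getSendStatus() != SendStatus.SEND_OK) {
        // FLUSH_DISK_TIMEOUT / FLUSH_SLAVE_TIMEOUT / SLAVE_NOT_AVAILABLE:
        // the message reached the broker but is not yet fully durable;
        // decide here whether the business needs to resend or alert
        log.warn("SEND_NOT_FULLY_DURABLE: keys={}, status={}",
                msg.getKeys(), sendResult.getSendStatus());
    }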

1.2 Handling Message Send Failures

The Producer's send method supports internal retries. The retry logic is as follows:

  • Retry at most 2 times (2 times for synchronous sending, 0 times for asynchronous sending).
  • If the send fails, rotate to the next Broker. The total time spent by this method does not exceed the value of sendMsgTimeout, which defaults to 10 s.
  • If sending the message to the Broker itself produces a timeout exception, it will not be retried.

The strategy above also guarantees, to some extent, that messages can be sent successfully. If the business has very high reliability requirements for messages, it is recommended that the application add its own retry logic on top of it: for example, when a synchronous send call fails, try to store the message in a DB and retry it from a background timer thread, to make sure the message reaches the Broker.

Why is the DB-based retry described above not built into the MQ client, but left to the application? This is based on the following considerations. First, the MQ client is designed to be stateless so that it can be scaled horizontally at will, consuming only CPU, memory, and network on its machine. Second, if a KV storage module were integrated into the MQ client, the data would only be reliable with synchronous flushing to disk, but synchronous flushing itself carries a large performance overhead, so flushing is usually asynchronous; and because the application process is not controlled by the MQ operations staff, it may often be shut down violently with kill -9, so data that has not yet been flushed would be lost. Third, the machines running Producers usually have low reliability (often virtual machines) and are not suitable for storing important data. In summary, it is recommended that the retry process be controlled by the application.
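A rough sketch of the application-level retry idea described above; saveToDb and the background resend loop are hypothetical parts of the application, not RocketMQ APIs:

    try {
        SendResult result = producer.send(msg);        // synchronous send
        if (result.getSendStatus() != SendStatus.SEND_OK) {
            saveToDb(msg);                              // hypothetical: persist for a later retry
        }
    } catch (Exception e) {
        // send threw: persist the message and let a background timer
        // thread resend it until the Broker accepts it
        saveToDb(msg);
    }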

1.3 Sending Messages in Oneway Mode

Sending a message normally goes through the following steps:

  • The client sends the request to the server
  • The server processes the request
  • The server returns the response to the client

So the time to send one message is the sum of these three steps. Some scenarios require very low latency but do not need high reliability, for example log collection; such applications can use the oneway form of sending. Oneway only sends the request without waiting for a response, and on the client side sending the request costs just one operating-system call, i.e. writing the data into the client's socket buffer; this usually takes on the order of microseconds.
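A minimal sketch of a oneway send, assuming an Apache RocketMQ 4.x client (imports omitted; the group name, topic, and address are placeholders):

    DefaultMQProducer producer = new DefaultMQProducer("log_collect_producer_group");
    producer.setNamesrvAddr("192.168.0.1:9876");
    producer.start();

    Message msg = new Message("TopicLog", "TagA",
            "log line ...".getBytes(StandardCharsets.UTF_8));
    // No SendResult is returned and no broker response is awaited
    producer.sendOneway(msg);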

2 Consumer

2.1 Idempotent Consumption

RocketMQ cannot avoid message duplication (it does not guarantee Exactly-Once), so if the business is very sensitive to duplicate consumption, deduplication must be performed at the business level. Deduplication can be done with a relational database. First, determine a unique key for the message; it can be the msgId, or a unique identification field in the message content, such as an order Id. Before consuming, check whether this unique key already exists in the relational database: if it does not, insert it and consume the message, otherwise skip the message. (In practice, atomicity has to be considered: the existence check can simply be an attempted insert; if the insert fails with a primary key conflict, skip the message.)

The msgId is supposed to be a globally unique identifier, but in practice the same message can end up with two different msgIds (active resending, duplicates caused by the client retry mechanism, and so on). In that case, the business-level unique field must be used to deduplicate repeated consumption.
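A rough sketch of DB-based deduplication in a concurrent listener; dedupInsert is a hypothetical helper that inserts the unique key into a table with a primary-key constraint and returns false on conflict:

    public ConsumeConcurrentlyStatus consumeMessage(
            List<MessageExt> msgs,
            ConsumeConcurrentlyContext context) {
        for (MessageExt msg : msgs) {
            String uniqueKey = msg.getKeys();      // business key, e.g. the order Id
            if (!dedupInsert(uniqueKey)) {         // hypothetical: primary-key conflict => already consumed
                continue;                          // duplicate, skip it
            }
            // TODO normal business processing for this message
        }
        return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
    }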

2.2 Dealing with Slow Consumption

1 Increase Consumption Parallelism

Most message consumption is IO-bound, i.e. it operates on a database or makes RPC calls; the consumption rate of such consumers is then determined by the throughput of the back-end database or the external system. Increasing consumption parallelism can raise overall consumption throughput, but beyond a certain point throughput falls instead, so the application must set a reasonable degree of parallelism. There are several ways to change consumption parallelism, listed below (see the sketch after the list):

  • Within one ConsumerGroup, increase parallelism by increasing the number of Consumer instances (note that Consumer instances beyond the number of subscribed queues have no effect). This can be done by adding machines or by starting multiple processes on the existing machines.
  • Increase the consumption threads of a single Consumer by changing the parameters consumeThreadMin and consumeThreadMax.
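A minimal sketch of raising the per-consumer thread count, assuming an Apache RocketMQ 4.x DefaultMQPushConsumer (the group name and sizes are placeholders):

    DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("example_consumer_group");
    // Widen the consumption thread pool; the defaults are 10 (min) and 20 (max)
    consumer.setConsumeThreadMin(20);
    consumer.setConsumeThreadMax(40);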

2 Batch Consumption

If certain business flows support batch consumption, consumption throughput can be improved greatly. For example, in an order-deduction application, processing one order takes 1 s, but processing 10 orders in one batch may take only 2 s, so throughput increases substantially. This is enabled by setting the consumer's consumeMessageBatchMaxSize parameter, whose default is 1, meaning only one message is consumed at a time; if it is set to N, the number of messages consumed per call is at most N.
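A rough sketch of batch consumption, assuming an Apache RocketMQ 4.x push consumer (imports omitted; the group name and topic are placeholders):

    DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("order_deduct_consumer_group");
    consumer.subscribe("TopicOrder", "*");
    consumer.setConsumeMessageBatchMaxSize(10);   // hand up to 10 messages to the listener per call
    consumer.registerMessageListener((MessageListenerConcurrently) (msgs, context) -> {
        // msgs.size() is between 1 and 10; process them as one batch, e.g. in one DB transaction
        return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
    });
    consumer.start();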

3 Skip Non-Critical Messages

When messages accumulate and consumption cannot catch up with the sending speed, if the business does not demand high data completeness, you can choose to discard the unimportant messages. For example, when the number of messages piled up in a queue exceeds 100,000, try to discard some or all of the messages so the consumer can quickly catch up with the sending speed. Sample code is as follows:

    public ConsumeConcurrentlyStatus consumeMessage(
            List<MessageExt> msgs,
            ConsumeConcurrentlyContext context) {
        long offset = msgs.get(0).getQueueOffset();
        String maxOffset =
                msgs.get(0).getProperty(MessageConst.PROPERTY_MAX_OFFSET);
        long diff = Long.parseLong(maxOffset) - offset;
        if (diff > 100000) {
            // TODO special handling for message accumulation, e.g. discard the batch
            return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
        }
        // TODO normal consumption process
        return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
    }

4 Optimize the Per-Message Consumption Process

Take the following example, where consuming one message involves:

  • Querying [data 1] from the DB based on the message
  • Querying [data 2] from the DB based on the message
  • A complex business computation
  • Inserting [data 3] into the DB
  • Inserting [data 4] into the DB

Consuming this message involves 4 interactions with the DB; if each takes 5 ms, they account for 20 ms. Assuming the business computation takes another 5 ms, the total is 25 ms. If the 4 DB interactions can be optimized down to 2, the total time drops to 15 ms, i.e. overall performance improves by 40%. In addition, for latency-sensitive applications the DB can be deployed on SSDs; compared with SCSI disks, the response time will be much lower.

2.3 Print Logs in the Consumer

If the message volume is small, it is recommended to print the message, the time spent consuming it, and similar information in the entry method of the consumer, to make later troubleshooting easier.

   public ConsumeConcurrentlyStatus consumeMessage(
            List<MessageExt> msgs,
            ConsumeConcurrentlyContext context) {
        log.info("RECEIVE_MSG_BEGIN: " + msgs.toString());
        // TODO normal consumption process
        return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
    }

If the time taken to consume each message can also be logged, troubleshooting slow consumption online becomes much easier, as sketched below.
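A minimal sketch of timing each consumption call, assuming the same concurrent listener as above (the logger is a placeholder):

    public ConsumeConcurrentlyStatus consumeMessage(
            List<MessageExt> msgs,
            ConsumeConcurrentlyContext context) {
        long begin = System.currentTimeMillis();
        log.info("RECEIVE_MSG_BEGIN: " + msgs.toString());
        // TODO normal consumption process
        log.info("RECEIVE_MSG_END: cost={}ms", System.currentTimeMillis() - begin);
        return ConsumeConcurrentlyStatus.CONSUME_SUCCESS;
    }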

2.4 Other Consumption Suggestions

1 About Consumers and Subscriptions

The first thing to note is that different consumer groups can consume the same topics independently, and each group has its own consumption offset. Make sure that the subscription information of every consumer within the same group stays consistent.

2 About Ordered Messages

The consumer locks each message queue to make sure it is consumed one message at a time, in order. This causes a performance loss, but it is useful when you care about message order. We do not recommend throwing exceptions; you can return ConsumeOrderlyStatus.SUSPEND_CURRENT_QUEUE_A_MOMENT instead, as in the sketch below.
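A rough sketch of an ordered listener that suspends the queue instead of throwing, assuming an Apache RocketMQ 4.x push consumer:

    consumer.registerMessageListener((MessageListenerOrderly) (msgs, context) -> {
        try {
            // TODO process the messages in order
            return ConsumeOrderlyStatus.SUCCESS;
        } catch (Exception e) {
            // Do not throw; suspend this queue briefly and retry later, preserving order
            return ConsumeOrderlyStatus.SUSPEND_CURRENT_QUEUE_A_MOMENT;
        }
    });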

3 About Concurrent Consumption

As the name suggests, the consumer consumes the messages concurrently. It is recommended for good performance. We do not recommend throwing exceptions; you can return ConsumeConcurrentlyStatus.RECONSUME_LATER instead.

4 About the Consume Status

For a concurrent-consumption listener, you can return RECONSUME_LATER to tell the consumer that this message cannot be consumed right now and should be re-consumed later; you can then go on consuming other messages. For an ordered listener, because you care about the order, you cannot skip the message, but you can return SUSPEND_CURRENT_QUEUE_A_MOMENT to tell the consumer to wait for a moment.

5 About Blocking

Blocking in the listener is not recommended, because it blocks the thread pool and may eventually stop the consumption process.

6 About the Number of Threads

The consumer consumes messages inside a ThreadPoolExecutor, so you can change its size with setConsumeThreadMin or setConsumeThreadMax.

7 About the Consumption Start Point

When a new consumer group is created, it has to decide whether to consume the historical messages that already exist on the Broker. CONSUME_FROM_LAST_OFFSET ignores the historical messages and consumes only messages produced afterwards. CONSUME_FROM_FIRST_OFFSET consumes every message that exists on the Broker. You can also use CONSUME_FROM_TIMESTAMP to consume messages produced after the specified timestamp. A sketch follows.
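A minimal sketch of choosing the start point for a new consumer group, assuming an Apache RocketMQ 4.x push consumer (the group name and timestamp value are placeholders):

    DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("new_consumer_group");
    // Skip history and consume only messages produced from now on
    consumer.setConsumeFromWhere(ConsumeFromWhere.CONSUME_FROM_LAST_OFFSET);

    // Or: consume everything already stored on the Broker
    // consumer.setConsumeFromWhere(ConsumeFromWhere.CONSUME_FROM_FIRST_OFFSET);

    // Or: consume messages produced after a given time (format yyyyMMddHHmmss)
    // consumer.setConsumeFromWhere(ConsumeFromWhere.CONSUME_FROM_TIMESTAMP);
    // consumer.setConsumeTimestamp("20191220000000");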

3 Broker

3.1 Broker Role

Broker roles are divided into ASYNC_MASTER (asynchronous master), SYNC_MASTER (synchronous master), and SLAVE. If the reliability requirements for messages are strict, use a SYNC_MASTER plus SLAVE deployment. If the reliability requirements are lower, an ASYNC_MASTER plus SLAVE deployment can be used. For convenience during testing only, you can deploy just an ASYNC_MASTER or just a SYNC_MASTER.

3.2 FlushDiskType

SYNC_FLUSH (synchronous flush) loses a lot of performance compared with ASYNC_FLUSH (asynchronous flush), but it is more reliable, so the trade-off has to be made based on the actual business scenario.

3.3 Broker Configuration

Parameter | Default | Description
listenPort | 10911 | Listening port for accepting client connections
namesrvAddr | null | Name Server address
brokerIP1 | InetAddress of the network interface | IP that the current broker listens on
brokerIP2 | same as brokerIP1 | In a master/slave setup, if brokerIP2 is configured on the master broker, the slave broker connects to the brokerIP2 configured on the master for synchronization
brokerName | null | Name of the broker
brokerClusterName | DefaultCluster | Name of the cluster the broker belongs to
brokerId | 0 | Broker id; 0 means master, any other positive integer means slave
storePathCommitLog | $HOME/store/commitlog/ | Storage path of the commit log
storePathConsumerQueue | $HOME/store/consumequeue/ | Storage path of the consume queue
mappedFileSizeCommitLog | 1024 * 1024 * 1024 (1G) | Size of a commit log mapped file
deleteWhen | 04 | Hour of the day at which commit log files past their retention time are deleted
fileReservedTime | 72 | File retention time, in hours
brokerRole | ASYNC_MASTER | SYNC_MASTER / ASYNC_MASTER / SLAVE
flushDiskType | ASYNC_FLUSH | SYNC_FLUSH / ASYNC_FLUSH. In SYNC_FLUSH mode the broker flushes the message to disk before acknowledging the producer; brokers in ASYNC_FLUSH mode flush messages in batches and achieve better performance.

4 NameServer

In RocketMQ, Name Servers are designed for simple routing management. Their responsibilities include:

  • Brokers register routing data with each Name Server periodically.
  • Name Servers provide the latest routing information to clients, including producers, consumers, and command-line clients.

5 Client Configuration

Relative to the Broker cluster, RocketMQ producers and consumers are both clients. This section mainly describes the common behavior configuration of producers and consumers.

5.1 Client Addressing

A RocketMQ client needs to find the Name Server, and then finds the Broker through the Name Server. There are several configuration methods, shown below from high priority to low; a higher-priority setting overrides a lower-priority one.

  • Specify the Name Server address in the code; multiple namesrv addresses are separated by semicolons
producer.setNamesrvAddr("192.168.0.1:9876;192.168.0.2:9876");  

consumer.setNamesrvAddr("192.168.0.1:9876;192.168.0.2:9876");
  • Specify the Name Server address with a Java startup parameter
-Drocketmq.namesrv.addr=192.168.0.1:9876;192.168.0.2:9876  
  • Specify the Name Server address with an environment variable
export   NAMESRV_ADDR=192.168.0.1:9876;192.168.0.2:9876   
  • HTTP static server addressing (default)

After the client starts, it periodically accesses a static HTTP server at the following address: http://jmenv.tbsite.net:8080/rocketmq/nsaddr , and this URL returns content like the following:

192.168.0.1:9876;192.168.0.2:9876   

By default the client accesses this HTTP server every 2 minutes and updates its local Name Server address. The URL is hard-coded in the client code; the server it resolves to can be changed by modifying /etc/hosts, for example by adding the following entry to /etc/hosts:

10.232.22.67    jmenv.taobao.net   

Static HTTP server addressing is recommended; its benefits are that client deployment is simple and the Name Server cluster can be upgraded hot.

5.2 Client Configuration

DefaultMQProducer, TransactionMQProducer, DefaultMQPushConsumer, and DefaultMQPullConsumer all extend the ClientConfig class, which holds the common client configuration. The client configuration is exposed in get/set style, so each parameter can be configured through Spring or in code; for example, the namesrvAddr parameter can be set with producer.setNamesrvAddr("192.168.0.1:9876"), and the other parameters work the same way.

1 Common Client Configuration

Parameter | Default | Description
namesrvAddr | (empty) | Name Server address list; multiple Name Server addresses are separated by semicolons
clientIP | local IP | IP address of the client; some machines may fail to detect the client IP, in which case the application must set it explicitly in code
instanceName | DEFAULT | Instance name of the client; multiple Producers and Consumers created by one client actually share a single internal instance (this instance holds the network connections, thread resources, and so on)
clientCallbackExecutorThreads | 4 | Number of threads for asynchronous callbacks in the communication layer
pollNameServerInteval | 30000 | Interval for polling the Name Server, in milliseconds
heartbeatBrokerInterval | 30000 | Interval for sending heartbeats to the Broker, in milliseconds
persistConsumerOffsetInterval | 5000 | Interval for persisting the Consumer consumption progress, in milliseconds

2 Producer Configuration

Parameter | Default | Description
producerGroup | DEFAULT_PRODUCER | Producer group name; if multiple Producers belong to one application and send the same kind of messages, they should be grouped into the same group
createTopicKey | TBW102 | When a message is sent to a topic that does not exist, the server creates the topic automatically; this Key must be specified, and it can be used to configure the default route for the topic the message is sent to
defaultTopicQueueNums | 4 | Default number of queues created when the server automatically creates a topic that does not yet exist while a message is sent
sendMsgTimeout | 10000 | Timeout for sending a message, in milliseconds
compressMsgBodyOverHowmuch | 4096 | Message body size above which the body is compressed (the Consumer decompresses it automatically on receipt), in bytes
retryAnotherBrokerWhenNotStoreOK | FALSE | Whether to retry sending to another broker if a SendResult is returned but sendStatus != SEND_OK
retryTimesWhenSendFailed | 2 | Maximum number of retries if sending fails; this parameter only takes effect for synchronous sending
maxMessageSize | 4MB | Message size limit on the client; exceeding it raises an error. The server also imposes a limit, so the two need to be used in coordination.
transactionCheckListener | (none) | Transaction message check-back listener; must be set when sending transactional messages
checkThreadPoolMinSize | 1 | Minimum number of threads in the pool used when the Broker checks back the transaction state with the Producer
checkThreadPoolMaxSize | 1 | Maximum number of threads in the pool used when the Broker checks back the transaction state with the Producer
checkRequestHoldMax | 2000 | Size of the Producer's local buffer queue for check-back requests when the Broker checks back the transaction state
RPCHook | null | Passed in when the Producer is created; it contains two hooks, one for pre-processing before a message is sent and one for processing after the response is received. Users can do security control or other operations in the first hook.

3 PushConsumer Configuration

Parameter | Default | Description
consumerGroup | DEFAULT_CONSUMER | Consumer group name; if multiple Consumers belong to one application, subscribe to the same messages, and have identical consumption logic, they should be grouped into the same group
messageModel | CLUSTERING | Consumption model; both clustering consumption and broadcasting consumption are supported
consumeFromWhere | CONSUME_FROM_LAST_OFFSET | After the Consumer starts, it resumes by default from where it last consumed. There are two cases: if the last consumed position has not expired, consumption continues from that position; if it has expired, consumption starts from the first message currently in the queue
consumeTimestamp | half an hour ago | Only takes effect when consumeFromWhere is CONSUME_FROM_TIMESTAMP
allocateMessageQueueStrategy | AllocateMessageQueueAveragely | Implementation strategy of the Rebalance algorithm
subscription | (none) | Subscription relations
messageListener | (none) | Message listener
offsetStore | (none) | Storage for the consumption progress
consumeThreadMin | 10 | Minimum number of threads in the consumption thread pool
consumeThreadMax | 20 | Maximum number of threads in the consumption thread pool
consumeConcurrentlyMaxSpan | 2000 | Maximum offset span allowed for parallel consumption within a single queue
pullThresholdForQueue | 1000 | Maximum number of messages cached in the local queue when pulling messages
pullInterval | 0 | Pull interval; since long polling is used this is 0, but an application may set a value greater than 0 for flow control, in milliseconds
consumeMessageBatchMaxSize | 1 | Batch consumption: how many messages are consumed in one call
pullBatchSize | 32 | Batch pulling: how many messages are pulled at most in one pull
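A rough sketch combining a few of the parameters above on an Apache RocketMQ 4.x push consumer (imports omitted; the group name, topic, and values are placeholders):

    DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("example_consumer_group");
    consumer.setNamesrvAddr("192.168.0.1:9876");
    consumer.setConsumeFromWhere(ConsumeFromWhere.CONSUME_FROM_LAST_OFFSET);
    consumer.setConsumeThreadMin(10);
    consumer.setConsumeThreadMax(20);
    consumer.setConsumeMessageBatchMaxSize(1);
    consumer.setPullBatchSize(32);
    consumer.subscribe("TopicTest", "*");
    consumer.registerMessageListener((MessageListenerConcurrently) (msgs, ctx) ->
            ConsumeConcurrentlyStatus.CONSUME_SUCCESS);
    consumer.start();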

4 PullConsumer Configuration

Parameter | Default | Description
consumerGroup | DEFAULT_CONSUMER | Consumer group name; if multiple Consumers belong to one application, subscribe to the same messages, and have identical consumption logic, they should be grouped into the same group
brokerSuspendMaxTimeMillis | 20000 | Long polling: maximum time a Consumer pull request may be suspended on the Broker, in milliseconds
consumerTimeoutMillisWhenSuspend | 30000 | Long polling: if a Consumer pull request is suspended on the Broker for longer than this, the client considers it timed out, in milliseconds
consumerPullTimeoutMillis | 10000 | Non-long-polling pull timeout, in milliseconds
messageModel | BROADCASTING | Two message models are supported: clustering consumption and broadcasting consumption
messageQueueListener | (none) | Listens for queue changes
offsetStore | (none) | Storage for the consumption progress
registerTopics | (none) | Set of registered topics
allocateMessageQueueStrategy | AllocateMessageQueueAveragely | Implementation strategy of the Rebalance algorithm

5 Message Data Structure

Field | Default | Description
Topic | null | Required; name of the topic the message belongs to
Body | null | Required; message body
Tags | null | Optional; message tag, convenient for server-side filtering. Currently only one tag per message is supported
Keys | null | Optional; business keywords of this message. The server builds a hash index on keys; once set, messages can be queried by Topic and Keys in the Console. Since it is a hash index, keep the key as unique as possible, for example an order number or product Id.
Flag | 0 | Optional; set entirely by the application, RocketMQ does not touch it
DelayTimeLevel | 0 | Optional; message delay level. 0 means no delay; a value greater than 0 delays consumption by the corresponding amount of time
WaitStoreMsgOK | TRUE | Optional; indicates whether the server returns the acknowledgment only after the message has been persisted
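A minimal sketch of filling these fields, assuming an Apache RocketMQ 4.x client (imports omitted; the topic, tag, key, and delay level are placeholders):

    Message msg = new Message("TopicOrder",                            // Topic (required)
            "TagA",                                                     // Tags (optional)
            "20034568923546",                                           // Keys (optional), e.g. the order Id
            "order payload ...".getBytes(StandardCharsets.UTF_8));      // Body (required)
    msg.setDelayTimeLevel(0);   // 0 = no delay; > 0 selects a broker-defined delay level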

6 System Configuration

This section mainly covers system-level (JVM/OS) configuration.

6.1 JVM Options

The latest released JDK 1.8 is recommended. Set the same Xms and Xmx values to prevent the JVM from resizing the heap, for better performance. A simple JVM configuration looks like this:

-server -Xms8g -Xmx8g -Xmn4g

If you do not care about the RocketMQ Broker's startup time, a better choice is to "pre-touch" the Java heap so that every page is allocated during JVM initialization. Those who do not care about startup time can enable it with: -XX:+AlwaysPreTouch

Disabling biased locking may reduce JVM pauses: -XX:-UseBiasedLocking

As for garbage collection, the G1 collector with JDK 1.8 is recommended.

-XX:+UseG1GC -XX:G1HeapRegionSize=16m
-XX:G1ReservePercent=25
-XX:InitiatingHeapOccupancyPercent=30

These GC options look a bit aggressive, but they have proved to perform well in our production environment. Also, do not set -XX:MaxGCPauseMillis too low, otherwise the JVM will use a small young generation to meet that target, which leads to very frequent minor GCs. Rolling GC log files are therefore recommended:

-XX:+UseGCLogFileRotation   
-XX:NumberOfGCLogFiles=5 
-XX:GCLogFileSize=30m

If writing the GC log file increases the broker's latency, consider redirecting the GC log file to a memory file system:

-Xloggc:/dev/shm/mq_gc_%p.log

6.2 Linux Kernel Parameters

The os.sh script in the bin folder lists a number of kernel parameters that can be used for production after minor adjustments. The parameters below deserve attention; for more details please refer to the documentation for /proc/sys/vm/*.

  • vm.extra_free_kbytes tells the VM to keep extra free memory between the threshold at which background reclaim (kswapd) starts and the threshold at which direct reclaim (by the allocating process) starts. RocketMQ uses this parameter to avoid long latencies in memory allocation. (It depends on the specific kernel version.)
  • vm.min_free_kbytes, if set below 1024 KB, will subtly break the system, and the system will be prone to deadlock under high load.
  • vm.max_map_count limits the maximum number of memory map areas a process may have. RocketMQ uses mmap to load the CommitLog and ConsumeQueue, so a larger value is recommended for this parameter.
  • vm.swappiness defines how aggressively the kernel swaps memory pages. Higher values increase aggressiveness, lower values reduce the amount of swapping. A value of 10 is recommended to avoid swap latency.
  • File descriptor limits: RocketMQ needs file descriptors for files (CommitLog and ConsumeQueue) and network connections. We recommend setting the file descriptor limit to 655350.
  • Disk scheduler: RocketMQ recommends the deadline I/O scheduler, which tries to provide a guaranteed latency for requests.
