NameServer + Broker

        A message queue cluster consists of many machines, each with its own role and IP address, and this information keeps changing. When a new producer or consumer joins, how should its connection information be configured? The NameServer exists to solve this problem: it maintains the cluster's configuration and state information, and the other roles coordinate with each other through it.

NameServer functions

        The NameServer is the state server of the whole message queue: every component of the cluster learns the global picture through it. Each machine, whatever its role, must periodically report its status to the NameServer. If a machine does not report within the timeout, it is considered unavailable and the other components remove it from their list of usable machines. Multiple NameServers can be deployed, independent of one another, and every other role reports its state to all of them at the same time, which gives hot backup. The NameServer itself is stateless: broker, topic, and other state information is not persisted. It is reported periodically by each role and held in memory (the NameServer does support persisting configuration parameters, but this is rarely needed).
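        The report-and-expire behaviour described above can be sketched as a small in-memory registry. This is only an illustration, not RocketMQ's actual code: the class name HeartbeatRegistry, its methods, and the timeout value are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory registry: nothing is persisted, state exists only while machines keep reporting. */
class HeartbeatRegistry {
    // address -> time of the last status report, in milliseconds
    private final Map<String, Long> lastReportMillis = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    HeartbeatRegistry(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Called whenever a machine reports its status. */
    void report(String address, long nowMillis) {
        lastReportMillis.put(address, nowMillis);
    }

    /** Removes every machine whose last report is older than the timeout and returns the removed addresses. */
    List<String> evictExpired(long nowMillis) {
        List<String> removed = new ArrayList<>();
        lastReportMillis.entrySet().removeIf(entry -> {
            boolean expired = nowMillis - entry.getValue() > timeoutMillis;
            if (expired) {
                removed.add(entry.getKey());
            }
            return expired;
        });
        return removed;
    }

    boolean isAlive(String address) {
        return lastReportMillis.containsKey(address);
    }
}
```

        In this sketch the caller passes the current time in explicitly, which keeps the expiry rule easy to test; a real server would run evictExpired from a scheduled task.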

        The NameServer acts as the dispatch center of the MQ: brokers, producers, and consumers report their state to it and obtain the state of the other roles from it. Its function is very important, yet it is designed to be very lightweight: there is little code and almost no disk storage, and everything is done efficiently in memory. RocketMQ builds a good abstraction over Netty for the underlying communication, which keeps the communication logic simple and the functional code clear.

 

Cluster state storage structure

        The RouteInfoManager class in org.apache.rocketmq.namesrv.routeinfo has five member variables, and the cluster state is stored in these five variables.

 

 

 

Broker functionality

        The Broker is the core of RocketMQ, and most of the "heavyweight" work is done by it. This includes receiving messages sent by producers, handling consumers' message requests, persisting messages, HA, and server-side message filtering.

Message storage and delivery

        Because this is a distributed queue with high reliability requirements, data must be persisted to disk. Won't storing messages on disk be very slow? How can high throughput requirements still be met?

        In fact, disks are sometimes much faster than you think and sometimes much slower than you think. It all depends on how they are used: used properly, disk speed can match network speed. Current high-performance disks reach sequential write speeds of up to 600 MB/s, which exceeds the speed of a typical network card. This is where the disk is faster than imagined. Random reads and writes, however, reach only about 100 KB/s, a gap of roughly 6000 times compared with sequential access! Because of this huge difference, a well-designed message queue system can be orders of magnitude faster than an ordinary one.

        For example, the Linux operating system is divided into "user mode" and "kernel mode". File operations and network operations require switching between the two, which inevitably involves copying data. When a server sends the content of a local disk file to a client, it generally takes two steps:

  1. read(file, tmp_buf, len)
  2. write(socket, tmp_buf, len)

        tmp_buf is memory allocated in advance. These two seemingly simple operations actually copy the data four times: from the disk into kernel-mode memory, then from kernel-mode memory into user-mode memory (completing the read); then from user-mode memory into the kernel-mode memory of the network driver, and finally from the network driver's kernel-mode memory to the network card (completing the write).

        By using mmap, the copy into user-mode memory can be skipped, which improves speed. In Java this mechanism is implemented by MappedByteBuffer. RocketMQ takes full advantage of this "zero-copy" technique to improve both message persistence and network transmission speed.
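        As a minimal sketch of what writing and reading through a memory mapping looks like in Java, the example below maps a temporary file with FileChannel.map and uses the returned MappedByteBuffer. The class name MmapDemo and the temporary file are made up for illustration, and this is not how RocketMQ organises its own mapped files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MmapDemo {
    /** Writes text through a memory mapping, then reads it back through a second mapping. */
    static String roundTrip(String text) {
        byte[] payload = text.getBytes(StandardCharsets.UTF_8);
        try {
            Path file = Files.createTempFile("mmap-demo", ".bin");
            file.toFile().deleteOnExit();
            // Pre-size the file so the mapped region lies entirely inside it.
            Files.write(file, new byte[payload.length]);
            try (FileChannel channel = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                MappedByteBuffer out = channel.map(FileChannel.MapMode.READ_WRITE, 0, payload.length);
                out.put(payload); // a plain memory write into the mapped pages
                out.force();      // ask the OS to write the dirty pages to the storage device

                MappedByteBuffer in = channel.map(FileChannel.MapMode.READ_ONLY, 0, payload.length);
                byte[] read = new byte[payload.length];
                in.get(read);     // served from the page cache, no read() into a temporary buffer
                return new String(read, StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

        For example, MmapDemo.roundTrip("hello") returns the same string after it has passed through the mapped file.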

Message storage structure

        What exactly is RocketMQ's message storage structure? How does it try to keep writes sequential? Let's look at the overall architecture diagram:

Message flowchart

       

 

        Both producers and consumers first report their own information to the NameServer, which records it in its registry on receipt. This is like a client making a request to a server: the client code must specify the server's IP, the server then receives the client's request automatically, and the various routing information gets registered as a matter of course. As for how a message finds the broker address it should be sent to, my guess is that producers and consumers are associated by topic: the corresponding broker address can be looked up by topic, and the producer then sends the message to that broker (one of the brokers in the cluster).

Figure: message storage architecture

      

        RocketMQ's message storage is done by ConsumeQueue and CommitLog working together. The CommitLog is the real physical storage of messages. The ConsumeQueue is the logical queue of messages, similar to an index file in a database: it stores only addresses that point into the physical storage. Every message queue of every topic has a corresponding ConsumeQueue file, and both kinds of file are binary. The ConsumeQueue file path is ${storeRoot}\consumequeue\${topicName}\${queueId}\${fileName}. A ConsumeQueue entry stores the message's starting offset in the CommitLog, the size of the message, and the hash code of the message tag.

        The CommitLog is the physical storage file. The CommitLog on each broker is shared by all ConsumeQueues on that machine, and its path is ${user.home}\store\${commitlog}\${fileName}. Messages in the CommitLog are not of fixed length. RocketMQ uses several mechanisms to write the CommitLog sequentially wherever possible, while reads are random. The ConsumeQueue content is also written to disk for persistence. The advantages of this storage design are:

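  1. The CommitLog is written sequentially, which greatly improves write efficiency.
  2. Although reads are random, the operating system's page cache can read from disk in batches and keep the data in memory as a cache, which speeds up subsequent reads.
  3. To guarantee fully sequential writes, the ConsumeQueue is needed as an intermediate structure. Because the ConsumeQueue stores only offset information, its size is limited. In practice most ConsumeQueues can be read entirely into memory, so operations on this intermediate structure are fast, effectively at memory speed. In addition, to keep the CommitLog and the ConsumeQueue consistent, the CommitLog stores all information, including consume queue, message key, and tag, so even if a ConsumeQueue is lost it can be fully rebuilt from the CommitLog.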
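        To make the index idea concrete, here is a sketch of one ConsumeQueue entry holding the three fields mentioned above: the CommitLog offset, the message size, and the tag hash code. The field widths (8 + 4 + 8 = 20 bytes) follow the fixed entry size used by RocketMQ's store module as far as I know, so treat them as an assumption; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

/** Hypothetical model of one fixed-size ConsumeQueue entry. */
class ConsumeQueueEntry {
    static final int UNIT_SIZE = 20; // assumed layout: 8-byte offset + 4-byte size + 8-byte tag hash

    final long commitLogOffset; // where the message starts in the CommitLog
    final int messageSize;      // how many bytes to read from that offset
    final long tagHashCode;     // lets the broker filter by tag without reading the message

    ConsumeQueueEntry(long commitLogOffset, int messageSize, long tagHashCode) {
        this.commitLogOffset = commitLogOffset;
        this.messageSize = messageSize;
        this.tagHashCode = tagHashCode;
    }

    /** Serialises the entry into exactly UNIT_SIZE bytes. */
    byte[] encode() {
        return ByteBuffer.allocate(UNIT_SIZE)
                .putLong(commitLogOffset)
                .putInt(messageSize)
                .putLong(tagHashCode)
                .array();
    }

    /** Because entries are fixed-size, the n-th entry is found by simple multiplication. */
    static ConsumeQueueEntry decode(byte[] queueFile, long logicalIndex) {
        int position = Math.toIntExact(logicalIndex * UNIT_SIZE);
        ByteBuffer buffer = ByteBuffer.wrap(queueFile, position, UNIT_SIZE);
        return new ConsumeQueueEntry(buffer.getLong(), buffer.getInt(), buffer.getLong());
    }
}
```

        The fixed entry size is what makes the ConsumeQueue cheap to use: a consumer's logical position translates directly into a byte position in the index file, and only then is one random read made against the CommitLog.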

I read a very well-written article about message storage (address) and recommend it for further study.

High availability mechanism

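        A RocketMQ distributed cluster achieves high availability through the cooperation of masters and slaves. First, the difference between the two: in the broker configuration file, a brokerId of 0 means the broker is a master and a value greater than 0 means it is a slave. The broker's brokerRole parameter (for example brokerRole: SLAVE) also states whether it is a master or a slave. The master role supports reads and writes, while the slave role supports reads only. In other words, producers can only connect to master brokers to write messages, whereas consumers can connect to either master or slave brokers to read messages.

        The consumer's configuration does not need to say whether to read from the master or from a slave. When the master is unavailable or busy, the consumer is automatically switched to read from a slave. Thanks to this automatic switch, when the master machine fails the consumer can still read messages from the slave, and the consumer program is not affected. This is how high availability is achieved on the consumer side.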

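        How is high availability achieved on the producer side? When a topic is created, its message queues are spread across multiple broker groups (a broker group consists of machines with the same brokerName but different brokerIds). When the master of one broker group becomes unavailable, the masters of the other groups are still available and producers can keep sending messages. In this master/slave deployment, RocketMQ does not support Redis-style election, that is, automatically promoting a slave to master. If machine resources are short and a slave has to become a master, you must manually stop the slave broker, change its configuration file, and restart it.

Synchronous flush and asynchronous flush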
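        The producer-side idea can be sketched as a round-robin queue selection that skips the broker group whose master just failed. This is a simplified illustration under my own assumptions, not the RocketMQ client's implementation; QueueSelector, MessageQueue, and selectOne are hypothetical names.

```java
import java.util.List;

/** Hypothetical round-robin selector over the queues of one topic. */
class QueueSelector {
    /** A queue is identified by the broker group it lives on and its id within that group. */
    static final class MessageQueue {
        final String brokerName;
        final int queueId;

        MessageQueue(String brokerName, int queueId) {
            this.brokerName = brokerName;
            this.queueId = queueId;
        }
    }

    private final List<MessageQueue> queues;
    private int next = 0;

    QueueSelector(List<MessageQueue> queues) {
        this.queues = queues;
    }

    /** Picks the next queue, avoiding the broker group that failed last time (null means no failure yet). */
    MessageQueue selectOne(String lastFailedBrokerName) {
        for (int i = 0; i < queues.size(); i++) {
            MessageQueue candidate = nextInRoundRobin();
            if (!candidate.brokerName.equals(lastFailedBrokerName)) {
                return candidate;
            }
        }
        // Every queue belongs to the failed group: fall back to plain round robin.
        return nextInRoundRobin();
    }

    private MessageQueue nextInRoundRobin() {
        MessageQueue queue = queues.get(next);
        next = (next + 1) % queues.size();
        return queue;
    }
}
```

        With queues on two groups, a send that failed on one group is retried on a queue of the other group, which is why spreading a topic's queues over several broker groups keeps producers working when one master goes down.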

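        RocketMQ stores messages on disk, which both allows recovery after a power failure and lets message storage exceed the limits of memory. To improve performance, RocketMQ writes sequentially wherever possible. When a producer sends a message to RocketMQ, there are two ways of writing it to disk.

  • Asynchronous flush: when the write returns success, the message may only have been written to the in-memory page cache. The write returns quickly and throughput is high. Once enough messages have accumulated in memory, a disk write is triggered for all of them at once.
  • Synchronous flush: when the write returns success, the message has already been written to disk. The message is first written to the in-memory page cache, the flush thread is notified immediately, and the caller waits for the flush to finish. When the flush thread is done, it wakes up the waiting thread, which returns the success status.

Whether synchronous or asynchronous flush is used is set by the flushDiskType parameter in the configuration file, which takes one of the two values SYNC_FLUSH and ASYNC_FLUSH.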
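        The difference between the two modes can be sketched with FileChannel.force, which asks the operating system to write cached file data to the storage device. This is a single-threaded simplification under my own assumptions: the class FlushDemo and its methods are hypothetical, and the broker's dedicated flush thread and wait/notify handshake are left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical append-only log showing when data is forced to disk in each mode. */
class FlushDemo {
    enum FlushDiskType { SYNC_FLUSH, ASYNC_FLUSH }

    private final FileChannel channel = openTempLog();
    private final FlushDiskType flushDiskType;
    private int pendingWrites = 0; // writes that so far only reached the page cache

    FlushDemo(FlushDiskType flushDiskType) {
        this.flushDiskType = flushDiskType;
    }

    private static FileChannel openTempLog() {
        try {
            Path file = Files.createTempFile("flush-demo", ".log");
            file.toFile().deleteOnExit();
            return FileChannel.open(file, StandardOpenOption.WRITE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Appends one message; with SYNC_FLUSH it returns only after the data has been forced to disk. */
    void append(String message) {
        ByteBuffer buffer = ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8));
        try {
            while (buffer.hasRemaining()) {
                channel.write(buffer); // lands in the OS page cache first
            }
            if (flushDiskType == FlushDiskType.SYNC_FLUSH) {
                channel.force(false);  // synchronous flush: wait for the disk before reporting success
            } else {
                pendingWrites++;       // asynchronous flush: report success now, flush later in a batch
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Batch flush for the asynchronous mode. */
    void flush() {
        try {
            channel.force(false);
            pendingWrites = 0;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int pendingWrites() {
        return pendingWrites;
    }
}
```

        The pendingWrites counter makes the trade-off visible: in asynchronous mode those are exactly the messages that could be lost if the machine lost power before the next batch flush.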

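Synchronous replication and asynchronous replication

        If a broker group has a master and a slave, messages need to be replicated from the master to the slave. There are two ways to do this: synchronous replication and asynchronous replication. With synchronous replication, success is reported to the client only after both the master and the slave have written the message. With asynchronous replication, success is reported to the client as soon as the master has written it.

        Each approach has its strengths and weaknesses. With asynchronous replication the system has lower latency and higher throughput, but if the master fails, data that has not yet been written to the slave may be lost. With synchronous replication, if the master fails, the slave holds a complete backup and recovery is easy, but synchronous replication increases write latency and reduces system throughput.

        Synchronous or asynchronous replication is set through the brokerRole parameter in the broker configuration file, which takes one of three values: ASYNC_MASTER, SYNC_MASTER, or SLAVE (the backup machine). In practice, the flush mode and the replication mode should be chosen according to the business scenario. The SYNC_FLUSH flush mode in particular triggers disk writes frequently and noticeably reduces performance. Configuring SYNC_MASTER replication between master and slave ensures that messages are not lost: even if one machine fails, the data can still be consumed.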

Message send results

 

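  1. SEND_OK: the message was sent successfully.
  2. FLUSH_DISK_TIMEOUT: if the broker sets FlushDiskType = SYNC_FLUSH in MessageStoreConfig (the default is ASYNC_FLUSH) and does not finish flushing to disk within MessageStoreConfig's syncFlushTimeout (5 seconds by default), you get this status. It only occurs with synchronous flush (asynchronous flush returns success as soon as the message is in memory). The message has reached the master broker, but if that server crashes now, the message is lost.

  3. FLUSH_SLAVE_TIMEOUT: if the broker's role is SYNC_MASTER (the default is ASYNC_MASTER) and the slave broker does not finish synchronizing with the master within MessageStoreConfig's syncFlushTimeout (5 seconds by default), you get this status. It only occurs with synchronous replication. If the master crashes at this point, the message is also lost.

  4. SLAVE_NOT_AVAILABLE: if the broker's role is SYNC_MASTER, i.e. synchronous replication (the default is ASYNC_MASTER), but no slave broker is configured or the slave broker is unavailable, you get this status. If the server crashes at this point, the message is lost (master/slave failover is not available).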

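        In production, the usual practice is that as long as sending did not throw an exception, these statuses are not handled, because the message has already been sent successfully. Whether it has been synchronized to the slave or persisted to disk is not something we need to worry about much. If you do want to act on the status, for example by treating anything other than SEND_OK as a failure and sending the same message again, keep in mind that the message actually already exists on the broker. If you resend, just make sure the consumer side is idempotent.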
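        A minimal sketch of that advice follows. The SendStatus enum here is a local stand-in for the client's enum (the fourth constant, SLAVE_NOT_AVAILABLE, is the name the RocketMQ client uses for the last case above), and the class SendResultHandling, the business key, and the in-memory set are all hypothetical; a real consumer would usually keep processed keys in a database or cache.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class SendResultHandling {
    /** Local stand-in for the four send statuses, so the sketch compiles without the RocketMQ client. */
    enum SendStatus { SEND_OK, FLUSH_DISK_TIMEOUT, FLUSH_SLAVE_TIMEOUT, SLAVE_NOT_AVAILABLE }

    /** Strict policy: resend on anything other than SEND_OK. This can create duplicates on the broker. */
    static boolean shouldResend(SendStatus status) {
        return status != SendStatus.SEND_OK;
    }

    // Business keys of messages that have already been handled (in memory only for this sketch).
    private final Set<String> processedKeys = ConcurrentHashMap.newKeySet();

    /** Runs the business logic at most once per key; returns false when the message was a duplicate. */
    boolean consumeOnce(String businessKey, Runnable businessLogic) {
        if (!processedKeys.add(businessKey)) {
            return false; // already handled: a resend produced a second copy of the same message
        }
        businessLogic.run();
        return true;
    }
}
```

        The key is recorded before the business logic runs, so a second delivery of the same key is simply ignored; whether a failed run should release the key again depends on the business and is left out here.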

 

 

 

 

 

 

 


Origin blog.csdn.net/qq_40826106/article/details/103934930