A Reference Design for MQTT-Based Systems

Author: arctic
link: https: //
Source: Zhihu
Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, cite the source.

Looking back on my work experience, my biggest regret is leaving in a hurry before the system I designed was fully implemented in code. I am writing this article mainly to share my ideas on implementing a communication service, to help others design their own, and to have the weak points of the design corrected through practice. The company works on electric vehicle charging, which can fairly be called a large Internet of Things (IoT) project: an EVCS (electric vehicle charging system) comprising the app, the cloud platform, charging piles, electric vehicles, and other parts. Among the many cloud platform services, the communication service is the middleware that connects the embedded gateways and coordinates them with the back-end business services. Drawing on my own experience, this article covers the implementation details of the communication service, including what was done in practice and some reflections on the system's defects. The electric vehicle charging system is only the example here; the design can serve as a reference for any system based on the MQTT protocol.

Terminology

Embedded gateway: generally consists of four parts: an embedded microprocessor, peripheral hardware devices, an embedded operating system, and user applications. In this system it is responsible for switching the relays and for network communication with the server.

Charging device (charging pile): equipment that charges an electric vehicle. It connects to the car through a charging gun and contains an embedded gateway.

comm: the program we had to write ourselves to extend the broker; short for "communication".

Communication service: the software service responsible for communicating with the embedded gateways; it consists of the broker and comm.

Requirements analysis

The first requirement: a communication protocol between the communication service and the embedded gateway

Communication is ubiquitous and critical in the IoT; both short-range wireless technology and mobile communication technology shape its development. Within communication, the protocol is especially important: it is the set of rules and conventions that two entities must follow to communicate or provide a service. Commonly used IoT protocols include MQTT, DDS, AMQP, XMPP, JMS, REST, and CoAP. All of them are widely used, each has at least 10 implementations, and all claim to support real-time publish/subscribe for the IoT. When designing a concrete system architecture, however, you still need to consider the communication needs of the actual scenario and select the appropriate protocol.

The second requirement: high availability with thousands of long-lived socket connections

How many charging piles the market will need is hard to predict, so the communication service should be designed from the start to scale horizontally. From the government's point of view, charging is a public-welfare project and cannot be failing every day. A highly available communication service is the foundation of an IoT system such as electric vehicle charging.

The third requirement: encrypted data transmission

Data security is left for the end of the discussion, because how it is implemented depends on the system.

The fourth requirement: real-time control and real-time monitoring

An electric vehicle charging system is a real-time interactive system; as with browsing a web page, a long wait is unbearable for users. Apart from business processing time, transmission time should be as short as possible. In actual tests, a message sent from the embedded gateway took around 200 ms to reach the communication server (using a 3G router).

The fifth requirement: communication service upgrades must not affect large numbers of users for long

An upgrade or outage of the communication service affects use of the whole system, so the design should address how to detect problems quickly and restore service. Suppose all the requirements above are met; what have we overlooked? At this point the testers spoke up: with so many devices attached to one communication service, a faulty upgrade would affect a great many of them. (At the time, releases were the testers' job. Every release happened in the middle of the night, rollbacks were frequent, and "walking on thin ice" describes it well.) Is this pain point not exactly why we should implement gray (canary) release? After an upgrade, only a small number of devices connect to the upgraded server; once it is confirmed to be problem-free, the upgrade is rolled out fully.

The sixth requirement: prevent the avalanche effect

The avalanche effect occurs when one service hangs or goes down, its callers then fail in turn, and eventually the whole system becomes unusable. In the detailed design of the communication service, I focus on how it prevents an avalanche from occurring.

To meet these requirements I designed the communication service as follows. The broker and comm are deployed on the same server in a one-to-one relationship.


In the diagram above, the parts of the communication service that programmers must write are the listener and the comm program. The business services are the core of the charging system and handle the metering and billing logic. They are very complex, but from the communication service's point of view they mainly process the data uploaded by embedded gateways and issue control commands. Different systems handle different business, of course, but the communication service is guaranteed to supply them with the underlying data.

Without further ado, here is how the communication service is implemented and why it is designed this way. After comparing the options, I chose MQTT as the communication protocol; in an IoT system, the architecture is designed after the protocol has been decided. Once MQTT is chosen, the next question is which broker to use. The brokers that implement the MQTT protocol are shown below, along with a simple usage survey I ran in a chat group.


mosquitto provides only bridging for scaling out, and enabling persistence is not recommended because the disk I/O reduces broker performance. Turning off logging, in turn, makes problems impossible to investigate. It is widely used, but overall it does not feel powerful enough. emqttd does support distributed deployment, is a Chinese-developed product, and has rich development documentation, but it cannot flexibly meet the gray release requirement. Because no current open-source broker meets every actual requirement, we arrived at the system architecture above.

Communication modes used in the communication service:

One-way: the sender sends the data and does not wait for the receiver to return an ACK;
Request-response: the sender and receiver interact through a synchronous call;
Two-way: after the sender transmits data, the receiver returns an ACK within a predetermined time, but the whole process is asynchronous.
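The three modes can be contrasted with a small sketch. This is only an illustration under assumptions: the `receiver` coroutine, the message strings, and the use of `asyncio` are hypothetical stand-ins and not part of the system described here.

```python
import asyncio


async def receiver(message: str) -> str:
    """Hypothetical receiver: handles a message and returns an ACK string."""
    await asyncio.sleep(0.01)  # simulated network and processing delay
    return f"ack:{message}"


def one_way(message: str) -> asyncio.Task:
    # One-way: start delivery and never look at the ACK.
    return asyncio.create_task(receiver(message))


async def request_response(message: str) -> str:
    # Request-response: the caller is suspended until the reply arrives.
    return await receiver(message)


def two_way(message: str, on_ack) -> asyncio.Task:
    # Two-way: return immediately; the ACK is handed to a callback later.
    task = asyncio.create_task(receiver(message))
    task.add_done_callback(lambda t: on_ack(t.result()))
    return task


async def demo() -> list:
    log = []
    t1 = one_way("status-report")
    log.append(await request_response("start-charging"))
    t2 = two_way("stop-charging", log.append)
    log.append("sender continues")  # runs before the two-way ACK arrives
    await asyncio.gather(t1, t2)    # keep the loop alive until both finish
    return log
```

Running `asyncio.run(demo())` shows that the request-response reply is available before the sender moves on, while the two-way ACK is logged only after the sender has continued.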

Detailed design:

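When an embedded gateway connects to the communication service (including on reconnect), it first makes a synchronous request to the listener for a URL, and only then communicates with the specific broker. Once connected to the broker, the gateway no longer talks to the listener. The step <1. Obtain the IP address> can use HTTP REST, borrowing the idea of HTTPDNS; the step <2. Data transmission> uses the MQTT protocol. The listener dynamically assigns a communication server IP to the device based on the device ID, the protocol version, and the importance of the device, and the device establishes a long-lived connection to the broker at the assigned address. The request and response formats that the device exchanges with the listener are listed below: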
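As an illustration only, the exchange with the listener might look like the following sketch. The field names (`deviceId`, `version`, `importance`, `broker`), the broker addresses, and the canary rule are all assumptions made for this example; they are not the author's actual format.

```python
import json
import zlib
from dataclasses import dataclass

# Hypothetical broker pools; the addresses are placeholders.
STABLE_BROKERS = ["10.0.0.1:1883", "10.0.0.2:1883"]
CANARY_BROKERS = ["10.0.0.9:1883"]   # servers running the upgraded service
CANARY_VERSIONS = {"2.1"}            # protocol versions allowed onto them


@dataclass
class AllocationRequest:
    device_id: str
    protocol_version: str
    importance: int  # assumed scale: higher means more critical


def allocate(req: AllocationRequest) -> dict:
    """Pick a broker address from device ID, protocol version and importance."""
    # Assumed gray-release rule: only low-importance devices on a canary
    # protocol version are sent to the upgraded servers.
    use_canary = req.protocol_version in CANARY_VERSIONS and req.importance <= 1
    pool = CANARY_BROKERS if use_canary else STABLE_BROKERS
    # A stable hash keeps the same device on the same broker across requests.
    index = zlib.crc32(req.device_id.encode("utf-8")) % len(pool)
    return {"deviceId": req.device_id, "broker": pool[index]}


def handle_request(body: str) -> str:
    """Decode a JSON request from the gateway and encode the JSON response."""
    data = json.loads(body)
    req = AllocationRequest(data["deviceId"], data["version"], data["importance"])
    return json.dumps(allocate(req))
```

A gateway would send something like `{"deviceId": "pile-001", "version": "2.1", "importance": 0}` and then open its MQTT connection to the `broker` address in the reply.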
















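As the diagram above shows, there is a heartbeat between the listener and comm. If comm goes down, the listener must mark the service at that address as unavailable, and it can hand the broker's address to embedded gateways again only after comm recovers. On startup, comm registers its address with the listener. After successful registration, the listener actively polls comm as the heartbeat, which reduces the implementation complexity of comm. The heartbeat payload can be the number of sockets connected to the service or a load index for the service; with this information the listener can route better.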
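The registration and polling logic can be sketched as follows. This is a minimal sketch under assumptions: the `CommRegistry` class, the `probe` callable, and the choice of "fewest sockets" as the routing rule are hypothetical, and a real listener would poll over the network on a timer.

```python
from typing import Callable, Dict, Optional


class CommRegistry:
    """Listener-side view of the registered comm instances (hypothetical)."""

    def __init__(self) -> None:
        # address -> {"available": bool, "sockets": int}
        self._nodes: Dict[str, dict] = {}

    def register(self, address: str) -> None:
        # Called when a comm instance starts up and announces itself.
        self._nodes[address] = {"available": True, "sockets": 0}

    def poll(self, probe: Callable[[str], int]) -> None:
        """Heartbeat round: the listener asks each comm for its socket count."""
        for address, node in self._nodes.items():
            try:
                node["sockets"] = probe(address)
                node["available"] = True
            except ConnectionError:
                # No answer: stop handing this address out until it recovers.
                node["available"] = False

    def pick(self) -> Optional[str]:
        """Return the available address with the fewest sockets, if any."""
        live = [(n["sockets"], a) for a, n in self._nodes.items() if n["available"]]
        return min(live)[1] if live else None
```

Because the listener drives the polling, comm only needs to answer a request, which matches the goal of keeping comm simple.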





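To display an embedded gateway's network status, the gateway publishes an online message when it connects to the broker, indicating that it is online and can receive data. When the gateway actively closes the connection, it must also send an offline message. For abnormal disconnects, the MQTT protocol provides the Last Will, which lets the broker publish the offline status message on the gateway's behalf. Note that the broker sends the Last Will after 1.5 heartbeat (keep-alive) periods, so the device's reconnect interval should preferably be greater than 2 heartbeat periods. This ensures the broker will not send the Last Will after the device is back online, so that the gateway's status sequence is "online -> offline -> online -> offline". If the reconnect interval is less than 1.5 heartbeat periods, the sequence "online -> online -> offline" can occur, leaving the platform's status inconsistent with the actual network status.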
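The timing argument can be checked with a small simulation. This is a sketch of the ordering described above, not of a real broker: the function names and the simplified three-event timeline are assumptions made for illustration.

```python
def broker_timeout(keepalive_s: float) -> float:
    """Time after the last packet at which the broker publishes the Last Will."""
    return 1.5 * keepalive_s


def recommended_reconnect_delay(keepalive_s: float) -> float:
    """Reconnect delay suggested in the text: more than 2 keep-alive periods."""
    return 2.0 * keepalive_s


def status_sequence(keepalive_s: float, last_packet_at: float,
                    reconnect_at: float) -> list:
    """Order in which the platform sees status messages (times in seconds).

    The gateway goes online at t=0, sends its last packet at last_packet_at,
    drops off abnormally, and publishes "online" again at reconnect_at.
    """
    will_at = last_packet_at + broker_timeout(keepalive_s)
    events = [
        (0.0, "online"),           # first connection
        (will_at, "offline"),      # Last Will published by the broker
        (reconnect_at, "online"),  # gateway reconnects
    ]
    return [status for _, status in sorted(events)]
```

With a 60 s keep-alive and a last packet at t=100, reconnecting at t=230 yields online, offline, online, while reconnecting at t=150 yields online, online, offline, which is the inconsistent case.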

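The message queue decouples the communication service from the back-end services. Besides buffering the data uploaded by embedded gateways, it also carries the commands issued by the back-end business services. You may wonder at this point: if the back end sends commands to Kafka and the consumers are multiple comm instances, how does it know which broker the device is connected to and which comm should receive the command? As noted earlier, a device sends an online message to the broker when it comes online. At that moment the comm program can register the embedded gateway ID together with a fixed topic in the Redis cache (each comm program has one fixed, unique Kafka topic for receiving commands). Before sending a command, the back-end business service looks up the Kafka topic for the device ID in Redis and then publishes to that topic in Kafka. The comm program subscribed to the topic receives the message and forwards it to the embedded gateway through the broker.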
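The lookup-then-publish flow can be sketched with in-memory stand-ins. In this sketch a plain dict plays the role of the Redis cache and a dict of lists plays the role of Kafka; the function names and the topic name are hypothetical, and no real Redis or Kafka client is involved.

```python
from collections import defaultdict

redis_cache = {}                  # stand-in for Redis: device ID -> command topic
kafka_topics = defaultdict(list)  # stand-in for Kafka: topic -> queued messages


def on_device_online(device_id: str, comm_topic: str) -> None:
    # comm side: record which comm (by its unique topic) serves this device.
    redis_cache[device_id] = comm_topic


def on_device_offline(device_id: str) -> None:
    # comm side: drop the mapping when the device goes offline.
    redis_cache.pop(device_id, None)


def send_command(device_id: str, command: str) -> bool:
    """Back-end side: look up the topic first, then publish the command."""
    topic = redis_cache.get(device_id)
    if topic is None:
        return False  # no mapping: the device is offline, nothing to send
    kafka_topics[topic].append({"deviceId": device_id, "command": command})
    return True
```

For example, after `on_device_online("pile-001", "comm-1-commands")`, a call to `send_command("pile-001", "start")` queues the command on that one topic, so only the comm instance holding the device's connection consumes it.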


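The comm program depends on the broker, the Redis cache, and the Kafka message queue. A robust comm program should ensure that when Redis is unavailable, only real-time data updates are affected and not the data uploaded through Kafka; likewise, an unavailable Kafka should not affect cache updates. The simplest way to prevent an avalanche is thread pool isolation: each dependency gets its own sending or receiving thread pool. When a callee is unavailable, the corresponding comm thread pool blocks, but the other thread pools keep working normally. For the concrete implementation we might use Hystrix. I will not go into detail on Hystrix here, but it is indispensable if you want the communication service to recover from an unavailable state.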
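Thread pool isolation can be illustrated without Hystrix. The sketch below uses Python's `concurrent.futures` instead of Hystrix (a Java library), and the functions `update_cache` and `upload_to_kafka` are hypothetical stand-ins for the real Redis and Kafka calls; it only shows that a saturated pool for one dependency does not stall the other.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# One pool per dependency, so a hanging dependency exhausts only its own pool.
redis_pool = ThreadPoolExecutor(max_workers=2, thread_name_prefix="redis")
kafka_pool = ThreadPoolExecutor(max_workers=2, thread_name_prefix="kafka")


def update_cache(device_id: str, recovered: threading.Event) -> str:
    # Simulates a Redis call that hangs until the service recovers.
    recovered.wait(timeout=5)
    return f"cached:{device_id}"


def upload_to_kafka(payload: str) -> str:
    # Simulates a healthy Kafka call.
    return f"queued:{payload}"


def demo() -> tuple:
    recovered = threading.Event()
    # Saturate the Redis pool: two calls hang, two more wait in its queue.
    stuck = [redis_pool.submit(update_cache, f"dev-{i}", recovered)
             for i in range(4)]
    # The Kafka pool is unaffected and answers right away.
    result = kafka_pool.submit(upload_to_kafka, "meter-reading").result(timeout=1)
    still_blocked = not any(f.done() for f in stuck)
    recovered.set()  # "Redis" comes back; the stuck calls drain
    return result, still_blocked
```

Calling `demo()` returns the Kafka result while every Redis call is still pending. With a single shared pool of the same size, the Kafka upload would have queued behind the hanging Redis calls.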


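In my experience, with 8000+ socket connections there are roughly 50 disconnect-and-reconnect events per minute; if one broker goes down, the number of concurrent requests may rise to several thousand. Because the embedded gateways have a strategy for coping with listener downtime, and the listener loads its policies into memory at initialization and rarely performs database I/O at run time, a single listener instance is basically sufficient. If that is not reassuring enough, a primary/replica setup can be used.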


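The QoS (quality of service) defined in the MQTT protocol only covers delivery of data to the broker; it cannot guarantee that the consumer (the comm program) does not lose data. To ensure that data reaches the back-end business services safely, a business-layer ACK mechanism must be added when the business flow is designed. The back-end business service issues this acknowledgment; if the embedded gateway does not receive the ACK message within the specified time, it must resend.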
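The gateway-side resend loop can be sketched as follows. This is a minimal sketch under assumptions: the function name, the `send` and `wait_for_ack` callables, the 5-second timeout, and the limit of three attempts are hypothetical choices, not values from the system described here.

```python
from typing import Callable


def send_with_business_ack(send: Callable[[bytes], None],
                           wait_for_ack: Callable[[float], bool],
                           payload: bytes,
                           timeout_s: float = 5.0,
                           max_attempts: int = 3) -> int:
    """Send payload until the back end acknowledges it.

    Returns the attempt number that was acknowledged, or 0 if every
    attempt timed out. wait_for_ack(timeout_s) is expected to block for
    at most timeout_s seconds and return True if the ACK arrived.
    """
    for attempt in range(1, max_attempts + 1):
        send(payload)                # publish over MQTT (any QoS level)
        if wait_for_ack(timeout_s):  # business-layer ACK from the back end
            return attempt
    return 0
```

Because a resend can arrive after the original was already processed, the back end would normally need to treat duplicates safely, for example by carrying a message ID in the payload; that detail is left out of this sketch.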











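2. The Kafka topic on which each comm receives commands is unique, which guarantees that exactly one comm receives a command sent by the back end. If the back end cannot find a topic for the device ID in the cache, the device is offline and commands cannot be sent to the embedded gateway.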

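3. Kafka message queues are very flexible. If the back-end business services consume with the same consumer group ID, only one consumer receives each message and the others do not get duplicates; if they use different group IDs, Kafka behaves like a broadcast and every consumer receives the same message.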
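The difference between the two consumption modes in item 3 can be shown with a toy model. This sketch is a deliberate simplification: it ignores partitions and hands messages to group members round-robin, whereas real Kafka assigns whole partitions to the consumers in a group. The `deliver` function and the consumer names are hypothetical.

```python
from collections import defaultdict


def deliver(messages: list, consumers: list) -> dict:
    """Simulate delivery to consumers given as (name, group_id) pairs.

    Each message goes to exactly one member of every group.
    """
    groups = defaultdict(list)
    for name, group_id in consumers:
        groups[group_id].append(name)

    received = defaultdict(list)
    for i, message in enumerate(messages):
        for members in groups.values():
            # Round-robin inside the group: one member gets each message.
            received[members[i % len(members)]].append(message)
    return dict(received)
```

With `[("svc-1", "billing"), ("svc-2", "billing")]` the two consumers split the messages between them; with `[("svc-1", "billing"), ("svc-2", "audit")]` each consumer receives every message.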


