Building a Heat Map with MQTT and Function Compute in Practice

Preface

In recent years, robots have become a common sight in shopping malls, libraries, airports, and ports, where they are best known for guiding customers. Some pioneering companies also use these robots to collect characteristic data about such densely populated areas. The reported data is quickly cleaned and processed to support meaningful countermeasures, guidance, or commercial decisions such as where to place information (advertising).

One of the main scenarios is computing a heat map of an area and opening it for queries to specific systems (access for end users is also under consideration).

The robots are deployed on demand in different time periods. They report either when the collected data changes significantly or at a fixed interval. When the data changes sharply, reports become frequent, and the demand for downstream data cleaning and processing rises with them.
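The reporting rule described above (report on a significant change or after a fixed period) could be sketched on the robot side roughly as follows. This is only an illustration: the function name, the one-minute interval, and the 20% change threshold are assumptions, not values from the original solution.

```javascript
// Hypothetical thresholds; tune them for the real deployment.
const MAX_SILENCE_MS = 60000; // report at least once per minute
const CHANGE_RATIO = 0.2;     // report when the count moves by 20% or more

/**
 * Decide whether a robot should report its current head count.
 * lastReport is { count, timestamp } of the previous report, or null.
 */
function shouldReport(lastReport, currentCount, now) {
  if (lastReport === null) return true; // nothing has been sent yet
  if (now - lastReport.timestamp >= MAX_SILENCE_MS) return true; // fixed period elapsed
  const baseline = Math.max(lastReport.count, 1); // avoid division by zero
  const change = Math.abs(currentCount - lastReport.count) / baseline;
  return change >= CHANGE_RATIO; // significant change
}
```

With such a rule, a quiet area produces roughly one report per interval, while a sudden crowd produces a burst of reports, which is exactly the load pattern the downstream cleaning step has to absorb.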

In this article, we discuss how to choose technologies that fit this kind of scenario.

Scenario characteristics and requirements:

1. Data channel connection capability: As the business grows, the number of deployed robots, and with it the number of data channel connections, will increase. The data channel must be flexible enough to scale on demand and must support 100,000+ connections.

2. Concise data cleaning capability: Processing here is essentially summarizing and counting the data, and the logic is not complicated. To handle the peaks and valleys in data volume, a simple and effective way to scale out and in accordingly is enough. Introducing a complex, cumbersome traditional big data solution for cleaning is undesirable.

3. Flexible data access capability: The heat map information is expected to be opened to end users in the future. Access volume changes dynamically and can swing unpredictably with the time of day, holidays, emergencies, and so on, so the access layer must be elastic. The business side does not want to handle this through rate limiting, because that would reduce business volume.

4. Excellent storage capability: In this scenario both writes and reads are highly concurrent, and the customer wants to use NoSQL storage, ideally a NoSQL store that supports sorting. The solution in this article uses Redis; storage options are not analyzed further.
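To make requirements 2 and 4 concrete, here is a minimal sketch of what "summary and statistics" could look like: reports are grouped into grid cells, and the totals are turned into Redis sorted-set updates so that cells can later be read back in sorted order. The grid size, the report fields (`x`, `y`, `count`), the cell id format, and the key name are all assumptions made for illustration. Only `ZINCRBY` itself is a real Redis command.

```javascript
const CELL_SIZE_M = 5; // hypothetical grid resolution in metres

// Map a position (metres from an area origin) to a grid cell id such as "3:7".
function cellId(x, y) {
  return Math.floor(x / CELL_SIZE_M) + ':' + Math.floor(y / CELL_SIZE_M);
}

// Sum head counts per cell for a batch of reports: [{ x, y, count }, ...].
function aggregate(reports) {
  const totals = new Map();
  for (const r of reports) {
    const id = cellId(r.x, r.y);
    totals.set(id, (totals.get(id) || 0) + r.count);
  }
  return totals;
}

// Turn the totals into Redis sorted-set commands (ZINCRBY key increment member).
function toRedisCommands(key, totals) {
  return Array.from(totals, ([id, n]) => ['ZINCRBY', key, String(n), id]);
}
```

Because the logic is this small, it fits comfortably into a single function invocation, which is one reason a heavyweight stream-processing cluster is not needed here.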

Analysis of alternative technical solutions

Data channel connectivity

Self-built Kafka

Advantages:

As a general-purpose data collection channel, Kafka is widely used and offers diverse access methods. Its community is mature and the learning cost is low.
Kafka itself is easy to set up, and its integration with downstream big data processing products is mature.

Disadvantages:

Scaling Kafka dynamically is complicated.
A stable supporting solution must be built for the additional processing clusters.
Managing public network traffic requires an additional solution.
Kafka is mainly used to collect data from connected applications; its ability to handle connections from terminal devices at this scale has not been verified by real cases.

Message Queue for MQTT solution

Advantages:

It supports millions of connections, which covers the needs of business growth and leaves ample room for expansion.
The MQTT protocol is very concise, which is an advantage for transmission between terminals and the service, and it supports multiple QoS levels for message delivery.
Clients can be implemented in a wide range of programming languages.
Client connection status can be observed in real time, which makes abnormal conditions easier to detect.

Disadvantages:

Its integration with big data processing is less mature than Kafka's, and the choice of downstream products is somewhat limited.
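As an illustration of the QoS levels mentioned among the advantages, the sketch below builds the topic, payload, and options for one robot report. The three QoS values (0, 1, 2) come from the MQTT specification; the topic layout, the payload fields, and the helper name are hypothetical. With a client library such as MQTT.js, the result would typically be passed to `client.publish(topic, payload, options)`.

```javascript
// QoS levels defined by the MQTT specification.
const QOS = { AT_MOST_ONCE: 0, AT_LEAST_ONCE: 1, EXACTLY_ONCE: 2 };

// Build the arguments for one publish call.
// The topic layout and payload fields are assumptions for this sketch.
function buildReport(areaId, robotId, sample) {
  return {
    topic: `heatmap/${areaId}/${robotId}`,
    payload: JSON.stringify({
      robotId: robotId,
      x: sample.x,
      y: sample.y,
      count: sample.count,
      ts: sample.ts,
    }),
    options: { qos: QOS.AT_LEAST_ONCE },
  };
}
```

QoS 1 is chosen here as a middle ground: reports are not silently lost, but the same message may arrive more than once, so the consumer may need to tolerate or filter duplicates.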

Flexible data cleaning ability

Big data solutions (Storm, Spark, Flink, etc.)

Advantages:

These are general-purpose open source solutions that are mature and well documented.

Disadvantages:

Building, operating, and maintaining them is complex, and additional monitoring and recovery mechanisms must be provided.
The team has to learn the various components involved (the following figure uses Storm as an example).
Resource usage must be estimated in advance, and capacity cannot scale out and in with the real-time data volume.

Function Compute solution

Advantages:

It scales out and in on demand within hundreds of milliseconds, which suits the peaks and valleys in data volume.
There is no need to manage the cleaning environment.
The concept is simple and the learning cost is low.
Refer to the figure below for other advantages:

Disadvantages:

Function Compute is a product offered by cloud vendors, so the workload must run on the cloud.

Flexible data access capabilities

Traditional application solution

Advantages:

As part of the business, it is embedded in an existing application; the technology is mature and the learning cost is low.

Disadvantages:

Scaling out and in with the volume of business requests has to be implemented in-house, or, as often happens, resources are over-provisioned based on estimates.

API Gateway + Function Compute solution

Advantages:

It scales elastically in real time with the volume of customer requests. Resources are used on demand, peak hours are not a concern, and idle capacity is not billed.
It comes with a professional access monitoring dashboard out of the box.

Disadvantages:

It requires a small learning cost.

Summary

For this heat map data collection, filtering, and access business, you can refer to the solution shown in the figure below.

Key access steps

Connecting MQTT to Function Compute

Please refer to the Function Compute integration solution for the Message Queue for MQTT service.
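As a rough idea of what the cleaning function on the receiving side might look like, the sketch below parses one report and increments the counter of its grid cell. It is not the integration described in the referenced document: the shape of the incoming event (assumed here to be the robot's JSON payload with `areaId`, `cell`, and `count`), the `heatmap:` key prefix, and the `store.incrBy` wrapper are all hypothetical. In a real deployment `store` would wrap a Redis client and issue `ZINCRBY`.

```javascript
// `store` is a hypothetical wrapper around a Redis client; it only needs
// incrBy(key, member, amount), which may return a value or a promise.
function createHandler(store) {
  return function handler(event, context, callback) {
    let report;
    try {
      // Assumption: the event carries the robot's JSON payload.
      report = JSON.parse(event.toString());
    } catch (err) {
      return callback(new Error('invalid report payload'));
    }
    if (!report || typeof report.cell !== 'string' || !Number.isFinite(report.count)) {
      return callback(new Error('report needs a cell id and a numeric count'));
    }
    const key = `heatmap:${report.areaId}`;
    // Wrapping the call in a promise also catches synchronous errors from the store.
    new Promise((resolve) => resolve(store.incrBy(key, report.cell, report.count)))
      .then(
        (total) => callback(null, { cell: report.cell, total: total }),
        (err) => callback(err)
      );
  };
}
```

Injecting the store keeps the handler easy to test with an in-memory stand-in, and each invocation stays stateless, which is what lets the platform scale instances out and in with the report volume.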

Retrieving data from Function Compute through API Gateway

For details, please refer to the API Gateway function trigger example.

Take Node.js as an example:

module.exports.handler = function(event, context, callback) {
  // Parse the incoming event, which carries the API Gateway request as JSON.
  var request = JSON.parse(event);
  var content = {
    path: request.path,
    method: request.method,
    headers: request.headers,
    queryParameters: request.queryParameters,
    pathParameters: request.pathParameters,
    body: request.body
  };
  // You can write your own logic here,
  // for example the logic that retrieves data from Redis.
  var response = {
    isBase64Encoded: false,
    statusCode: '200',
    headers: {
      'x-custom-header': 'header value'
    },
    body: content
  };
  callback(null, response);
};
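For the Redis retrieval step left open in the handler, one possible approach is to keep the heat map in a sorted set and read the hottest cells with `ZREVRANGE ... WITHSCORES`. The sketch below only builds the command and parses the flat reply (`[member, score, member, score, ...]`), so it runs without a Redis connection. The key prefix, the `"col:row"` cell id format, and both helper names are assumptions for illustration.

```javascript
// Build the command that reads the n hottest cells of an area.
// The key naming is hypothetical.
function topCellsCommand(areaId, n) {
  return ['ZREVRANGE', `heatmap:${areaId}`, '0', String(n - 1), 'WITHSCORES'];
}

// Convert a flat reply [member1, score1, member2, score2, ...]
// into heat map points. Cell ids like "3:7" are an assumed format.
function parseTopCells(reply) {
  const points = [];
  for (let i = 0; i + 1 < reply.length; i += 2) {
    const parts = reply[i].split(':').map(Number);
    points.push({ col: parts[0], row: parts[1], count: Number(reply[i + 1]) });
  }
  return points;
}
```

The parsed points could then be placed in the response body returned to API Gateway, giving callers the cells already ordered from hottest to coolest.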

Afterword

In today's DT (data technology) era, many kinds of devices report streams of data: sensors in new energy vehicles, bus location reporting, smart property access control, smart parking lot management, unmanned store sales, and so on. Processing reported data is ubiquitous across these scenarios, and all of them can be implemented and optimized by following the MQTT + FC + API Gateway approach in this solution.

Author: break loose, Alibaba Cloud solutions architect


This article is the original content of Alibaba Cloud and may not be reproduced without permission.