eKuiper 1.5.0 Released: Seamless Industrial Data Collection + Edge Stream Processing

The eKuiper project fills the gap in real-time computing at the edge and is increasingly used in the Industrial Internet of Things and the Internet of Vehicles. Based on extensive user feedback collected from GitHub, WeChat groups, forums, and other channels, we have continued to improve eKuiper's usability and reliability, and have recently released version 1.5.0.

The main highlights of this update are:

  • SQL improvements: SQL, eKuiper's core capability for defining data streams and analysis rules, gains more built-in functions, including change monitoring functions and the object_construct function, improving expressiveness;
  • Ecosystem connectivity: built-in Neuron connection support makes it easy to process data from the Neuron ecosystem, while a generic SQL plugin connects to a variety of traditional SQL databases, enabling a degree of stream processing over batch data;
  • Operations and documentation improvements: rule runtime stability is improved and on-demand compilation is supported; the documentation navigation has been restructured for a better reading and search experience.

Community website URL: https://ekuiper.org/zh

GitHub repository: https://github.com/lf-edge/ekuiper

Docker image address: https://hub.docker.com/r/lfedge/ekuiper

Ecosystem connectivity

Sources and sinks are how eKuiper connects to the data processing ecosystem. The new version adds two connection types, Neuron and SQL, and also improves existing connections, for example adding super table support to the TDengine sink.

Neuron integration

Neuron is an Industrial Internet of Things (IIoT) edge industrial protocol gateway initiated and open-sourced by EMQ, built to bring modern big data technologies to Industry 4.0. It provides one-stop access to multiple industrial protocols and converts them to the standard MQTT protocol for connecting to Industrial IoT platforms. Using Neuron and eKuiper together facilitates IIoT edge data collection and computation.

In previous versions, MQTT was required as a relay between Neuron and eKuiper: an additional MQTT broker had to be deployed, and users had to handle the data format themselves, decoding on input and encoding on output. Neuron 2.0 and eKuiper 1.5.0 are now seamlessly integrated. Users can access the data collected by Neuron in eKuiper without extra configuration and run computations on it, and they can easily control Neuron from eKuiper in return. The integration of the two products significantly reduces the deployment cost of edge computing solutions and lowers the barrier to use. Communicating over the NNG protocol via inter-process communication also greatly reduces network overhead and improves performance.

To access Neuron, users only need to create a stream of type neuron in eKuiper:

CREATE STREAM demo() WITH (TYPE="neuron", SHARED="TRUE")

To control Neuron in the reverse direction, add a Neuron action to the rule's actions and specify the group name, node name, and tag names to be written (all of which support dynamic properties). eKuiper automatically converts the output into Neuron's input format.

"neuron": {
  "nodeName": "{{.node}}",
  "groupName": "grp",
  "tags": [
    "tag0"
  ]
}
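For illustration, a complete rule that writes to Neuron could look like the following sketch; the rule id and the SQL are hypothetical, while the action follows the format above (here {{.node}} resolves to the node field selected by the SQL):

{
  "id": "ruleWriteNeuron",
  "sql": "SELECT temperature AS tag0, deviceName AS node FROM demo",
  "actions": [
    {
      "neuron": {
        "nodeName": "{{.node}}",
        "groupName": "grp",
        "tags": [
          "tag0"
        ]
      }
    }
  ]
}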

For details, please refer to the Neuron source and Neuron sink documentation.

SQL pull and write

The SQL pull source provides a way to convert batch data into streaming data, giving eKuiper preliminary support for integrated stream and batch processing.

When upgrading or transforming an old system, compatibility with the original system often has to be considered. Many legacy systems use traditional relational databases to store collected data. Even in new systems, some data may reside in databases that cannot conveniently expose a streaming interface yet still needs to be computed on in real time. Many more scenarios likewise require access to the huge variety of SQL-capable databases and other external systems.

eKuiper provides a unified, multi-database generic SQL pull source that periodically pulls data from any data source that supports SQL and offers basic deduplication, turning the results into streaming data for unified stream processing. The precompiled version of the plugin supports common databases such as MySQL and PostgreSQL. The plugin also ships with connectivity for almost all common databases: users only need to supply the parameters for the databases they want to support at compile time to build a plugin for custom database types themselves.
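As a minimal sketch, the pull source is configured in the plugin's YAML file with a connection URL, a polling interval, and a query section for deduplication; the key names below are based on the plugin documentation, and all values are placeholders:

default:
  # DSN of the database to poll (placeholder credentials)
  url: mysql://user:password@127.0.0.1:3306/mydb
  # pull interval in milliseconds
  interval: 10000
  internalSqlQueryCfg:
    # table to pull from; indexField/indexValue enable basic deduplication
    table: sensor_data
    indexField: registerTime
    indexValue: "2022-04-21 10:23:55"
    indexFieldType: "DATETIME"

A stream of type sql can then be created over this configuration and used in rules like any other stream.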

In addition to pulling data, we also provide a generic SQL plugin for writing data. Note that eKuiper already offers dedicated plugins for time series databases such as InfluxDB and TDengine. The generic SQL plugin can also connect to these databases, but it only provides standard insert functionality and does not support database-specific concepts; for example, TDengine super tables can only be written with the TDengine plugin.
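A hedged sketch of the corresponding sink action in a rule, assuming the url and table properties of the generic SQL sink (values are placeholders):

"actions": [
  {
    "sql": {
      "url": "mysql://user:password@127.0.0.1:3306/mydb",
      "table": "result"
    }
  }
]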

For more information and the list of supported databases, see the SQL source plugin and SQL sink plugin documentation.

eKuiper SQL Improvements

Built-in functions are the main way SQL organizes its calculations and an important source of SQL's expressiveness. The SQL improvements in the new version mainly take the form of new functions.

Change monitoring functions

The new version adds three general change detection functions: CHANGED_COLS, CHANGED_COL and HAD_CHANGED.

The CHANGED_COLS function detects whether the specified columns have changed: if a column has changed it returns the changed value, otherwise it returns nothing. In change detection scenarios, users often need to monitor multiple columns or expressions, and their number is not fixed, so the function accepts a variable number of arguments and its return value can span multiple columns. Compared with ordinary scalar functions, which return a single result column (multi-column results are wrapped in a map), this is the first function that returns multiple columns, and we refactored the function implementation to support it. The column arguments can also be other expressions, and the * sign is supported to monitor all columns, as in SELECT CHANGED_COLS("c_", true, *) FROM demo.

Multi-column functions can only be used in the SELECT clause; the values they select cannot be referenced in WHERE or other clauses. To filter on a changed value, use the CHANGED_COL function to obtain the changed value as the filter condition, or use the HAD_CHANGED function to obtain the change status of multiple columns as the filter condition. For details and usage examples, please refer to the documentation.
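For instance, such filters could look like the following sketch (column names are hypothetical; the first argument controls whether null values are ignored):

-- pass a row through only when temperature has actually changed and exceeds 30
SELECT temperature FROM demo WHERE CHANGED_COL(true, temperature) > 30

-- fire whenever any of the monitored columns has changed
SELECT temperature, humidity FROM demo WHERE HAD_CHANGED(true, temperature, humidity)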

Group selected columns

In a regular SQL statement, all selected columns form a single object that is handed to the sink plugin and downstream applications. In some scenarios, downstream applications need the selected columns grouped so that each group can be processed flexibly. For example, the selected results may be split into multiple key/value sets keyed by file name, so that the results can be written dynamically to multiple files.

The new built-in function object_construct makes it easy to group and name columns. Its syntax is object_construct(key1, col1, ...): it accepts multiple arguments and returns an object constructed from them. The arguments form a series of key-value pairs, so their number must be even; each key must be a string, while values can be of any type. For example, suppose a user needs to write columns 1, 2 and 3 to file 1, and columns 4 and 5 to file 2. A single SQL rule can group the columns and name each group:

SELECT object_construct("key1", col1, "key2", col2, "key3", col3) AS file1, object_construct("key4", col4, "key5", col5) AS file2 FROM demoStream

Its output is in the form of the following JSON object:

{
  "file1": {"key1": "aValue", "key2": 23, "key3": 2.5},
  "file2": {"key4": "bValue", "key5": 90}
}

Easier operation and maintenance

On the operations side, the main improvements in the new version are better runtime stability and convenient compilation parameters that let users trim features as needed to fit devices with limited computing power.

Rule isolation

In the new version, we optimized and refactored rule execution and the rule lifecycle to increase the stability of running rules and improve isolation between them, mainly in the following aspects:

Rule error isolation: even among rules that use a shared source, a runtime error in one rule does not affect other related rules. In addition, system-level panics inside a rule are now handled at the rule level and no longer crash the entire eKuiper process.

Rule load isolation: among sibling rules using shared or memory sources, the message inflow throughput of each rule is not affected by the others, while message order is preserved.
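As a sketch of the scenario these guarantees cover, several rules can consume one shared stream; the stream definition and topic below are hypothetical:

CREATE STREAM sharedDemo() WITH (TYPE="mqtt", DATASOURCE="devices/#", FORMAT="JSON", SHARED="TRUE")

Rules selecting from sharedDemo share a single MQTT subscription, yet a panic or a backlog in one rule no longer crashes or stalls the others.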

Compile on demand

As an edge stream processing engine, eKuiper must be deployed on many heterogeneous target systems, from edge server rooms and gateways with decent computing power to cheaper or customized software and hardware solutions driven by cost or special business requirements. As its feature set grows, a full-featured eKuiper may be slightly bloated on extremely resource-constrained devices, such as terminals with less than 50 MB of memory. In the new version, we separate eKuiper's core features from the rest using Go build tags. Users can compile only the features they need by setting the corresponding build parameters, producing a smaller binary. For example, running make build_core yields a binary containing only the core features. For further information, see Compiling on Demand.

Documentation is easier to use

Together with the official website (https://ekuiper.org) launched in April, the eKuiper documentation has been restructured and compiled into a documentation website. The new site adds modules such as concept introductions and tutorials, and adjusts the navigation tree, hoping to help users find useful information more easily.

Upgrade Instructions

eKuiper version iterations try to keep old and new versions compatible, and this release is no exception. After upgrading to version 1.5.0, most functions work without changes, but two changes require manual intervention:

  1. The Mqtt source's server configuration item is renamed from servers to server, and its value changes from an array to a string. Users must update the configuration file etc/mqtt_source.yaml accordingly; see the sketch after this list. If environment variables are used, for example when starting with Docker or a docker compose file, change MQTT_SOURCE__DEFAULT__SERVERS to MQTT_SOURCE__DEFAULT__SERVER. The Docker launch command becomes docker run -p 9081:9081 -d --name ekuiper -e MQTT_SOURCE__DEFAULT__SERVER="$MQTT_BROKER_ADDRESS" lfedge/ekuiper:$tag.
  2. If the TDengine sink is used, its property ip is renamed to host, and the value must be a domain name.
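For item 1, a minimal before/after sketch of etc/mqtt_source.yaml (the broker address is a placeholder):

# before (1.4.x): array-valued "servers"
default:
  servers:
    - "tcp://127.0.0.1:1883"

# after (1.5.0): string-valued "server"
default:
  server: "tcp://127.0.0.1:1883"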
