How to use SQL table functions for data analysis and data acquisition

What is a table function?

A table function returns a set of data rows for each input row, that is, a two-dimensional table.

The returned data set may be empty, or it may contain one or more rows, each with one or more columns. By contrast, an ordinary scalar function returns a single scalar value.
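To make the contrast concrete, here is a minimal Python sketch. It is only an analogy, not Honghu syntax, and the names str_length and split_words are hypothetical: the first behaves like a scalar function, the second like a table function that returns zero or more rows for each input row.

```python
from typing import Iterator, Tuple


def str_length(text: str) -> int:
    """Scalar function: exactly one value per input row."""
    return len(text)


def split_words(text: str) -> Iterator[Tuple[int, str]]:
    """Table function: zero or more (position, word) rows per input row."""
    for position, word in enumerate(text.split(), start=1):
        yield position, word


input_rows = ["error disk full", "", "ok"]

# One scalar per input row.
scalar_result = [str_length(row) for row in input_rows]

# One small table (possibly empty) per input row.
table_result = [list(split_words(row)) for row in input_rows]
```

Note that the empty input row still produces a scalar (its length), while the table function simply returns no rows for it.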

Why do we need table functions?

Table functions play an important role in the following two aspects:

Data analysis

Data acquisition

In addition to using the built-in table functions, users can define their own table functions to make data analysis more efficient.

How to use table functions in Honghu

In Honghu, table functions can be used wherever a two-dimensional table structure is required. Table functions are divided into SQL table functions and non-SQL table functions:

SQL table functions

A SQL table function can be regarded as a parameterized view. Honghu also lets users define their own SQL table functions (User Defined Table Function, UDTF).

Create SQL table function:

  • get_events_from_dataset is the name of the created SQL table function.

  • @data_set table, @key string correspond to the named_parameter in the syntax, indicating that this table function has two parameters: data_set, of type table, and key, of type string.

  • SELECT * FROM @data_set WHERE CONTAINS(@key) is the query expression corresponding to the SQL table function.
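The idea of a parameterized view can be sketched in Python with the standard sqlite3 module. This is an analogy under stated assumptions, not Honghu's CREATE syntax: the table main_events and its message column are hypothetical, and CONTAINS(@key) is approximated by a substring match with SQLite's instr function.

```python
import sqlite3


def get_events_from_dataset(conn, data_set, key):
    """Python stand-in for a parameterized view over the table `data_set`."""
    # Table names cannot be bound as SQL parameters, so validate the identifier
    # before splicing it into the query text.
    if not data_set.isidentifier():
        raise ValueError(f"invalid table name: {data_set!r}")
    # Assumption: CONTAINS(@key) is approximated by a substring match on a
    # hypothetical `message` column.
    query = (
        f'SELECT * FROM "{data_set}" '
        "WHERE instr(message, ?) > 0 ORDER BY id"
    )
    return conn.execute(query, (key,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE main_events (id INTEGER, message TEXT)")
conn.executemany(
    "INSERT INTO main_events VALUES (?, ?)",
    [(1, "disk error on node-1"), (2, "login ok"), (3, "network error")],
)

# Calling the "table function" with concrete arguments...
via_function = get_events_from_dataset(conn, "main_events", "error")

# ...gives the same rows as writing the expanded query by hand.
via_plain_query = conn.execute(
    "SELECT * FROM main_events WHERE instr(message, 'error') > 0 ORDER BY id"
).fetchall()
```

Both arguments play the role of the named parameters: one selects the data set, the other supplies the search key, and the query body stays fixed.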

View SQL table functions

Using SQL table functions

After defining the SQL table function get_events_from_dataset above, we can use it like this:

This query is equivalent to querying:

Delete SQL table function

When deleting a table function definition, you need to specify the function signature, that is, the parameter types must exactly match the parameter types in the definition.
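Why the signature matters can be modeled with a small Python registry keyed by function name plus parameter types. This is a hypothetical model for illustration only; FunctionRegistry is not part of Honghu, and how Honghu stores definitions internally is not described here.

```python
from typing import Dict, Sequence, Tuple

# A signature is the function name plus its ordered parameter types.
Signature = Tuple[str, Tuple[str, ...]]


class FunctionRegistry:
    """Toy catalog in which functions are identified by their full signature."""

    def __init__(self) -> None:
        self._functions: Dict[Signature, object] = {}

    def create(self, name: str, param_types: Sequence[str], body: object) -> None:
        self._functions[(name, tuple(param_types))] = body

    def drop(self, name: str, param_types: Sequence[str]) -> None:
        signature = (name, tuple(param_types))
        if signature not in self._functions:
            raise KeyError(f"no function with signature {signature}")
        del self._functions[signature]

    def __len__(self) -> int:
        return len(self._functions)


registry = FunctionRegistry()
registry.create("get_events_from_dataset", ["table", "string"], body=None)

# Dropping with the wrong parameter types is rejected.
try:
    registry.drop("get_events_from_dataset", ["table"])
    mismatch_rejected = False
except KeyError:
    mismatch_rejected = True

# Dropping with the exact signature succeeds.
registry.drop("get_events_from_dataset", ["table", "string"])
```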

Non-SQL table functions

Built-in non-SQL table functions

Honghu provides a series of built-in non-SQL table functions to enrich the content of queries. Built-in non-SQL table functions are mainly implemented in C++ and Python.

Scenario: Generate the 5 integers from 1 to 5, one integer per row.

Scenario: Parse fields of type JSON.
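The behavior behind these two scenarios can be mimicked in plain Python. The sketch below is not the Honghu built-in functions themselves; generate_series and json_fields are hypothetical names chosen for illustration, and the JSON document is made-up sample data.

```python
import json
from typing import Any, Iterator, Tuple


def generate_series(start: int, stop: int) -> Iterator[Tuple[int]]:
    """Yield one single-column row per integer from start to stop inclusive."""
    for value in range(start, stop + 1):
        yield (value,)


def json_fields(document: str) -> Iterator[Tuple[str, Any]]:
    """Yield one (key, value) row per top-level field of a JSON object."""
    for key, value in json.loads(document).items():
        yield key, value


# Scenario 1: the integers 1 to 5, one per row.
series_rows = list(generate_series(1, 5))

# Scenario 2: a JSON-typed field expanded into rows.
json_rows = list(json_fields('{"user": "alice", "status": 200}'))
```

In both cases a single call produces a whole table of rows, which is what distinguishes these functions from scalar functions.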

For more instructions on how to use the built-in non-SQL table functions, please refer to the user manual.

Custom non-SQL table functions

For custom non-SQL table functions, Honghu currently supports only Python (Python User Defined Table Function, Python UDTF). To write your own Python table functions, see Python UDTF development.

Best practices for table functions

In Honghu, table functions should be chosen according to the actual data analysis requirements. When a requirement cannot be met with an ordinary SQL query, ask the following questions in order to decide whether a table function can meet it:

  • Can the requirement be met with a custom SQL table function?

  • Is there a built-in table function that can be used directly?

  • Can the SDK provided by Honghu be used to implement a table function for the specific requirement?

Other references

Table Functions 

Origin blog.csdn.net/Yhpdata888/article/details/130771709#comments_26575822