Points to watch when running SQL queries on the time-series database TDengine

Little T's guide: Although time-series workloads are dominated by writes, with reads playing a secondary role, query requirements cannot be ignored. For the convenience of users, the time-series database TDengine uses SQL as its query language. The main query capabilities include single-column and multi-column queries, the four arithmetic operations on numeric columns and aggregate results, and timestamp-aligned join queries. This article looks at some of the query details.

In "Query Performance: TDengine is up to 37 times that of InfluxDB and 28.6 times that of TimescaleDB", we looked at how TDengine performs on queries. To make better use of that query performance, however, there are still some points to watch in practice. Some of them are summarized below:

A TDengine SQL query statement can return some or all columns as its result. Both data columns and tag columns can appear in the select list.

Wildcards and tag columns

The wildcard * refers to all columns. For ordinary tables and subtables, the result contains only ordinary columns. For supertables, tag columns are also included.

SELECT * FROM d1001;

Wildcards support table name prefixes, and the following two SQL statements both return all columns:

SELECT * FROM d1001;
SELECT d1001.* FROM d1001;

In a JOIN query, * with and without a table-name prefix return different results: an unprefixed * returns all data columns of all tables (excluding tags), while a wildcard with a table-name prefix returns only the data columns of that table.

SELECT * FROM d1001, d1003 WHERE d1001.ts=d1003.ts;
SELECT d1001.* FROM d1001,d1003 WHERE d1001.ts = d1003.ts;

In the above query statement, the former returns all columns of d1001 and d1003, while the latter only returns all columns of d1001.
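The difference between the two wildcards can be checked by looking at the column names of each result set. The sketch below uses Python's built-in SQLite module purely as a stand-in: it is not TDengine, has no tag columns, and the table layout (ts, voltage, phase) and sample rows are assumptions made for illustration.

```python
import sqlite3

# SQLite is only a stand-in to show wildcard expansion in a join;
# the schema and rows below are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE d1001 (ts INTEGER PRIMARY KEY, voltage INTEGER, phase REAL);
    CREATE TABLE d1003 (ts INTEGER PRIMARY KEY, voltage INTEGER, phase REAL);
    INSERT INTO d1001 VALUES (1, 219, 0.31), (2, 218, 0.33);
    INSERT INTO d1003 VALUES (1, 221, 0.30), (3, 223, 0.29);
""")

def column_names(sql):
    """Run a query and return the column names of its result set."""
    cursor = conn.execute(sql)
    return [col[0] for col in cursor.description]

# Unprefixed wildcard: columns of both joined tables.
all_cols = column_names(
    "SELECT * FROM d1001, d1003 WHERE d1001.ts = d1003.ts")

# Prefixed wildcard: columns of d1001 only.
d1001_cols = column_names(
    "SELECT d1001.* FROM d1001, d1003 WHERE d1001.ts = d1003.ts")

print(len(all_cols), all_cols)
print(len(d1001_cols), d1001_cols)
```
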

Some SQL functions also accept the wildcard. The difference is that count(*) returns only one column, whereas first, last, and last_row return all columns.

In addition, tag columns can be specified in queries on supertables and subtables, and the tag values are returned together with the data of the ordinary columns.

SELECT location, groupid, current FROM d1001 LIMIT 2;

Result deduplication

The DISTINCT keyword deduplicates one or more columns in the result set. The deduplicated columns can be either tag columns or data columns.

Deduplicating tag columns:

SELECT DISTINCT tag_name [, tag_name ...] FROM stb_name;

Deduplication of data columns:

SELECT DISTINCT col_name [, col_name ...] FROM tb_name;

Notes:

  1. The configuration parameter maxNumOfDistinctRes in the cfg file limits the number of rows that DISTINCT can output. Its minimum value is 100000, its maximum value is 100000000, and its default value is 10000000. If the actual result exceeds this limit, only the rows within the limit are output.
  2. Because of the inherent precision limits of floating-point numbers, using DISTINCT on FLOAT and DOUBLE columns does not guarantee that the output values are completely unique in all cases.
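The floating-point caveat can be seen outside the database. The sketch below is plain Python, not TDengine: it shows two values that a person would consider equal being kept as separate "distinct" values because their binary representations differ. Rounding before deduplication is shown only as one possible workaround and is an assumption, not something the TDengine documentation prescribes.

```python
# Two readings that "should" both be 0.3, computed in different ways.
readings = [0.1 + 0.2, 0.3]

# Deduplicating the raw doubles keeps both, because 0.1 + 0.2 is stored
# as 0.30000000000000004 rather than 0.3.
distinct_raw = set(readings)

# Rounding to a fixed number of decimals first (illustrative workaround)
# collapses them into one value.
distinct_rounded = {round(value, 6) for value in readings}

print(len(distinct_raw))
print(len(distinct_rounded))
```
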

Special functions

Some special query functions can be executed without using the FROM clause.

The following statement returns the current database. If no default database was specified at login and no USE command has been run to switch databases, it returns NULL.

SELECT DATABASE();

Get server and client version numbers:

SELECT CLIENT_VERSION();
SELECT SERVER_VERSION();

The server status check statement returns a number (e.g. 1) if the server is healthy and an error code if the server is abnormal. This statement is compatible with the way connection pools check the status of TDengine and the way third-party tools check the status of a database server. It also avoids connections being lost from the pool because the wrong heartbeat SQL statement was used.

SELECT SERVER_STATUS();
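A connection pool would typically wrap this statement in a small health check. The sketch below assumes a connection object that follows the common cursor()/execute()/fetchone() pattern; StubConnection, StubCursor, and server_is_healthy are hypothetical names used only for this demonstration and are not part of any TDengine client library.

```python
HEARTBEAT_SQL = "SELECT SERVER_STATUS()"

class StubCursor:
    """Hypothetical stand-in for a real cursor, used only in this demo."""
    def __init__(self, healthy):
        self._healthy = healthy
        self._row = None

    def execute(self, sql):
        if not self._healthy:
            raise ConnectionError("server unreachable")
        self._row = (1,)  # a healthy server returns a number such as 1

    def fetchone(self):
        return self._row

class StubConnection:
    """Hypothetical stand-in for a real database connection."""
    def __init__(self, healthy=True):
        self._healthy = healthy

    def cursor(self):
        return StubCursor(self._healthy)

def server_is_healthy(conn):
    """Return True if the heartbeat statement runs and yields a row."""
    try:
        cursor = conn.cursor()
        cursor.execute(HEARTBEAT_SQL)
        return cursor.fetchone() is not None
    except Exception:
        # Any error from the heartbeat is treated as "unhealthy".
        return False

print(server_is_healthy(StubConnection(healthy=True)))
print(server_is_healthy(StubConnection(healthy=False)))
```

With a real client, the stub connection would be replaced by a connection borrowed from the pool, and a False result would usually lead the pool to discard it.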

We can use SELECT NOW(); to get the current time, SELECT TODAY(); to get the current date, and SELECT TIMEZONE(); to get the current time zone.

Regular expression filtering

Syntax

WHERE (column|tbname) match/MATCH/nmatch/NMATCH _regex_

Regular Expression Specification

Make sure that the regular expressions you use conform to the POSIX specification. For details, refer to the POSIX Regular Expressions specification.

Usage restrictions

Regular expression filtering can be applied to table names (i.e. tbname filtering) and to binary/nchar tag values, and filtering on ordinary columns is supported as well.

The regular expression string cannot exceed 128 bytes. The maximum allowed length can be adjusted through the parameter maxRegexStringLen, which is a client configuration parameter; a restart is required for the change to take effect.
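A client-side pre-check can catch over-long patterns before a query is sent. The sketch below assumes that the 128-byte limit applies to the UTF-8 encoded pattern; check_pattern and filter_table_names are hypothetical helpers. Python's re module is not a POSIX engine, so the matching here only approximates what the server does for simple patterns.

```python
import re

MAX_REGEX_STRING_LEN = 128  # default limit in bytes, as stated above

def check_pattern(pattern, limit=MAX_REGEX_STRING_LEN):
    """Return the pattern's size in bytes, or raise if it exceeds the limit."""
    size = len(pattern.encode("utf-8"))
    if size > limit:
        raise ValueError(f"pattern is {size} bytes, limit is {limit}")
    return size

def filter_table_names(names, pattern):
    """Approximate 'WHERE tbname MATCH pattern' on a list of names."""
    check_pattern(pattern)
    compiled = re.compile(pattern)
    return [name for name in names if compiled.search(name)]

tables = ["d1001", "d1002", "d1003"]
matched = filter_table_names(tables, "^d100[13]$")
print(matched)
```
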

CASE expression

Syntax

CASE value WHEN compare_value THEN result [WHEN compare_value THEN result ...] [ELSE result] END
CASE WHEN condition THEN result [WHEN condition THEN result ...] [ELSE result] END

Description

TDengine lets users express IF ... THEN ... ELSE logic in SQL statements through CASE expressions.

The first CASE syntax returns the result of the first branch whose compare_value equals value. If no compare_value matches, it returns the result after ELSE, or NULL if there is no ELSE part.

The second syntax returns the result of the first branch whose condition is true. If no condition is met, it returns the result after ELSE, or NULL if there is no ELSE part.

The return type of a CASE expression is the result type of the first WHEN ... THEN branch. The result types of the remaining WHEN ... THEN branches and of the ELSE branch must be convertible to it; otherwise TDengine reports an error.
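The evaluation order of the two forms can be written down as a small Python emulation. This is only a sketch of the "first match wins, else ELSE, else NULL" rule: simple_case and searched_case are hypothetical names, None stands in for NULL, and SQL's special handling of NULL comparisons and the return-type conversion rule are not modeled.

```python
def simple_case(value, branches, else_result=None):
    """Emulate CASE value WHEN compare_value THEN result ... END."""
    for compare_value, result in branches:
        if value == compare_value:
            return result  # first equal compare_value wins
    return else_result     # ELSE result, or None (NULL) if no ELSE

def searched_case(branches, else_result=None):
    """Emulate CASE WHEN condition THEN result ... END."""
    for condition, result in branches:
        if condition:
            return result  # first true condition wins
    return else_result

status_names = [(1, "Running"), (2, "Warning"), (3, "Downtime")]
print(simple_case(2, status_names, "Unknown"))
print(simple_case(9, status_names, "Unknown"))
print(simple_case(9, status_names))

voltage = 260
print(searched_case([(voltage < 200 or voltage > 250, 220)], voltage))
```
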

Examples

A device has three status codes that indicate its state. The statement is as follows:

SELECT CASE dev_status WHEN 1 THEN 'Running' WHEN 2 THEN 'Warning' WHEN 3 THEN 'Downtime' ELSE 'Unknown' END FROM dev_table;

Compute the average voltage of the smart meters. A voltage below 200 or above 250 is treated as an erroneous reading and corrected to 220. The statement is as follows:

SELECT AVG(CASE WHEN voltage < 200 OR voltage > 250 THEN 220 ELSE voltage END) FROM meters;
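To see what the correction does to the average, the statement can be run against a few sample rows. The sketch below uses Python's built-in SQLite module as a stand-in for TDengine; the meters table here is a plain two-column table and the four readings are made-up values.

```python
import sqlite3

# SQLite stand-in with made-up readings; two of them are out of range.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE meters (ts INTEGER, voltage INTEGER)")
conn.executemany(
    "INSERT INTO meters VALUES (?, ?)",
    [(1, 190), (2, 220), (3, 230), (4, 260)],
)

# Plain average over the raw readings: (190 + 220 + 230 + 260) / 4
(raw_avg,) = conn.execute("SELECT AVG(voltage) FROM meters").fetchone()

# Average after replacing out-of-range readings with 220:
# (220 + 220 + 230 + 220) / 4
(corrected_avg,) = conn.execute(
    "SELECT AVG(CASE WHEN voltage < 200 OR voltage > 250 "
    "THEN 220 ELSE voltage END) FROM meters"
).fetchone()

print(raw_avg, corrected_avg)
```
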

JOIN clause

TDengine supports inner joins based on the timestamp primary key; that is, the JOIN condition must include the timestamp primary key. As long as this requirement is met, inner joins can be performed freely among ordinary tables, subtables, supertables, and subqueries, with no limit on the number of tables.

JOIN between two ordinary tables:

SELECT *
FROM temp_tb_1 t1, pressure_tb_1 t2
WHERE t1.ts = t2.ts;

JOIN between two supertables:

SELECT *
FROM temp_stable t1, temp_stable t2
WHERE t1.ts = t2.ts AND t1.deviceid = t2.deviceid AND t1.status=0;

JOIN between a subtable and a supertable:

SELECT *
FROM temp_ctable t1, temp_stable t2
WHERE t1.ts = t2.ts AND t1.deviceid = t2.deviceid AND t1.status=0;

Similarly, JOIN operations can also be performed on the query results of multiple subqueries.
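The effect of joining on the timestamp can be pictured as keeping only the timestamps present on both sides. The sketch below is plain Python with made-up readings, not TDengine code; it mimics the join between temp_tb_1 and pressure_tb_1 by intersecting their timestamps.

```python
# Made-up readings keyed by timestamp (milliseconds).
temp_tb_1 = {1000: 21.5, 2000: 21.7, 3000: 22.0}
pressure_tb_1 = {1000: 101.2, 3000: 101.4, 4000: 101.1}

# An inner join on ts keeps only timestamps that appear in both tables.
common_ts = sorted(temp_tb_1.keys() & pressure_tb_1.keys())
joined = [(ts, temp_tb_1[ts], pressure_tb_1[ts]) for ts in common_ts]

for row in joined:
    print(row)
```

Rows at 2000 and 4000 are dropped because they have no partner with the same timestamp in the other table.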

Nested queries

A "nested query", also called a "subquery", is a SQL statement in which the result of an "inner query" is used as the input of an "outer query".

Starting from version 2.2.0.0, TDengine's query engine supports non-correlated subqueries in the FROM clause ("non-correlated" means that the subquery does not use parameters from the parent query). That is, an independent SELECT statement, enclosed in ASCII (half-width) parentheses, takes the place of tb_name_list in an ordinary SELECT statement. A complete nested query therefore looks like this:

SELECT ... FROM (SELECT ... FROM ...) ...;
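A concrete instance of this shape is an outer aggregate over a per-device inner aggregate. The sketch below runs on Python's built-in SQLite module as a stand-in for TDengine; the readings table, its columns, and the alias per_device are assumptions chosen for illustration.

```python
import sqlite3

# SQLite stand-in with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (ts INTEGER, device TEXT, voltage INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [(1, "d1001", 218), (2, "d1001", 221),
     (1, "d1003", 219), (2, "d1003", 225)],
)

# Inner query: highest voltage per device (221 and 225).
# Outer query: average of those per-device maxima.
nested_sql = """
    SELECT AVG(max_v)
    FROM (
        SELECT device, MAX(voltage) AS max_v
        FROM readings
        GROUP BY device
    ) AS per_device
"""
(avg_of_max,) = conn.execute(nested_sql).fetchone()
print(avg_of_max)
```
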

Notes:

  • The result of the inner query is used as a "virtual table" by the outer query. It is recommended to give this virtual table an alias so that it is easy to reference in the outer query.
  • Ordinary JOINs between tables and supertables are supported in both inner and outer queries. The result of the inner query can also take part in a JOIN with data subtables.
  • The features supported by the inner query are the same as those of a non-nested query statement.
    • The ORDER BY clause of the inner query is generally meaningless. It is best avoided so that resources are not wasted.
  • Compared with non-nested query statements, the outer query has the following restrictions (for calculation functions):
    • If the result of the inner query does not provide a timestamp, functions whose calculation implicitly relies on timestamps will not work properly in the outer query. Examples: INTERP, DERIVATIVE, IRATE, LAST_ROW, FIRST, LAST, TWA, STATEDURATION, TAIL, UNIQUE.
    • If the result of the inner query is not ordered by timestamp, functions whose calculation depends on the order of the data will not work properly in the outer query. Examples: LEASTSQUARES, ELAPSED, INTERP, DERIVATIVE, IRATE, TWA, DIFF, STATECOUNT, STATEDURATION, CSUM, MAVG, TAIL, UNIQUE.
    • Functions whose calculation requires two scans do not work properly in the outer query. Example: PERCENTILE.
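Why ordering matters for functions such as DIFF can be shown with a few rows. The sketch below is plain Python, not TDengine code: diff is a hypothetical helper that subtracts each value from the one after it, applied once to rows in arrival order and once to the same rows sorted by timestamp.

```python
def diff(values):
    """Difference between each value and the one before it."""
    return [later - earlier for earlier, later in zip(values, values[1:])]

# (ts, value) rows as an inner query might return them, not ordered by ts.
rows = [(3, 30), (1, 10), (2, 25)]

# Applied to the unordered rows, the differences are meaningless.
unordered_diff = diff([value for _, value in rows])

# Sorting by timestamp first gives the intended result.
ordered_diff = diff([value for _, value in sorted(rows)])

print(unordered_diff)
print(ordered_diff)
```
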

Closing remarks

Due to space limitations, this article covers only some of the rules and precautions for SQL queries. For the usage rules of result-set column names, pseudo-columns, query objects, and GROUP BY clauses, along with related syntax examples, see the official documentation at https://docs.taosdata.com/taos-sql/select/#group-by. When you run SQL queries on TDengine, the notes above should help with a range of basic problems. If a problem remains unsolved for a long time, don't worry: you can add Little T on WeChat (tdengine) to get help from TDengine technical staff.

Origin blog.csdn.net/taos_data/article/details/132180277