From Concept to Practice: Mastering Hierarchical Recursive Queries

This article is shared from Huawei Cloud Community "GaussDB Database SQL Series - Hierarchical Recursive Query", author: Gauss Squirrel Club Assistant 2.

1. Introduction

Hierarchical recursive queries are a common SQL technique, used mainly with data stored in a hierarchical structure. This article uses the GaussDB database as the experimental platform to explain how they are used.

2. GaussDB database hierarchical recursive query concept

A hierarchical structure can be understood as a tree-like data structure made up of nodes. Two typical operations on it are querying upward from a child node to the root node, and traversing all child nodes downward from the root node.

A recursive query is a query that calls itself repeatedly. The query keeps recursing into a subquery until it obtains results that satisfy the conditions or has traversed the entire query range. Recursive queries have important applications in the database field: they make data processing more convenient and simplify development code.

In the GaussDB database, recursive queries can be implemented with either the "SELECT ... START WITH ... CONNECT BY ... PRIOR ..." syntax or the "WITH RECURSIVE" syntax.

3. GaussDB database hierarchical recursive query experimental example

1. Create an experiment table

--Create experiment table
--(a_name is widened to VARCHAR(30) here because the English names below are longer than 10 characters)
CREATE TABLE area(
 a_code VARCHAR(10)
,a_name VARCHAR(30)
,p_a_code VARCHAR(10)
,a_level INT);

--Insert test data (single quotes inside a name are escaped by doubling them)
INSERT INTO area VALUES('610000','Shaanxi Province','0',1);
INSERT INTO area VALUES('610100','Xi''an City','610000',2);
INSERT INTO area VALUES('610101','municipal district','610100',3);
INSERT INTO area VALUES('610102','Xincheng District','610100',3);
INSERT INTO area VALUES('610103','Beilin District','610100',3);
INSERT INTO area VALUES('610104','Lianhu District','610100',3);
INSERT INTO area VALUES('610111','Baqiao District','610100',3);
INSERT INTO area VALUES('610112','Weiyang District','610100',3);
INSERT INTO area VALUES('610113','Yanta District','610100',3);
INSERT INTO area VALUES('610114','Yanliang District','610100',3);
INSERT INTO area VALUES('610115','Lintong District','610100',3);
INSERT INTO area VALUES('610116','Chang''an District','610100',3);
INSERT INTO area VALUES('610122','Lantian County','610100',3);
INSERT INTO area VALUES('610124','Zhouzhi County','610100',3);
INSERT INTO area VALUES('610125','Huyi District','610100',3);
INSERT INTO area VALUES('610126','Gaoling District','610100',3);

--View initialization results
SELECT * FROM area;
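
With the experiment table in place, the two traversal directions described in section 2 can be sketched with START WITH ... CONNECT BY PRIOR. This is only a minimal sketch against the area table; the starting codes '610100' and '610113' are rows picked from the test data for illustration, not values the original article uses.

```sql
-- Top-down: start at Xi'an City and walk to its children.
-- PRIOR marks the parent side: a child's p_a_code equals the prior row's a_code.
SELECT a_code, a_name, p_a_code
FROM area
START WITH a_code = '610100'
CONNECT BY PRIOR a_code = p_a_code;

-- Bottom-up: start at Yanta District and walk to the root.
-- Moving PRIOR to p_a_code reverses the direction of the walk.
SELECT a_code, a_name, p_a_code
FROM area
START WITH a_code = '610113'
CONNECT BY PRIOR p_a_code = a_code;
```

With the test data above, the second query should stop at Shaanxi Province, because no row has the a_code '0' that its p_a_code points to.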









2. sys_connect_by_path(col, separator)

Description: Returns the connection path from the root node to the current row.

Parameters: col is the column whose values are shown in the path; columns of type CHAR/VARCHAR/NVARCHAR2/TEXT are supported. separator is the delimiter placed between path nodes.

Return value type: text

Example:

--Returns the connection path from the root node to the current row

SELECT *, sys_connect_by_path(a_name, '-') FROM area start with a_code ='610000' connect by prior a_code = p_a_code;
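
The "root node" in the description is understood here to be the row selected by START WITH, so starting lower in the tree should produce shorter paths. The sketch below assumes that behavior; the '/' separator and the alias area_path are arbitrary choices made for illustration.

```sql
-- Build paths for the subtree under Xi'an City only.
-- Each path is expected to begin at the START WITH row rather than at the province.
SELECT a_code, sys_connect_by_path(a_name, '/') AS area_path
FROM area
START WITH a_code = '610100'
CONNECT BY PRIOR a_code = p_a_code;
```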

3. connect_by_root(col)

Description: Returns the root node value of the current row.

Parameters: col is the name of the output column.

Return value type: It is the data type of the specified column col.

Example:

--Returns the root node value of the current row. 

SELECT *, connect_by_root(a_name) FROM area start with a_code ='610000' connect by prior a_code = p_a_code;
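
connect_by_root is most informative when the traversal starts from several rows at once, because each result row then records which starting row it descends from. A minimal sketch under that assumption, using the test data above; the alias top_area is hypothetical.

```sql
-- Start from every area directly under Shaanxi Province
-- and label each descendant with the starting area it belongs to.
SELECT a_code, a_name, connect_by_root(a_name) AS top_area
FROM area
START WITH p_a_code = '610000'
CONNECT BY PRIOR a_code = p_a_code;
```

The test data contains only one city, so every row gets the same label here; with more cities loaded, each district would carry its own city's name.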

4. WITH RECURSIVE

A recursive query can also be written with the WITH RECURSIVE keyword:

--Using WITH RECURSIVE
WITH RECURSIVE t_area AS (
SELECT a_level, a_code, p_a_code, a_name, a_name::varchar(50) AS path FROM area WHERE p_a_code = '0'
UNION ALL
SELECT t2.a_level+1, t1.a_code, t1.p_a_code, t1.a_name, CONCAT(t2.path, ',', t1.a_name)::varchar(50) AS path FROM area t1 JOIN t_area t2 ON t1.p_a_code = t2.a_code
) SELECT * FROM t_area;

Example description: This query uses a recursive expression to traverse the relationships between the administrative areas of a province. The expression consists of two SELECT statements. The first SELECT statement selects all administrative areas whose parent code is 0 and adds them to the temporary table t_area. Their level is the a_level stored in the table, and their path is initialized to their own area name a_name. This statement is the starting point of the recursive query. The second SELECT statement joins the area table with the t_area table: it selects every administrative area in area whose parent already exists in t_area. For each joined row, the level is the parent's level plus 1, and the path is the parent's path, a comma, and the row's own area name. The final query returns all administrative area information collected in t_area.

(The "::varchar(50)" cast is needed because the a_name column defined in the experiment table is too short to hold the concatenated path, so the path column is given a longer type. In addition, because the two SELECT statements are combined with UNION ALL, the type and length of the path column must be the same in both.)
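
The same two-part pattern can also walk the tree in the other direction. The sketch below is not from the original article: it starts from a single district and follows parent codes up to the root. The CTE name t_up, the depth column and the starting code '610113' are hypothetical choices for illustration.

```sql
-- Walk upward from Yanta District ('610113') to the root.
WITH RECURSIVE t_up AS (
    -- Starting point: the row the walk begins at.
    SELECT a_code, a_name, p_a_code, 1 AS depth
    FROM area
    WHERE a_code = '610113'
    UNION ALL
    -- Recursive step: fetch the parent of each row found so far.
    SELECT p.a_code, p.a_name, p.p_a_code, c.depth + 1
    FROM area p
    JOIN t_up c ON p.a_code = c.p_a_code
)
SELECT * FROM t_up;
```

The recursion ends on its own once it reaches the province, since no row in the test data has the a_code '0'.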

4. Advantages and Disadvantages of Recursive Query

1. Advantages

Recursive queries can simplify application code and make hierarchical data structures easier to process. In some complex query scenarios they can return results faster. They are suitable for various types of tree structures.

2. Disadvantages

Recursive queries may produce many recursive calls, which degrades performance. They are often more complex and harder to write than other approaches. They are not well suited to processing large data sets.

5. Summary

Recursive queries are very practical and are widely used in complex query scenarios involving hierarchical or tree-structured data. However, there are some issues to pay attention to when using them:

• The depth of recursion must be reasonably controlled to prevent excessive recursion.

• It is best not to perform complex calculations and combination operations in recursive queries to avoid consuming too many resources.

• Avoid using ORDER BY operations in recursive queries, which will greatly reduce performance.

• When using recursive queries, the problem of infinite loops should be handled carefully.
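
One common way to address both the depth and the infinite-loop concerns is to carry an explicit depth counter and stop expanding at a fixed limit. The following is a minimal sketch of that idea, not a GaussDB-specific feature; the depth column and the limit of 3 are assumptions chosen to match the three levels in the test data.

```sql
WITH RECURSIVE t_area AS (
    -- Starting point: root rows at depth 1.
    SELECT a_code, a_name, p_a_code, 1 AS depth
    FROM area
    WHERE p_a_code = '0'
    UNION ALL
    -- Recursive step: only rows at depth 1 or 2 are expanded,
    -- so nothing deeper than depth 3 is ever produced.
    SELECT t1.a_code, t1.a_name, t1.p_a_code, t2.depth + 1
    FROM area t1
    JOIN t_area t2 ON t1.p_a_code = t2.a_code
    WHERE t2.depth < 3
)
SELECT * FROM t_area;
```

Because the counter grows on every step, the recursion ends even if bad data makes two rows point at each other as parents.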

When using databases such as GaussDB, recursive queries that are applied correctly and within these limits can improve both query efficiency and application performance.


Origin my.oschina.net/u/4526289/blog/10320161