More Trees & Hierarchies in SQL

Hierarchies are sometimes difficult to store in SQL tables...things like trees, threaded forums, org charts and the like...and it's usually even harder to retrieve the hierarchy once you do store it. Here's a method that's easy to understand and maintain, and gives you the full hierarchy (or any piece of it) very quickly and easily.

While XML handles hierarchical data quite well, relational SQL doesn't. There are several different ways to model a hierarchical structure. The most common and familiar is known as the adjacency model, and it usually works like this:

The table would look like this:

EmployeeID  Name               BossID
1001        Denis Eaton-Hogg   NULL
1002        Bobbi Flekman      1001
1003        Ian Faith          1002
1004        David St. Hubbins  1003
1005        Nigel Tufnel       1003
1006        Derek Smalls       1003
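
If you want to follow along, the table above could be created with something like the script below. This is only a sketch: the article shows the data but not the DDL, so the column types, the primary key, and the self-referencing foreign key are all assumptions.

```sql
-- Assumed schema: only the column names and data come from the article.
CREATE TABLE Employees (
    EmployeeID int NOT NULL PRIMARY KEY,
    Name varchar(50) NOT NULL,
    BossID int NULL REFERENCES Employees (EmployeeID)  -- NULL marks the top of the chart
);

-- Bosses are inserted before their subordinates so the foreign key is satisfied.
INSERT INTO Employees (EmployeeID, Name, BossID) VALUES (1001, 'Denis Eaton-Hogg', NULL);
INSERT INTO Employees (EmployeeID, Name, BossID) VALUES (1002, 'Bobbi Flekman', 1001);
INSERT INTO Employees (EmployeeID, Name, BossID) VALUES (1003, 'Ian Faith', 1002);
INSERT INTO Employees (EmployeeID, Name, BossID) VALUES (1004, 'David St. Hubbins', 1003);
INSERT INTO Employees (EmployeeID, Name, BossID) VALUES (1005, 'Nigel Tufnel', 1003);
INSERT INTO Employees (EmployeeID, Name, BossID) VALUES (1006, 'Derek Smalls', 1003);
```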

And the org chart/indented list looks like this:

Denis Eaton-Hogg
  Bobbi Flekman
    Ian Faith
      David St. Hubbins
      Nigel Tufnel
      Derek Smalls

It's called the "adjacency" model because the parent (boss) data is stored in the same row as the child (employee) data, in an adjacent column. It's a pretty straightforward design that's easily understood by everyone...no deep relational theory needed. You can find a person's boss easily, and you can find their coworkers by querying the BossID column.
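
As a quick illustration of those two lookups, here is a sketch against the Employees table shown earlier. The choice of employee 1004 (David St. Hubbins) is arbitrary.

```sql
-- Boss of employee 1004: follow BossID to the boss's own row.
SELECT B.EmployeeID, B.Name
FROM Employees AS E
INNER JOIN Employees AS B ON E.BossID = B.EmployeeID
WHERE E.EmployeeID = 1004;

-- Coworkers of employee 1004: everyone else who shares the same BossID.
SELECT C.EmployeeID, C.Name
FROM Employees AS E
INNER JOIN Employees AS C ON C.BossID = E.BossID
WHERE E.EmployeeID = 1004
  AND C.EmployeeID <> E.EmployeeID;
```

With the sample data, the first query returns Ian Faith and the second returns Nigel Tufnel and Derek Smalls.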

The trouble begins when you want to list several levels of a hierarchy. To find the boss's boss, you would need to join the Employees table to itself, like this:

SELECT BigBoss.Name AS BigBoss, Boss.Name AS Boss, Employees.Name AS Employee
FROM Employees
INNER JOIN Employees AS Boss ON Employees.BossID = Boss.EmployeeID
INNER JOIN Employees AS BigBoss ON Boss.BossID = BigBoss.EmployeeID

And you'd get the following:

BigBoss           Boss           Employee
Denis Eaton-Hogg  Bobbi Flekman  Ian Faith
Bobbi Flekman     Ian Faith      David St. Hubbins
Bobbi Flekman     Ian Faith      Nigel Tufnel
Bobbi Flekman     Ian Faith      Derek Smalls

For each level, you'd need to join the table to itself...not an attractive option if you have 5 or more levels! It would be great if the table could join to itself as many times as needed. This is called a recursive join, and though some database products support it (Oracle has the CONNECT BY syntax), SQL Server, as of this writing, is not one of them.
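
For comparison only: SQL Server 2005 and later added recursive common table expressions, which express this kind of recursive join directly. The sketch below is not the method this article develops, and the CTE name OrgChart is made up for the example.

```sql
-- Recursive CTE (SQL Server 2005 and later), shown for comparison only.
WITH OrgChart (EmployeeID, Name, Depth) AS (
    -- Anchor member: the employee with no boss.
    SELECT EmployeeID, Name, 0
    FROM Employees
    WHERE BossID IS NULL
    UNION ALL
    -- Recursive member: anyone whose boss is already in the result.
    SELECT E.EmployeeID, E.Name, O.Depth + 1
    FROM Employees AS E
    INNER JOIN OrgChart AS O ON E.BossID = O.EmployeeID
)
SELECT EmployeeID, Name, Depth
FROM OrgChart
ORDER BY Depth, EmployeeID;
```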

If you look in Books Online under "expanding hierarchies" you'll find a stored procedure that runs through an adjacency table to expand the hierarchy. While it works, it's a procedural method that requires a stack (using a temp table) and can take a while to run with large hierarchies. It also PRINTs out the indented list, so you'd need to modify it to use ANOTHER temp table if you wanted the results as a table/query.

If you've followed Joe Celko's columns or bought his books, you know he recommends the nested set model for representing trees in SQL (he's posted it on SQL Team a few times). It's very well detailed in the following articles, Parts I, II, III, and IV, and also in his book, SQL For Smarties, and I recommend checking it out. It's very efficient and makes it extremely easy to pull out trees/subtrees from the table.

However (you knew this was coming!), one of the issues I have with nested sets is the complexity required to do relatively simple tasks, like adding, deleting, or moving nodes in the tree. Even finding an employee's immediate supervisor or subordinates requires 3 self-joins AND a subquery! Joe admits this shortcoming in his book...and it's interesting that the solution ONLY appears in his book; I've never seen him post it online.

Although there's a very seductive logic to nested sets, and it's easy to do complicated tree operations with them, I find them less intuitive than the adjacency model. It's harder for me to visualize a hierarchy or org chart with them. You may be able to use them more easily than I can, but if you also find them daunting, read on.

So how do you represent a hierarchy using adjacency while avoiding recursion wherever possible? It's pretty easy really...you build it and store it in the table! (I posted this method in a thread a while back, and I'm elaborating on it here.)

Here's the table definition for the Tree:

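CREATE TABLE Tree (
    Node int NOT NULL IDENTITY(100, 1),  -- identifier of this node
    ParentNode int,                      -- Node value of the parent; NULL for the root
    EmployeeID int NOT NULL,             -- the employee stored at this node
    Depth tinyint,                       -- number of levels below the root (root = 0)
    Lineage varchar(255) )               -- Node values of all ancestors, e.g. /100/101/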

I'm keeping the Tree table separate for a few good reasons I'll discuss later, but you could simply add the Depth and Lineage columns to the Employees table above, and substitute BossID for ParentNode. (I also didn't really WANT to use an identity column, but most people will anyway.) The terms "node" and "lineage" might seem unfamiliar, but I wanted to generalize them a little more than "child", "parent" and "hierarchy".
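
If you go the single-table route instead, the change might look like the sketch below. It assumes the Employees table shown earlier; the column types are copied from the Tree definition.

```sql
-- Single-table variant: EmployeeID plays the role of Node
-- and BossID plays the role of ParentNode.
ALTER TABLE Employees
ADD Depth tinyint NULL,
    Lineage varchar(255) NULL;
```

In that variant the Lineage values would be built from BossID values rather than from separate Node values.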

Based on the Employees table, here's how the Tree will be filled:

Node  ParentNode  EmployeeID  Depth  Lineage
100   NULL        1001        NULL   NULL
101   100         1002        NULL   NULL
102   101         1003        NULL   NULL
103   102         1004        NULL   NULL
104   102         1005        NULL   NULL
105   102         1006        NULL   NULL
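
The statements that fill the table aren't shown here, so the following is just one way it could be done, assuming the Employees and Tree tables defined earlier. It relies on SQL Server assigning identity values in ORDER BY order for an INSERT...SELECT, so the Node values follow EmployeeID order.

```sql
-- Step 1: one node per employee; Node comes from the IDENTITY column,
-- assigned in EmployeeID order (100 for 1001, 101 for 1002, and so on).
INSERT INTO Tree (EmployeeID)
SELECT EmployeeID
FROM Employees
ORDER BY EmployeeID;

-- Step 2: translate each employee's BossID into the boss's Node value.
-- The top-level employee has no boss, so its ParentNode stays NULL.
UPDATE T
SET T.ParentNode = P.Node
FROM Tree AS T
INNER JOIN Employees AS E ON E.EmployeeID = T.EmployeeID
INNER JOIN Tree AS P ON P.EmployeeID = E.BossID;
```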

This will only need to be done once, and afterwards you won't need to maintain the BossID column in the Employees table. The next part is to find the root node of the tree, also known as the top-level, or big boss man, etc. in an org chart. That's the node that has no parent (Null), so we will start there and set the Lineage column as the root:

UPDATE Tree SET Lineage='/', Depth=0 WHERE ParentNode Is Null

Once that's done, we can then update the rows that are immediate children of the root node:

UPDATE T SET T.depth = P.Depth + 1, 
T.Lineage = P.Lineage + Ltrim(Str(T.ParentNode,6,0)) + '/' 
FROM Tree AS T 
INNER JOIN Tree AS P ON (T.ParentNode=P.Node) 
WHERE P.Depth>=0 
AND P.Lineage Is Not Null 
AND T.Depth Is Null

In fact, we can just put a loop on this to run through all of the children, grandchildren, etc. of the tree:

WHILE EXISTS (SELECT * FROM Tree WHERE Depth Is Null) 
UPDATE T SET T.depth = P.Depth + 1, 
T.Lineage = P.Lineage + Ltrim(Str(T.ParentNode,6,0)) + '/' 
FROM Tree AS T 
INNER JOIN Tree AS P ON (T.ParentNode=P.Node) 
WHERE P.Depth>=0 
AND P.Lineage Is Not Null 
AND T.Depth Is Null
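
One caveat worth sketching: if any row can't be reached from the root (say its ParentNode points to a node that doesn't exist), Depth stays NULL forever and the loop above never exits. A possible variant, and only an assumption about how you might guard against that, stops as soon as a pass updates nothing and then lists the leftovers. The variable name @Rows is made up for the example.

```sql
DECLARE @Rows int;
SET @Rows = 1;

-- Keep going only while the previous pass actually filled in some rows.
WHILE @Rows > 0
BEGIN
    UPDATE T
    SET T.Depth = P.Depth + 1,
        T.Lineage = P.Lineage + LTRIM(STR(T.ParentNode, 6, 0)) + '/'
    FROM Tree AS T
    INNER JOIN Tree AS P ON T.ParentNode = P.Node
    WHERE P.Depth >= 0
      AND P.Lineage IS NOT NULL
      AND T.Depth IS NULL;

    SET @Rows = @@ROWCOUNT;  -- rows changed by the UPDATE just above
END;

-- Any rows returned here could not be connected to the root.
SELECT Node, ParentNode, EmployeeID
FROM Tree
WHERE Depth IS NULL;
```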

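A note on Ltrim: the two-argument form ltrim(x, y), as found in Oracle, strips characters from the left of x one at a time for as long as each character appears in y, and stops at the first character of x that is not in y. The one-argument Ltrim in the SQL Server code above simply removes leading spaces, which is needed because Str() pads its result with spaces on the left.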
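
To see what each piece of the Lineage expression contributes, here is a small standalone query using the root's Node value, 100:

```sql
SELECT STR(100, 6, 0) AS Padded,                      -- '   100' (right-justified in 6 characters)
       LTRIM(STR(100, 6, 0)) AS Trimmed,              -- '100'
       '/' + LTRIM(STR(100, 6, 0)) + '/' AS Segment;  -- '/100/'
```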

Don't worry about the loop, it runs once for each level in the hierarchy...10 loops for 10 levels or generations. For a corporation, 10 layers of management is pretty deep; for a family tree, you could trace an American family back to the Revolutionary War! And under normal circumstances, you'd also only have to run this procedure once. The final result is:

Node  ParentNode  EmployeeID  Depth  Lineage
100   NULL        1001        0      /
101   100         1002        1      /100/
102   101         1003        2      /100/101/
103   102         1004        3      /100/101/102/
104   102         1005        3      /100/101/102/
105   102         1006        3      /100/101/102/

You'll notice that for each node, the entire lineage back to the root is stored. This means that finding someone's boss, or their boss' boss, doesn't require any self-joins or recursion to create an indented list. In fact, it can be accomplished with a single SELECT!

SELECT Space(T.Depth*2) + E.Name AS Name
FROM Employees E 
INNER JOIN Tree T ON E.EmployeeID=T.EmployeeID
ORDER BY T.Lineage + Ltrim(Str(T.Node,6,0))

If you kept everything in one table you would not even need the JOIN! The Depth column comes in handy for performing the indent by using the Space() function. Ordering by Lineage plus the node's own value sorts the org chart properly, with each subordinate nested underneath their parent. Sort order among siblings is maintained by Node values, and can be changed simply by updating the node value. Unlike the nested set model, inserting a new node or deleting an existing one does not affect the rest of the tree. The lineage column can be maintained automatically using triggers, so moving or promoting a node is a no-brainer.
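
The same idea extends to pulling out just a piece of the hierarchy. As a sketch, assuming the Tree and Employees tables above, the query below lists one node and everything underneath it by looking for that node's value inside Lineage. The variable @Root and the choice of node 101 are made up for the example.

```sql
DECLARE @Root int;
SET @Root = 101;  -- Bobbi Flekman's node in the sample data

SELECT SPACE(T.Depth * 2) + E.Name AS Name
FROM Tree AS T
INNER JOIN Employees AS E ON E.EmployeeID = T.EmployeeID
WHERE T.Node = @Root                                            -- the starting node itself
   OR T.Lineage LIKE '%/' + LTRIM(STR(@Root, 6, 0)) + '/%'      -- anything with it as an ancestor
ORDER BY T.Lineage + LTRIM(STR(T.Node, 6, 0));
```

With the sample data this returns Bobbi Flekman and everyone under her, still indented by their depth in the full tree.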

Reposted from 2012-for-review.iteye.com/blog/1415991