Closure tables for hierarchical structure

5.2.2 Closure tables 

A closure table is a SQL table that contains one record for every employee/supervisor relationship, regardless of depth, including a record that pairs each employee with itself at distance 0. (In mathematical terms, it is the 'reflexive transitive closure' of the employee/supervisor relationship.) The distance column is not strictly required, but it makes the table easier to populate.

employee_closure
supervisor_id employee_id distance
1 1 0
1 2 1
1 3 2
1 4 1
1 5 3
1 6 2
2 2 0
2 3 1
2 5 2
2 6 1
3 3 0
3 5 1
4 4 0
5 5 0
6 6 0
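To see where these rows come from, the following is a minimal sketch in Python that derives them from a parent-child mapping. The SUPERVISOR_OF mapping and the closure_rows function are hypothetical names introduced for illustration; the mapping is inferred from the distance-1 rows of the table above, with 0 standing for "no supervisor".

```python
# Hypothetical base data inferred from the distance-1 rows above:
# employee_id -> supervisor_id, where 0 means "no supervisor".
SUPERVISOR_OF = {1: 0, 2: 1, 3: 2, 4: 1, 5: 3, 6: 2}


def closure_rows(supervisor_of, null_parent=0):
    """Return sorted (supervisor_id, employee_id, distance) rows."""
    rows = []
    for employee_id in supervisor_of:
        # Walk up the chain of supervisors, starting with the employee
        # itself (distance 0), until the top of the hierarchy is passed.
        ancestor, distance = employee_id, 0
        while ancestor != null_parent:
            rows.append((ancestor, employee_id, distance))
            ancestor = supervisor_of[ancestor]
            distance += 1
    return sorted(rows)


if __name__ == "__main__":
    for row in closure_rows(SUPERVISOR_OF):
        print(*row)
```

The sketch assumes the hierarchy contains no cycles; a cycle would make the upward walk loop forever.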

In the catalog XML, the <Closure> element maps the level onto a <Table>:

<Dimension name="Employees" foreignKey="employee_id">
  <Hierarchy hasAll="true" allMemberName="All Employees" primaryKey="employee_id">
    <Table name="employee"/>
    <Level name="Employee Id" uniqueMembers="true" type="Numeric"
        column="employee_id" nameColumn="full_name" parentColumn="supervisor_id" nullParentValue="0">
      <Closure parentColumn="supervisor_id" childColumn="employee_id">
        <Table name="employee_closure"/>
      </Closure>
      <Property name="Marital Status" column="marital_status"/>
      <Property name="Position Title" column="position_title"/>
      <Property name="Gender" column="gender"/>
      <Property name="Salary" column="salary"/>
      <Property name="Education Level" column="education_level"/>
      <Property name="Management Role" column="management_role"/>
    </Level>
  </Hierarchy>
</Dimension>

This table allows totals to be evaluated in pure SQL. Even though this introduces an extra table into the query, database optimizers are very good at handling joins. I recommend that you declare both supervisor_id and employee_id NOT NULL, and index them as follows:

CREATE UNIQUE INDEX employee_closure_pk ON employee_closure (
   supervisor_id,
   employee_id);
CREATE INDEX employee_closure_emp ON employee_closure (
   employee_id);
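As an illustration of how a total can be evaluated with a single join, here is a small sketch that uses Python's sqlite3 module as a stand-in database. The salary table, its amounts, and the rolled_up_salaries function are hypothetical; the closure rows are copied from the table earlier in this section. It is not the SQL that Mondrian itself generates, only the general shape of a rollup through a closure table.

```python
import sqlite3

# Closure rows copied from the employee_closure table shown earlier.
CLOSURE_ROWS = [
    (1, 1, 0), (1, 2, 1), (1, 3, 2), (1, 4, 1), (1, 5, 3), (1, 6, 2),
    (2, 2, 0), (2, 3, 1), (2, 5, 2), (2, 6, 1),
    (3, 3, 0), (3, 5, 1),
    (4, 4, 0), (5, 5, 0), (6, 6, 0),
]

# Hypothetical per-employee amounts: (employee_id, amount).
SALARIES = [(1, 100), (2, 50), (3, 30), (4, 40), (5, 10), (6, 20)]


def rolled_up_salaries():
    """Total salary under each supervisor, including the supervisor."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employee_closure (
            supervisor_id INTEGER NOT NULL,
            employee_id   INTEGER NOT NULL,
            distance      INTEGER NOT NULL);
        CREATE UNIQUE INDEX employee_closure_pk
            ON employee_closure (supervisor_id, employee_id);
        CREATE INDEX employee_closure_emp
            ON employee_closure (employee_id);
        CREATE TABLE salary (
            employee_id INTEGER NOT NULL,
            amount      INTEGER NOT NULL);
    """)
    conn.executemany(
        "INSERT INTO employee_closure VALUES (?, ?, ?)", CLOSURE_ROWS)
    conn.executemany("INSERT INTO salary VALUES (?, ?)", SALARIES)
    # One join against the closure table rolls every descendant's
    # amount up to each of its supervisors.
    rows = conn.execute("""
        SELECT c.supervisor_id, SUM(s.amount)
          FROM employee_closure AS c
          JOIN salary AS s ON s.employee_id = c.employee_id
         GROUP BY c.supervisor_id
         ORDER BY c.supervisor_id
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for supervisor_id, total in rolled_up_salaries():
        print(supervisor_id, total)
```

Because the closure table includes the distance-0 self-pairs, each supervisor's own amount is part of its total without any extra logic.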

5.2.3 Populating closure tables 

The table needs to be re-populated whenever the hierarchy changes, and it is the application's responsibility to do so — Mondrian does not do this!

If you are using Pentaho Data Integration (Kettle), there is a special step that populates closure tables as part of the ETL process. Further details are available in the Pentaho Data Integration wiki.

 

Closure Generator step in Pentaho Data Integration


If you are not using Pentaho Data Integration, you can populate the table yourself using SQL. Here is an example of a MySQL stored procedure that populates a closure table.

DELIMITER //

CREATE PROCEDURE populate_employee_closure()
BEGIN
  -- named cur_distance so it cannot be confused with the distance column
  DECLARE cur_distance INT DEFAULT 0;
  TRUNCATE TABLE employee_closure;
  -- seed closure with self-pairs (distance 0)
  INSERT INTO employee_closure (supervisor_id, employee_id, distance)
    SELECT employee_id, employee_id, cur_distance
      FROM employee;

  -- for each pair (root, leaf) in the closure,
  -- add (root, leaf->child) from the base table

  REPEAT
    SET cur_distance = cur_distance + 1;
    INSERT INTO employee_closure (supervisor_id, employee_id, distance)
      SELECT employee_closure.supervisor_id, employee.employee_id, cur_distance
        FROM employee_closure, employee
          WHERE employee_closure.employee_id = employee.supervisor_id
          AND employee_closure.distance = cur_distance - 1;
  -- stop once a pass adds no new pairs
  UNTIL ROW_COUNT() = 0
  END REPEAT;
END //

DELIMITER ;
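On databases that support recursive common table expressions, the same table can be built in a single statement instead of a loop. The following sketch swaps in that technique, using Python's sqlite3 module as a stand-in database. The EMPLOYEES data and the populate_closure and build_demo functions are hypothetical, the root employee is assumed to carry supervisor_id 0, and the hierarchy is assumed to be free of cycles.

```python
import sqlite3

# Hypothetical base table rows: (employee_id, supervisor_id).
# The root employee uses supervisor_id 0, meaning "no supervisor".
EMPLOYEES = [(1, 0), (2, 1), (3, 2), (4, 1), (5, 3), (6, 2)]


def populate_closure(conn):
    """Rebuild employee_closure from employee with a recursive CTE."""
    conn.execute("DELETE FROM employee_closure")
    conn.execute("""
        WITH RECURSIVE closure (supervisor_id, employee_id, distance) AS (
            -- seed: every employee paired with itself at distance 0
            SELECT employee_id, employee_id, 0 FROM employee
            UNION ALL
            -- step: extend each known pair by one level of reports
            SELECT closure.supervisor_id,
                   employee.employee_id,
                   closure.distance + 1
              FROM closure
              JOIN employee
                ON employee.supervisor_id = closure.employee_id
        )
        INSERT INTO employee_closure (supervisor_id, employee_id, distance)
        SELECT supervisor_id, employee_id, distance FROM closure
    """)
    conn.commit()


def build_demo():
    """Create both tables in memory and return the closure rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employee (
            employee_id   INTEGER PRIMARY KEY,
            supervisor_id INTEGER NOT NULL);
        CREATE TABLE employee_closure (
            supervisor_id INTEGER NOT NULL,
            employee_id   INTEGER NOT NULL,
            distance      INTEGER NOT NULL);
    """)
    conn.executemany("INSERT INTO employee VALUES (?, ?)", EMPLOYEES)
    populate_closure(conn)
    rows = conn.execute("""
        SELECT supervisor_id, employee_id, distance
          FROM employee_closure
         ORDER BY supervisor_id, employee_id
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in build_demo():
        print(*row)
```

Whichever approach is used, the rebuild has to be rerun after every change to the hierarchy, so it is usually scheduled as part of the load that updates the employee table.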


Origin www.cnblogs.com/hejiang/p/11307759.html