19. Interpreting the ClustrixDB execution plan

The EXPLAIN statement is used to display how the ClustrixDB query optimizer, known as Sierra, will execute INSERT, SELECT, UPDATE, and DELETE statements. EXPLAIN output has three columns:

  1. Operation - an internal operator that performs one piece of the work
  2. Est. Cost - an estimated cost metric proportional to the wall-clock time the operation requires
  3. Est. Rows - Sierra's estimate of the number of rows the operator will output

Together these describe the physical plan Sierra chose to realize the declarative SQL statement. In most cases, each row of EXPLAIN output represents a single operation that collects input from, or sends output to, the operation on the next line of indentation. In other words, most EXPLAIN output can be read as the most-indented statements executing first, with the whole process culminating in the least-indented statement.

 

Creating the data

To demonstrate EXPLAIN output, we will define and exercise a database that tracks customers, their orders, and the products they buy. This example is for illustration only and is not necessarily a good approach to designing your application's data - a good data model will depend on your business needs and usage patterns. This model focuses on the relationships rather than on full data consistency.

We will start with this basic data model. (Download the scripts used here.)

sql> CREATE TABLE customers (
         c_id INTEGER AUTO_INCREMENT
       , name VARCHAR(100) 
       , address VARCHAR(100)
       , city VARCHAR(50)
       , state CHAR(2)
       , zip CHAR(10)
       , PRIMARY KEY c_pk (c_id)
      ) /*$ SLICES=3 */;
sql> CREATE TABLE products (
         p_id INTEGER AUTO_INCREMENT
       , name VARCHAR(100)
       , price DECIMAL(5,2)
       , PRIMARY KEY p_pk (p_id)
      ) /*$ SLICES=3 */;
sql> CREATE TABLE orders (
         o_id INTEGER AUTO_INCREMENT
       , c_id INTEGER
       , created_on DATETIME
       , PRIMARY KEY o_pk (o_id)
       , KEY c_fk (c_id)
       , CONSTRAINT FOREIGN KEY c_fk (c_id) REFERENCES customers (c_id)
      ) /*$ SLICES=3 */;
sql> CREATE TABLE order_items (
         oi_id INTEGER AUTO_INCREMENT
       , o_id INTEGER
       , p_id INTEGER
       , PRIMARY KEY oi_pk (oi_id)
       , KEY o_fk (o_id)
       , KEY p_fk (p_id)
       , CONSTRAINT FOREIGN KEY order_fk (o_id) REFERENCES orders (o_id)
       , CONSTRAINT FOREIGN KEY product_fk (p_id) REFERENCES products (p_id)
      )  /*$ SLICES=3 */;

After populating the database, there are 1,000 customers, 100 products, 4,000 orders and around 10,000 order_items.
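
The population scripts themselves are only linked from the original article. As a stand-in, a hypothetical Python sketch along the following lines would generate seed data of a comparable shape - the row counts come from the text above, while every name, price, and date is invented:

# Hypothetical seed-data generator for the schema above; prints SQL to stdout.
# Assumes AUTO_INCREMENT assigns ids 1..N in insertion order; all values are
# invented for illustration.
import random

random.seed(42)

for c in range(1, 1001):          # 1,000 customers
    print(f"INSERT INTO customers (name, address, city, state, zip) "
          f"VALUES ('Customer {c}', '{random.randrange(1, 9999)} Main St.', "
          f"'Springfield', 'OR', '974{random.randrange(10, 99)}');")

for p in range(1, 101):           # 100 products
    print(f"INSERT INTO products (name, price) "
          f"VALUES ('Product {p}', {random.uniform(1, 100):.2f});")

for o in range(1, 4001):          # 4,000 orders, ~2.5 items each on average
    print(f"INSERT INTO orders (c_id, created_on) "
          f"VALUES ({random.randrange(1, 1001)}, NOW());")
    for _ in range(random.randrange(1, 5)):
        print(f"INSERT INTO order_items (o_id, p_id) "
              f"VALUES ({o}, {random.randrange(1, 101)});")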

 

Viewing the execution plan

Let's start with a simple query that returns all the information about all of our customers.

sql> EXPLAIN SELECT * FROM customers; 
+------------------------------------------------------+-----------+-----------+
| Operation                                            | Est. Cost | Est. Rows |
+------------------------------------------------------+-----------+-----------+
| stream_combine                                       |    712.70 |   1000.00 |
|   index_scan 1 := customers.__idx_customers__PRIMARY |    203.90 |    333.33 |
+------------------------------------------------------+-----------+-----------+

Typically, you can start reading EXPLAIN output at the innermost indentation and work your way back out, ending at the first row of output. Reading the explain output above, the first thing to happen is an index_scan operation on the customers primary key index; the name "1" is assigned to the rows it reads. In this example the name is not used again. Note that while there are 1,000 rows in the customers relation, the estimated row count is about 333. This is because each index_scan reads a subset of the data distributed across the cluster, which we call a slice. In this schema the relation has three slices, so three index_scan operations run in parallel to collect the customer information. The output of the three index_scans is sent to the stream_combine operator which, as the name implies, combines the streams into a single stream that is passed to the client. The stream_combine operator works by simply copying the entire contents of each input stream into its output, continuing until all streams have been combined.
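
As a mental model of this behavior, the following Python sketch imitates three per-slice index_scans feeding a stream_combine. It is an illustration of the semantics described above, not ClustrixDB code:

# Three per-slice scans run in parallel; stream_combine just copies each
# input stream to its output until all streams are exhausted.
from itertools import chain

slices = [range(1, 334), range(334, 667), range(667, 1001)]  # ~333 rows each

def index_scan(slice_rows):
    yield from slice_rows                 # one scan per slice

def stream_combine(streams):
    return chain.from_iterable(streams)   # copy every stream to the output

rows = list(stream_combine(index_scan(s) for s in slices))
print(len(rows))  # 1000 rows total, roughly 333 per scan
# (In the real operator the interleaving depends on which stream's data
# arrives first; here the streams are simply consumed in order.)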

Let's see what happens if we add a limit to the query.

sql>  EXPLAIN SELECT * FROM customers LIMIT 10;  
+----------------------------------------------------------+-----------+-----------+
| Operation                                                | Est. Cost | Est. Rows |
+----------------------------------------------------------+-----------+-----------+
| row_limit LIMIT := param(0)                              |    615.70 |     10.00 |
|   stream_combine                                         |    615.70 |     30.00 |
|     row_limit LIMIT := param(0)                          |    203.90 |     10.00 |
|       index_scan 1 := customers.__idx_customers__PRIMARY |    203.90 |    333.33 |
+----------------------------------------------------------+-----------+-----------+

This execution plan is substantially the same as the previous one, with row_limit operators added. The row_limit operator takes an input stream and closes it once the limit (and offset) has been satisfied. Since there are three parallel streams, Sierra "pushes down" a copy of the row_limit operator to each index_scan, since there is no need to read more than 10 rows from any one slice. After the streams are combined, the output is limited once more to produce the 10 rows the client requested.
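
The pushdown can be sketched the same way: each per-slice stream is closed after 10 rows, and the combined stream is limited once more. Again, this is an illustrative model only:

# row_limit closes its input stream once the limit is satisfied. A copy is
# "pushed down" below each per-slice scan, then applied again after combining.
from itertools import chain, islice

slices = [range(1, 334), range(334, 667), range(667, 1001)]

def row_limit(stream, limit):
    return islice(stream, limit)     # stop pulling rows after `limit`

limited = (row_limit(s, 10) for s in slices)       # at most 10 rows per slice
result = list(row_limit(chain.from_iterable(limited), 10))
print(len(result))  # 10 rows, and never more than 30 read in total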

Now suppose we want to sort the results.

sql> EXPLAIN SELECT * FROM customers ORDER BY c_id;
+------------------------------------------------------+-----------+-----------+
| Operation                                            | Est. Cost | Est. Rows |
+------------------------------------------------------+-----------+-----------+
| stream_merge KEYS=[(1 . "c_id") ASC]                 |    816.70 |   1000.00 |
|   index_scan 1 := customers.__idx_customers__PRIMARY |    203.90 |    333.33 |
+------------------------------------------------------+-----------+-----------+

This plan is similar to the unsorted version, except that a stream_merge merges the results instead of a stream_combine. The stream_merge operator works by pulling the next row, in the order specified, from across all of the incoming streams into its output stream. In this case, since the order is ascending by column c_id, stream_merge compares the streams and pops whichever row has the smallest c_id.
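
Since each per-slice stream is already sorted, stream_merge behaves like a k-way merge, which Python's heapq.merge models directly. An illustrative sketch:

# stream_merge: repeatedly pop the smallest head row across the sorted
# per-slice streams until every stream is drained.
import heapq

slice_streams = [iter([1, 4, 7]), iter([2, 5, 8]), iter([3, 6, 9])]  # by c_id

merged = list(heapq.merge(*slice_streams))
print(merged)  # [1, 2, 3, 4, 5, 6, 7, 8, 9] - one globally ordered stream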

In a ClustrixDB cluster, data is usually hash-distributed across the nodes, and stream_combine forwards whichever data arrives first. The results can therefore vary from run to run, unlike a non-distributed database that always reads the data in the same order. For example:

sql> SELECT * FROM customers LIMIT 10; 
+------+---------------------+--------------+-------------+-------+-------+
| c_id | name                | address      | city        | state | zip   |
+------+---------------------+--------------+-------------+-------+-------+
|    1 | Chanda Nordahl      | 4280 Maple   | Greenville  | WA    | 98903 |
|    2 | Dorinda Tomaselli   | 8491 Oak     | Centerville | OR    | 97520 |
|    9 | Minerva Donnell     | 4644 1st St. | Springfield | WA    | 98613 |
|   21 | Chanda Nordahl      | 5090 1st St. | Fairview    | OR    | 97520 |
|    4 | Dorinda Hougland    | 8511 Pine    | Springfield | OR    | 97477 |
|    6 | Zackary Velasco     | 6296 Oak     | Springfield | OR    | 97286 |
|   11 | Tennie Soden        | 7924 Maple   | Centerville | OR    | 97477 |
|    3 | Shawnee Soden       | 4096 Maple   | Ashland     | WA    | 98035 |
|   24 | Riley Soden         | 7470 1st St. | Greenville  | WA    | 98613 |
|   12 | Kathaleen Tomaselli | 8926 Maple   | Centerville | OR    | 97477 |
+------+---------------------+--------------+-------------+-------+-------+

Repeating this query may return different results. By adding an ORDER BY clause to the statement, we can guarantee consistent results. To make things more interesting, we will also change the sort from ascending to descending.

sql> EXPLAIN SELECT * FROM customers ORDER BY c_id DESC LIMIT 10; 
+------------------------------------------------------------------+-----------+-----------+
| Operation                                                        | Est. Cost | Est. Rows |
+------------------------------------------------------------------+-----------+-----------+
| row_limit LIMIT := param(0)                                      |    622.70 |     10.00 |
|   stream_merge KEYS=[(1 . "c_id") DSC]                           |    622.70 |     30.00 |
|     row_limit LIMIT := param(0)                                  |    203.90 |     10.00 |
|       index_scan 1 := customers.__idx_customers__PRIMARY REVERSE |    203.90 |    333.33 |
+------------------------------------------------------------------+-----------+-----------+

From this execution plan we can see that the database will read the primary key index in reverse, in parallel; stop reading each slice once 10 rows are found, via the row_limit operator pushed below each index_scan; merge those streams with the stream_merge operator, which pops the row with the largest c_id from across the streams; and produce the final 10 rows by applying the row_limit operator once more.

 

JOIN execution plans

So far, we have examined reads of only a single relation. Part of Sierra's job is to compare the cost of different join orders and choose the lowest-cost plan. The following query produces one row per order_items row, carrying the order id along with the name and price of the product ordered.

 1.  | sql> EXPLAIN SELECT o_id, name, price FROM orders o NATURAL JOIN order_items NATURAL JOIN products;
 2.  +-------------------------------------------------------------------------------+-----------+-----------+
 3.  | Operation                                                                     | Est. Cost | Est. Rows |
 4.  +-------------------------------------------------------------------------------+-----------+-----------+
 5.  | nljoin                                                                        |  95339.90 |   9882.00 |
 6.  |   nljoin                                                                      |  50870.90 |   9882.00 |
 7.  |     stream_combine                                                            |     82.70 |    100.00 |
 8.  |       index_scan 3 := products.__idx_products__PRIMARY                        |     23.90 |     33.33 |
 9.  |     nljoin                                                                    |    507.88 |     98.82 |
10.  |       index_scan 2 := order_items.p_fk, p_id = 3.p_id                         |     63.19 |     98.82 |
11.  |       index_scan 2 := order_items.__idx_order_items__PRIMARY, oi_id = 2.oi_id |      4.50 |      1.00 |
12.  |   index_scan 1 := orders.__idx_orders__PRIMARY, o_id = 2.o_id                 |      4.50 |      1.00 |
13.  +-------------------------------------------------------------------------------+-----------+-----------+

This plan is slightly more complex and requires a bit more explanation to see what is happening.

  1. Given the indentation, we can deduce that the index_scan operations happen first. Reading the explain output, the products are read via their primary key index on line 8, the matching order_items are found via the p_fk index using p_id on line 10, and those order_items rows are read via the order_items primary key index using oi_id on line 11. The products are collected by the stream_combine operator, while the order_items information is gathered by an nljoin of the order_items p_fk and primary key indexes.
  2. The nljoin operator is a nested loop join, which implements the relational join (a sketch of the idea follows the stage table below).
  3. The stream_combine of products is then joined to the order_items nljoin by another nljoin.
  4. Finally, the orders are read via their primary key using o_id, and all of the results are brought together in the outermost nljoin.

Checking the final row estimate of the outermost nljoin tells us that, for this particular data set, Sierra expects the query to produce about 9,882 rows.

The following table describes how this plan is executed across the cluster, stage by stage:

+-------+--------------------+-----------------------------+------------------------------+--------------------------------------------------+
| Stage | Operation          | Lookup/Scan Representation  | Lookup/Scan Key              | Run on Node                                      |
+-------+--------------------+-----------------------------+------------------------------+--------------------------------------------------+
| 1     | Lookup and Forward | __idx_products__PRIMARY     | none (all nodes with slices) | The node where the query begins                  |
| 2.1   | Index Scan         | __idx_products__PRIMARY     | none, all rows               | Nodes with slices of __idx_products__PRIMARY     |
| 2.2   | Lookup and Forward | p_fk                        | p_id = 3.p_id                | same                                             |
| 3.1   | Index Scan         | p_fk                        | p_id = 3.p_id                | Nodes with slices of p_fk                        |
| 3.2   | Join               |                             |                              | same                                             |
| 3.3   | Lookup and Forward | __idx_order_items__PRIMARY  | oi_id = 2.oi_id              | same                                             |
| 4.1   | Index Scan         | __idx_order_items__PRIMARY  | oi_id = 2.oi_id              | Nodes with slices of __idx_order_items__PRIMARY  |
| 4.2   | Join               |                             |                              | same                                             |
| 4.3   | Lookup and Forward | __idx_orders__PRIMARY       | o_id = 2.o_id                | same                                             |
| 5.1   | Index Scan         | __idx_orders__PRIMARY       | o_id = 2.o_id                | Nodes with slices of __idx_orders__PRIMARY       |
| 5.2   | Join               |                             |                              | same                                             |
| 5.3   | Lookup and Forward | GTM                         | none - single GTM node       | same                                             |
| 6     | Return to user     |                             |                              | The node where the query began                   |
+-------+--------------------+-----------------------------+------------------------------+--------------------------------------------------+
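
As a rough model of the nljoin operators in this plan - an illustrative sketch, not engine code - a nested loop join pairs each row of its outer input with whatever rows an index lookup on the inner side returns:

# nljoin: for every outer row, probe the inner side and emit joined rows.
def nljoin(outer, inner_lookup):
    for row in outer:
        for match in inner_lookup(row):
            yield {**row, **match}

products = [{"p_id": 1, "name": "widget", "price": 9.99}]
order_items = [{"oi_id": 10, "o_id": 100, "p_id": 1},
               {"oi_id": 11, "o_id": 101, "p_id": 1}]

def items_for_product(row):          # stands in for the p_fk index lookup
    return [oi for oi in order_items if oi["p_id"] == row["p_id"]]

for joined in nljoin(products, items_for_product):
    print(joined["o_id"], joined["name"], joined["price"])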

 

Locking

ClustrixDB uses two-phase locking (2PL) for concurrency control, to ensure serializability. Within a transaction, Sierra plans lock operations for writes and updates. First, let's look at a simple update that increases by 1 the price of every product whose price is greater than 10.

sql> EXPLAIN UPDATE products SET price = price + 1 WHERE price > 10;
+-------------------------------------------------------------------------+-----------+-----------+
| Operation                                                               | Est. Cost | Est. Rows |
+-------------------------------------------------------------------------+-----------+-----------+
| table_update products                                                   |   1211.58 |     81.00 |
|   compute expr0 := (1.price + param(0))                                 |   1211.58 |     81.00 |
|     filter (1.price > param(1))                                         |   1210.50 |     81.00 |
|       nljoin                                                            |   1208.70 |     90.00 |
|         pk_lock "products" exclusive                                    |    803.70 |     90.00 |
|           stream_combine                                                |     83.70 |     90.00 |
|             filter (1.price > param(1))                                 |     24.57 |     30.00 |
|               index_scan 1 := products.__idx_products__PRIMARY          |     23.90 |     33.33 |
|         index_scan 1 := products.__idx_products__PRIMARY, p_id = 1.p_id |      4.50 |      1.00 |
+-------------------------------------------------------------------------+-----------+-----------+

In this query plan:

  1. The products are read with an index_scan of the primary key index, and the output is sent to a "pushed-down" filter, which discards every row whose price is not greater than 10.
  2. Those outputs are combined with stream_combine, and the combined stream is distributed across the cluster so that the pk_lock operator can acquire an exclusive lock on the primary key of each row found.
  3. The primary key index is then read by another index_scan using the p_id that was found.
  4. Since the price of a row found by the first index_scan may have changed between reading the row and acquiring the lock, the filter is applied again (see the sketch after this list).
  5. Matching rows are sent to the compute operator, which calculates the new price, and the new values are written by the table_update operator.
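
A toy model shows why the filter must run twice: between the unlocked read and the lock acquisition, another transaction may change the row. This is purely illustrative - ClustrixDB's actual locking is far more involved:

# Toy model of "filter, lock, re-read, filter again" from the plan above.
products = {1: {"price": 12.00}, 2: {"price": 8.00}}
locks = set()

candidates = [p_id for p_id, row in products.items()
              if row["price"] > 10]          # first (unlocked) filter pass

products[1]["price"] = 9.00                  # a concurrent update sneaks in

for p_id in candidates:
    locks.add(p_id)                          # pk_lock exclusive
    row = products[p_id]                     # re-read via the primary key
    if row["price"] > 10:                    # filter applied a second time
        row["price"] += 1                    # compute + table_update
print(products)  # row 1 no longer qualifies, so it is correctly left alone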

In some cases, acquiring a separate lock for every row being changed is much more expensive than simply acquiring a single table lock and modifying all qualifying rows. The Sierra optimizer considers using a table lock instead of row locks during plan exploration and chooses the lowest-cost plan. In this example, 100 rows is small enough that a table lock would not normally be used, but if Sierra did choose a table lock, the plan would look like the following.

sql> EXPLAIN UPDATE products SET price = price + 1; 
+------------------------------------------------------------+-----------+-----------+
| Operation                                                  | Est. Cost | Est. Rows |
+------------------------------------------------------------+-----------+-----------+
| table_locks 1                                              |   8084.03 |    100.00 |
|   table_update products                                    |     84.03 |    100.00 |
|     stream_combine                                         |     84.03 |    100.00 |
|       compute expr0 := (1.price + param(0))                |     24.34 |     33.33 |
|         table_lock "products" exclusive                    |     23.90 |     33.33 |
|           index_scan 1 := products.__idx_products__PRIMARY |     23.90 |     33.33 |
+------------------------------------------------------------+-----------+-----------+

Interestingly, the index_scan looks as though it were the input to table_lock. That is not actually the case: the table lock is acquired before any reads happen. With that in mind, we can read the plan as follows:

  1. Read all rows of the relation with index_scan.
  2. Use compute to increase the price by 1.
  3. Use stream_combine to combine those results into a single stream.
  4. Send the output to table_update to write the new values.

Sierra's table_lock operator is a helper operator whose heuristics weigh the relative cheapness of acquiring a single lock against the wall-clock cost of blocking every other update for the duration of the transaction.

 

Using indexes to improve performance

So far, we have only examined plans that read the primary key index to produce their results. We can change the indexes that are meaningful for a given workload by adding them. For example, suppose we have a business process that works better if it can consume customer information in small batches sorted by zip code. To get this information:

sql> EXPLAIN SELECT name, address, city, state, zip FROM customers ORDER BY zip LIMIT 10 OFFSET 0;
+----------------------------------------------------------+-----------+-----------+
| Operation                                                | Est. Cost | Est. Rows |
+----------------------------------------------------------+-----------+-----------+
| row_limit LIMIT := param(0)                              |   2632.70 |     10.00 |
|   sigma_sort KEYS=[(1 . "zip") ASC]                      |   2632.70 |   1000.00 |
|     stream_combine                                       |    712.70 |   1000.00 |
|       index_scan 1 := customers.__idx_customers__PRIMARY |    203.90 |    333.33 |
+----------------------------------------------------------+-----------+-----------+

This plan reads the primary key index, combines the results, and sends the rows to the sigma_sort operator. The sigma_sort operator builds a temporary container, in memory or in storage as needed, to sort the rows it finds by zip code. Once all of the rows have been sorted, they are sent on to the row_limit operator to apply the limit and offset.
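
The materialize-then-sort behavior of sigma_sort is easy to model. This sketch is illustrative only; the real operator spills to storage when the container does not fit in memory:

# sigma_sort: buffer every incoming row, sort the buffer, then stream the
# sorted rows onward (here into a limit of 10).
def sigma_sort(stream, key):
    buffered = list(stream)          # must hold all 1,000 rows before output
    buffered.sort(key=key)
    yield from buffered

customers = [{"name": f"c{i}", "zip": f"{99999 - i:05d}"} for i in range(1000)]
first_ten = list(sigma_sort(iter(customers), key=lambda r: r["zip"]))[:10]
print([r["zip"] for r in first_ten])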

If we could read the rows in zip code order, rather than reading all of the rows, sorting them by zip code, and then returning the next batch of 10, we could significantly improve performance here. To that end, we add an index on customers.zip and see how Sierra changes the execution plan.

sql> ALTER TABLE customers ADD INDEX (zip); 
sql> EXPLAIN SELECT name, address, city, state, zip FROM customers ORDER BY zip LIMIT 10 OFFSET 0;
+---------------------------------------------------------------------+-----------+-----------+
| Operation                                                           | Est. Cost | Est. Rows |
+---------------------------------------------------------------------+-----------+-----------+
| msjoin KEYS=[(1 . "zip") ASC]                                       |    674.70 |     10.00 |
|   row_limit LIMIT := param(0)                                       |    622.70 |     10.00 |
|     stream_merge KEYS=[(1 . "zip") ASC]                             |    622.70 |     30.00 |
|       row_limit LIMIT := param(0)                                   |    203.90 |     10.00 |
|         index_scan 1 := customers.zip                               |    203.90 |    333.33 |
|   index_scan 1 := customers.__idx_customers__PRIMARY, c_id = 1.c_id |      4.50 |      1.00 |
+---------------------------------------------------------------------+-----------+-----------+ 

Here, the query optimizer chooses to:

  1. Use the index_scan operator to read the customers.zip index in parallel across all of its slices.
  2. Limit the results with a "pushed-down" row_limit operator.
  3. Combine those results with the stream_merge operator, preserving the sort order.
  4. Limit the merged result with another row_limit.
  5. Use the c_id found in the zip index to read the rest of each row from the primary key index.
  6. Perform the join with the msjoin operator.

The msjoin operator is a "merge sort nested loop join," which is similar to nljoin but preserves sort order through the join. Notice that in this plan the sort order of the zip index is used as the rows are read and is maintained throughout, which eliminates the need to create a sigma container to sort the results. In other words, this plan streams all of its results as it executes, which becomes an important consideration when reading millions of rows.
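
Contrast that with the plan above: the outer rows already arrive in zip order and each inner probe is a point lookup, so the output stays sorted without ever being buffered. A sketch of the idea, not engine code:

# msjoin: a sorted outer stream joined via point lookups; output order is
# preserved, so no sigma sort container is ever built.
zip_index = [("97286", 6), ("97477", 4), ("97520", 2)]   # (zip, c_id), sorted
primary = {2: "Dorinda Tomaselli", 4: "Dorinda Hougland", 6: "Zackary Velasco"}

def msjoin(sorted_outer, lookup):
    for zip_code, c_id in sorted_outer:      # rows stream through one by one
        yield zip_code, lookup[c_id]         # primary-key point lookup

for row in msjoin(iter(zip_index), primary):
    print(row)   # emitted in zip order without any sorting step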

 

 

Aggregation

Another common task when working with a relational database is computing sums, averages, minimums, or maximums over filtered data. These queries are written by adding a GROUP BY clause to the statement, which specifies how the data should be aggregated. ClustrixDB also implements the MySQL extension to GROUP BY that allows non-aggregated columns in the output. If there is no one-to-one relationship between the GROUP BY columns and a non-aggregated column, the value of that column will come from one of the rows in the group, but exactly which row is undefined. Since our data has a one-to-one mapping between zip and state, we can have a query generate that mapping for us.

sql> EXPLAIN SELECT zip, state FROM customers GROUP BY zip; 
+--------------------------------------------------------+-----------+-----------+
| Operation                                              | Est. Cost | Est. Rows |
+--------------------------------------------------------+-----------+-----------+
| sigma_distinct_combine KEYS=((1 . "zip"))              |   1303.90 |   1000.00 |
|   sigma_distinct_partial KEYS=((1 . "zip"))            |    203.90 |   1000.00 |
|     index_scan 1 := customers.__idx_customers__PRIMARY |    203.90 |    333.33 |
+--------------------------------------------------------+-----------+-----------+

This query:

  1. First, an index_scan is performed, and its output is sent to a sigma_distinct_partial operator.
  2. The sigma_distinct_partial operator produces one row for each distinct value of the key, on the same node where the values were read.
  3. Those distinct values are then sent to the sigma_distinct_combine operator on the node where the query was initiated, which performs the same distinct operation on the combined streams (sketched below).
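
The two-phase distinct can be modeled as a local dedup on each node followed by a final dedup on the originating node. The zip/state values below are invented for illustration:

# sigma_distinct_partial runs once per node; sigma_distinct_combine runs on
# the originating node over the much smaller combined streams.
node_rows = [
    [("98903", "WA"), ("97520", "OR"), ("98903", "WA")],   # node 1's slice
    [("97520", "OR"), ("98613", "WA")],                    # node 2's slice
    [("97477", "OR"), ("98613", "WA")],                    # node 3's slice
]

def distinct(rows):
    seen, out = set(), []
    for row in rows:
        if row[0] not in seen:        # keyed on zip, as in the plan
            seen.add(row[0])
            out.append(row)
    return out

partials = [distinct(rows) for rows in node_rows]          # per node
final = distinct([row for p in partials for row in p])     # originating node
print(final)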

For a more practical aggregate, suppose we want to see how many orders each customer has placed, along with the customer's name.

sql> EXPLAIN SELECT c.name, COUNT(*) FROM orders o NATURAL JOIN customers c GROUP BY o.c_id; 
+-------------------------------------------------------------------------------+-----------+-----------+
| Operation                                                                     | Est. Cost | Est. Rows |
+-------------------------------------------------------------------------------+-----------+-----------+
| hash_aggregate_combine GROUPBY((1 . "c_id")) expr1 := countsum((0 . "expr1")) |  12780.38 |   4056.80 |
|   hash_aggregate_partial GROUPBY((1 . "c_id")) expr1 := count((0 . "expr0"))  |   7100.87 |   4056.80 |
|     compute expr0 := param(0)                                                 |   7100.87 |   4056.80 |
|       nljoin                                                                  |   7046.78 |   4056.80 |
|         stream_combine                                                        |    712.70 |   1000.00 |
|           index_scan 2 := customers.__idx_customers__PRIMARY                  |    203.90 |    333.33 |
|         index_scan 1 := orders.c_fk, c_id = 2.c_id                            |      6.33 |      4.06 |
+-------------------------------------------------------------------------------+-----------+-----------+

In this plan:

  1. First the customers primary key index is scanned and the streams are combined with stream_combine; the orders are then read by another index_scan of the c_fk index using each customer's c_id.
  2. These results are joined by the nljoin operator on the nodes where the orders were read, and the rows are grouped and counted on those same nodes by the hash_aggregate_partial operator.
  3. The partial results are then sent to the hash_aggregate_combine operator on the originating node, where they are combined into the final per-group counts before the rows are returned to the user (see the sketch below).
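
This partial/combine split is the classic two-phase aggregation. The sketch below models count() on each node followed by countsum() on the originating node, with invented data:

# hash_aggregate_partial counts per node; hash_aggregate_combine sums the
# partial counts, so each group crosses the network once per node rather
# than once per row.
from collections import Counter

node_orders = [[1, 1, 2], [1, 3], [2, 2, 3]]   # c_id of orders seen per node

partials = [Counter(orders) for orders in node_orders]   # count() per node

combined = Counter()
for partial in partials:
    combined.update(partial)         # countsum() on the originating node
print(dict(combined))  # {1: 3, 2: 3, 3: 2} orders per customer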

 

Summary

Hopefully this has been a thorough enough introduction to interpreting the output of the ClustrixDB Sierra query optimizer that you can use EXPLAIN to examine your own queries. For a complete list of the operators that may appear in EXPLAIN output, see the list of planner operators. For more on how Sierra performs query optimization, see the query optimizer section of the distributed database architecture documentation.

 
