Some rules for building tables in Vertica

Anatomy of a Projection

The CREATE PROJECTION statement defines the individual elements of a projection, as the following graphic shows.
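A minimal sketch of such a statement, with its four elements marked in comments. The table store_sales and all column names are hypothetical, chosen only for illustration, and the encoding types are examples rather than recommendations for any particular data.

```sql
-- Hypothetical anchor table and columns, for illustration only.
CREATE PROJECTION store_sales_p (
    -- Column list and encoding
    store_key      ENCODING RLE,
    sale_date      ENCODING RLE,
    transaction_id,
    sales_amount
)
AS
    -- Base query
    SELECT store_key, sale_date, transaction_id, sales_amount
    FROM store_sales
    -- Sort order
    ORDER BY store_key, sale_date
    -- Segmentation
    SEGMENTED BY HASH(transaction_id) ALL NODES;
```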

 

The previous example contains the following significant elements:

Column List and Encoding

Lists every column in the projection and defines the encoding for each column. Unlike traditional database architectures, HP Vertica operates on encoded data representations. Therefore, HP recommends that you use data encoding because it results in less disk I/O.

Base Query

Identifies all the columns to incorporate in the projection through column name and table name references. The base query for large table projections can contain PK/FK joins to smaller tables.

Sort Order

The sort order optimizes for a specific query or commonalities in a class of queries based on the query predicate. The best sort orders are determined by the WHERE clauses. For example, if a projection's sort order is (x, y), and the query's WHERE clause specifies (x=1 AND y=2), all of the needed data is found together in the sort order, so the query runs almost instantaneously.

You can also optimize a query by matching the projection's sort order to the query's GROUP BY clause. If you do not specify a sort order, HP Vertica uses the order in which columns are specified in the column definition as the projection's sort order.

The ORDER BY clause specifies a projection's sort order, which localizes logically grouped values so that a disk read can pick up many results at once. For maximum performance, do not sort projections on LONG VARBINARY and LONG VARCHAR columns.
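As a sketch of the (x, y) case described above, assuming a hypothetical table t with columns x, y, and v:

```sql
-- Hypothetical table t(x, y, v); names are assumptions for illustration.
CREATE PROJECTION t_by_xy AS
    SELECT x, y, v
    FROM t
    ORDER BY x, y;   -- sort order matches the predicate columns

-- Rows with x = 1 AND y = 2 are stored next to each other in t_by_xy,
-- so this predicate can be answered from one contiguous range.
SELECT v FROM t WHERE x = 1 AND y = 2;
```

Whether the optimizer actually picks this projection depends on the other projections that exist for the table. Running EXPLAIN on the query shows which one the plan uses.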

Segmentation

The segmentation clause determines whether a projection is segmented across nodes within the database. Segmentation distributes contiguous pieces of projections, called segments, for large and medium tables across database nodes. Segmentation maximizes database performance by distributing the load. Use SEGMENTED BY HASH to segment large table projections.

For small tables, use the UNSEGMENTED keyword to direct HP Vertica to replicate these tables, rather than segment them. Replication creates and stores identical copies of projections for small tables across all nodes in the cluster. Replication ensures high availability and recovery.

For maximum performance, do not segment projections on LONG VARBINARY and LONG VARCHAR columns.
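The two cases can be sketched side by side. The tables orders and region and their columns are hypothetical, standing in for a large fact table and a small lookup table.

```sql
-- Large table (hypothetical): spread rows across all nodes by hash.
CREATE PROJECTION orders_seg AS
    SELECT order_id, customer_id, order_total
    FROM orders
    ORDER BY customer_id
    SEGMENTED BY HASH(order_id) ALL NODES;

-- Small table (hypothetical): keep an identical copy on every node.
CREATE PROJECTION region_rep AS
    SELECT region_id, region_name
    FROM region
    ORDER BY region_id
    UNSEGMENTED ALL NODES;
```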

The above is taken from the official Vertica documentation. My understanding of it is as follows:

Analysis of a Projection

Sort Order

How to choose the ORDER BY columns when building a table in DACP:

1. Rows are stored sorted by the ORDER BY columns, so choose those columns from the WHERE clauses of the queries you run. For example, if a query contains WHERE x=1 AND y=2, creating the projection with ORDER BY x, y lets the query locate the matching rows quickly.

2. Putting the query's GROUP BY columns in ORDER BY can also optimize the query.

3. Do not use LONG VARBINARY or LONG VARCHAR columns in ORDER BY.
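Vertica's CREATE TABLE statement also accepts an ORDER BY clause, which sets the sort order of the projection that Vertica creates automatically for the table. A minimal sketch of rules 1 and 2, assuming a hypothetical page_visits table whose queries filter on visit_date and group by city:

```sql
-- Hypothetical table; column names and types are assumptions.
CREATE TABLE page_visits (
    visit_date DATE,
    city       VARCHAR(64),
    user_id    INTEGER,
    url        VARCHAR(2048)
)
ORDER BY visit_date, city;

-- Typical query this sort order is meant to serve:
-- the WHERE column comes first, the GROUP BY column second.
SELECT city, COUNT(*)
FROM page_visits
WHERE visit_date = '2018-09-01'
GROUP BY city;
```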

 

Segmentation

1. SEGMENTED BY HASH() splits the data by the hash of one or more columns and distributes it evenly across the nodes. Remember to use it for large tables. The best column to hash on is the primary key: the more distinct values a column has (that is, the fewer repeated values), the better suited it is for hashing.

2. Do not use LONG VARBINARY or LONG VARCHAR columns as segmentation columns.
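A sketch of rule 1 in a table definition, followed by a simple way to judge a candidate hash column. The call_records table and its columns are hypothetical.

```sql
-- Hypothetical table and columns, for illustration only.
CREATE TABLE call_records (
    record_id  INTEGER NOT NULL,
    caller     VARCHAR(32),
    started_at TIMESTAMP
)
ORDER BY started_at
SEGMENTED BY HASH(record_id) ALL NODES;

-- Compare distinct values with total rows for the candidate column.
-- The closer the two numbers are, the fewer repeated values there are,
-- and the more evenly the hash tends to spread rows across nodes.
SELECT COUNT(*)                  AS total_rows,
       COUNT(DISTINCT record_id) AS distinct_values
FROM call_records;
```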
