Creating Phoenix Secondary Indexes

Overview

Global indexes are an important feature of Phoenix. Used sensibly, secondary indexes reduce query latency and let cluster resources be used more fully. This article describes how to design and use indexes efficiently.

Global Index Explained

Simply put, a global index keeps its index data in a separate HBase table, apart from the data table. The following example shows how the index data relates to the data in the main table.

-- Create the data table
CREATE TABLE DATA_TABLE(
  A VARCHAR PRIMARY KEY,
  B VARCHAR,
  C INTEGER,
  D INTEGER);

-- Create the index
CREATE INDEX B_IDX ON DATA_TABLE(B) INCLUDE(C);

-- Insert data
UPSERT INTO DATA_TABLE VALUES('A','B',1,2);

When data is written to the main table, the corresponding index data is written to the index table synchronously. The primary key of the index table is the combination of the indexed column values and the data table's primary key. The INCLUDE columns are stored as ordinary columns in the index table; their purpose is to make queries more efficient, since a single lookup in the index table returns the data without going back to the main table. The process is shown in the figure below.
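To see the effect of INCLUDE in practice, the following sketch compares a query that B_IDX can answer on its own with one that it cannot. It reuses DATA_TABLE and B_IDX from the example above; the literal values are illustrative only, and the exact plans depend on the Phoenix version and table statistics, so no plan output is shown here.

```sql
-- Covered query: B is the indexed column and C is an INCLUDE column,
-- so every referenced column is stored in B_IDX.
EXPLAIN SELECT C FROM DATA_TABLE WHERE B = 'B';

-- Not covered: D is not stored in B_IDX, so the optimizer will
-- typically scan DATA_TABLE instead of using the global index.
EXPLAIN SELECT D FROM DATA_TABLE WHERE B = 'B';

-- An index hint asks Phoenix to use B_IDX anyway and to fetch D
-- from the data table for the matching rows.
EXPLAIN SELECT /*+ INDEX(DATA_TABLE B_IDX) */ D FROM DATA_TABLE WHERE B = 'B';
```

Comparing the three plans should show which table each query scans. If a frequently queried column such as D keeps falling outside the index, adding it to the INCLUDE list is usually simpler than relying on hints.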

 

A Phoenix table is an HBase table, and HBase stores rows sorted by row key in binary lexicographic order. This means that the longer the row key prefix two rows share, the closer together they are stored.
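The practical consequence of this ordering can be sketched with two queries on the primary key of DATA_TABLE. This is an illustration under the assumption that Phoenix turns a leading-prefix LIKE into a bounded key range, which is its usual behavior; the literal 'A' is just the sample key from the example above.

```sql
-- Prefix condition: all row keys starting with 'A' are adjacent,
-- so this is expected to become a range scan over a small key interval.
EXPLAIN SELECT * FROM DATA_TABLE WHERE A LIKE 'A%';

-- Suffix condition: matching rows can be anywhere in key order,
-- so this is expected to become a full scan with a server-side filter.
EXPLAIN SELECT * FROM DATA_TABLE WHERE A LIKE '%A';
```

The same reasoning carries over to index tables, whose row keys start with the indexed column values.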

Global Index Design

We continue with DATA_TABLE as the example and create the following composite index. As mentioned earlier, the row keys of the index table are stored in lexicographic order. So which kinds of queries suit an index with this structure?

CREATE INDEX B_C_D_IDX ON DATA_TABLE (B, C, D);
Example in which every field condition uses the = operator:

 

Note: the conditions in a query do not have to appear in the same order as the fields of the composite index; they can be combined in any order.

In practice, we recommend using only cases 1 to 4, which follow the prefix-matching principle (the conditions include the leading index column B) and so avoid triggering a full table scan. Cases 5 to 7 have to scan the entire table to filter out the rows that meet the conditions, and are strongly discouraged.
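The seven cases can be sketched as queries against B_C_D_IDX. The numbering below is an assumption: it simply enumerates the seven non-empty combinations of B, C and D, with the four that include the leading column B first. The literal values 'b', 1 and 2 are placeholders.

```sql
-- Cases 1 to 4: the conditions include the leading index column B,
-- so the scan can be bounded by a row key prefix of B_C_D_IDX.
SELECT * FROM DATA_TABLE WHERE B = 'b' AND C = 1 AND D = 2;  -- 1
SELECT * FROM DATA_TABLE WHERE B = 'b' AND C = 1;            -- 2
SELECT * FROM DATA_TABLE WHERE B = 'b' AND D = 2;            -- 3
SELECT * FROM DATA_TABLE WHERE B = 'b';                      -- 4

-- Cases 5 to 7: B is missing, so there is no usable key prefix
-- and every row has to be read and filtered.
SELECT * FROM DATA_TABLE WHERE C = 1 AND D = 2;              -- 5
SELECT * FROM DATA_TABLE WHERE C = 1;                        -- 6
SELECT * FROM DATA_TABLE WHERE D = 2;                        -- 7

-- The order of the conditions in the WHERE clause does not matter:
-- this is the same query as case 1.
SELECT * FROM DATA_TABLE WHERE D = 2 AND B = 'b' AND C = 1;
```

Prefixing any of these statements with EXPLAIN shows whether Phoenix plans a range scan or a full scan for it.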

Other

  • Fields used in ORDER BY or GROUP BY can also take advantage of a secondary index on those fields to speed up queries.

  • Design the primary key of the data table well so that as few index tables as possible are needed: the more index tables there are, the worse the write amplification.

  • A global index cannot be used on a table that uses the ROW_TIMESTAMP property.

  • Using the salting feature appropriately on an index table can improve query and write performance and avoid hotspots.
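Two of the points above can be sketched in SQL. The index name B_SALTED_IDX and the bucket count 4 are hypothetical choices for illustration; a suitable bucket count depends on the cluster and the data, and whether the optimizer actually picks an index for the first two queries depends on the Phoenix version and statistics.

```sql
-- ORDER BY on the leading index column: rows in an index led by B
-- are already sorted by B, so a top-N query can avoid a full sort.
EXPLAIN SELECT B, C, D FROM DATA_TABLE ORDER BY B LIMIT 10;

-- GROUP BY on the leading index column: equal values of B are
-- adjacent in the index, which allows an ordered aggregation.
EXPLAIN SELECT B, COUNT(*) FROM DATA_TABLE GROUP BY B;

-- A salted index: a salt byte is prepended to each index row key
-- so that writes are spread over several regions instead of one hot region.
CREATE INDEX B_SALTED_IDX ON DATA_TABLE (B) INCLUDE (C) SALT_BUCKETS = 4;
```

Salting trades some range-scan efficiency for write distribution, since a scan over a key range has to visit every bucket, which is why the original advice is to use it where appropriate rather than everywhere.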

Origin blog.csdn.net/u014156013/article/details/82656357