HiveQL Indexes

Hive has only limited indexing support. It has no notion of keys as found in relational databases, but you can still build an index on some columns to speed up certain operations. The index data for a table is stored in a separate table.

You can use the explain command to check whether a query uses an index.
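A minimal sketch of such a check, assuming the employees table and employees_index from the example in the next section, and a Hive version that still ships index support (indexing was removed in Hive 3.0). Enabling hive.optimize.index.filter is an assumption about the session configuration, and whether the resulting plan actually mentions the index table depends on the Hive version and optimizer settings.

```sql
-- Assumption: let the optimizer consider indexes when filtering.
set hive.optimize.index.filter=true;

-- Print the query plan and inspect it for any reference to the index table.
explain
select name, salary
from employees
where country = 'US';
```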

First, creating an index

For example, consider the following table:
create table employees(
name string,
salary float,
subordinates array<string>,
deductions map<string, float>,
address struct<street:string, city:string, state:string, zip:int>
)
partitioned by (country string, state string);
Create an index on the partition column country:
create index employees_index on table employees (country) as 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
with deferred rebuild
idxproperties('creator' = 'me', 'created_at' = 'some_time')
in table employees_index_table
partitioned by (country, name)
comment 'Employees indexed by country and name.';
If the partitioned by clause is omitted entirely, the index spans all partitions of the original table.
The as ... clause specifies the index handler, a Java class that implements the index interface.
An index handler is not required to store the index data in a new table, but if it does, the in table ... clause is used. This clause supports many of the options available when creating other kinds of tables. You can also add the row format, stored as, stored by, location and other clauses before the comment clause.
Currently, external tables and views can be indexed, except for data stored in S3.
Bitmap Index
Hive v0.8.0 added a built-in bitmap index handler. Bitmap indexes are generally used for columns with few distinct values. The following statement rewrites the previous example to use the bitmap index handler:
create index employees_index
on table employees(country)
as 'BITMAP'
with deferred rebuild
idxproperties('creator' = 'me', 'created_at'='some_time')
in table employees_index_table
partitioned by (country, name)
comment 'Employees indexed by country and name.';

Second, rebuilding the index

If with deferred rebuild is specified, the new index starts out empty. At any time, you can build the index for the first time or rebuild it with the alter index statement:
alter index employees_index
on table employees
partition (country = 'US')
rebuild;
If you omit the partition clause, the index is rebuilt for all partitions.
The rebuild is atomic: if it fails, the index is left in the state it was in before the rebuild started.
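For comparison, here is a sketch of the same statement without the partition clause, which rebuilds the index across every partition of employees. It assumes the employees_index created earlier in this article.

```sql
-- Rebuild employees_index for all partitions of employees.
alter index employees_index
on table employees
rebuild;
```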

Third, showing indexes

show formatted index on employees;
The formatted keyword is optional. It adds column titles to the output. You can also replace index with indexes, since the output may list several indexes.
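As a short illustration of the variant just described, the following sketch drops the optional keyword and uses the plural form. It assumes the same employees table.

```sql
-- Same listing without column titles, using the plural keyword.
show indexes on employees;
```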

Fourth, dropping an index

Dropping an index also drops its index table, if there is one:
drop index if exists employees_index on table employees;
If the indexed table is dropped, its index and index table are dropped as well. Similarly, if a partition of the original table is dropped, the corresponding index partition is also dropped.

Origin www.cnblogs.com/xibuhaohao/p/11820734.html