[Big Data] Hive Series - Hive-DDL Data Definition

database operations

create database

CREATE DATABASE [IF NOT EXISTS] database_name [COMMENT database_comment]
[LOCATION hdfs_path]
[WITH DBPROPERTIES (
	property_name=property_value, 
	...
	)
];
  1. Create a database. By default it is stored on HDFS as <database_name>.db under the warehouse directory (/hive/warehouse in this setup).
 create database hive_nb;
  2. To avoid an error when the database already exists, add IF NOT EXISTS (the recommended form).
 create database if not exists hive_nb;
  3. Create a database and specify where it is stored on HDFS.
 create database hive_nb location '/hive_nb.db';
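
The optional clauses in the syntax above can also be combined in one statement. The following is only a sketch: the database name hive_nb_demo, the HDFS path and the property keys are made up for illustration.

```sql
-- Hypothetical name, path and properties; adjust to your cluster.
create database if not exists hive_nb_demo
comment 'demo database for DDL examples'
location '/hive_nb_demo.db'
with dbproperties ('owner'='etl', 'createtime'='20230313');
```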

query database

list all databases

show databases; 

Filter the databases displayed

show databases like 'hive_nb*'; 
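
As far as I know, in Hive releases before 4.0 the LIKE pattern here accepts only * (any characters) and | (alternatives), while Hive 4.0 and later expect SQL-style % and _ instead. A small sketch for the older pattern style, with test_db as a made-up second name:

```sql
-- Matches databases whose names start with hive_nb or with test_db
-- (test_db is a hypothetical name; pre-4.0 pattern syntax assumed).
show databases like 'hive_nb*|test_db*';
```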

View database details

show database information

desc database hive_nb;

Show extended database details (including DBPROPERTIES)

desc database extended hive_nb; 

switch current database

use hive_nb;
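
To confirm that the switch took effect, the built-in current_database() function (available in reasonably recent Hive versions) can be queried. A minimal check:

```sql
use hive_nb;
-- Should return the name of the database selected above.
select current_database();
```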

modify database

The ALTER DATABASE command sets key-value pairs in a database's DBPROPERTIES, which describe the properties of that database.

alter database hive_nb
set dbproperties('createtime'='20230313');
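
One way to check the result is to set a property and then read it back with the extended description. A short sketch; the 'owner' key is a made-up property for illustration:

```sql
-- Add or overwrite properties ('owner' is a hypothetical key).
alter database hive_nb
set dbproperties('createtime'='20230313', 'owner'='etl');

-- The extended description lists the DBPROPERTIES set above.
desc database extended hive_nb;
```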

delete database

delete empty database

drop database hive_nb; 

If the database might not exist, add IF EXISTS so that the statement does not fail

drop database if exists hive_nb;

delete non-empty database

If the database is not empty, add the CASCADE keyword to force deletion of the database together with its tables

drop database hive_nb cascade;
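
The two options can be combined. In the sketch below, the default behaviour without CASCADE is RESTRICT, which refuses to drop a database that still contains tables:

```sql
-- Succeeds whether or not hive_nb exists, and removes any tables it contains.
drop database if exists hive_nb cascade;
```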

table operation

table creation syntax

CREATE [EXTERNAL] TABLE [IF NOT EXISTS] table_name
[(col_name data_type [COMMENT col_comment], ...)]
[COMMENT table_comment]
[PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
[CLUSTERED BY (col_name, col_name, ...)
  [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
[ROW FORMAT row_format]
[STORED AS file_format]
[LOCATION hdfs_path]
[TBLPROPERTIES (property_name=property_value, ...)]
[AS select_statement]

Field Explanation

  • CREATE TABLE creates a table with the specified name. If a table with the same name already exists, an exception is thrown; the IF NOT EXISTS option suppresses this exception.

  • The EXTERNAL keyword creates an external table. When creating the table, a path (LOCATION) pointing to the actual data can be specified. When an internal table is dropped, both its metadata and its data are deleted; when an external table is dropped, only the metadata is removed and the data is kept.

  • COMMENT adds comments to tables and columns.

  • PARTITIONED BY creates a partitioned table.

  • CLUSTERED BY creates a bucketed table.

  • SORTED BY (rarely used) additionally sorts the rows within each bucket by one or more columns.

  • ROW FORMAT DELIMITED [FIELDS TERMINATED BY char] [COLLECTION ITEMS TERMINATED BY char] [MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char]
    | SERDE serde_name [WITH SERDEPROPERTIES (property_name=property_value, property_name=property_value, …)]

You can use either a custom SerDe or the built-in one when creating a table. If ROW FORMAT is omitted, or ROW FORMAT DELIMITED is used, the built-in SerDe applies. Alongside the column definitions you can name a custom SerDe with the SERDE clause; Hive uses the SerDe to map each row of the underlying data to the table's columns.

SerDe is short for Serializer/Deserializer; Hive uses a SerDe to serialize and deserialize row objects.

  • STORED AS specifies the storage file format. Commonly used formats: SEQUENCEFILE (binary sequence file), TEXTFILE (plain text), RCFILE (columnar storage format).

If the data is plain text, use STORED AS TEXTFILE. If the data needs to be compressed, use STORED AS SEQUENCEFILE.

  • LOCATION specifies the storage location of the table on HDFS.

  • AS is followed by a query statement and creates the table from the query results.

  • LIKE copies the structure of an existing table, but not its data.
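
The clauses described above can be seen together in one place. The statements below are a sketch only: the table names (phone_order, phone_order_buyers, phone_order_copy), the columns and the delimiters are invented for illustration and do not come from a real schema.

```sql
-- Partitioned, bucketed text table with an explicit row format.
-- All names here are hypothetical.
create table if not exists phone_order (
  order_id int                comment 'order number',
  buyer    string             comment 'buyer name',
  tags     array<string>      comment 'free-form labels',
  attrs    map<string,string> comment 'extra key-value attributes'
)
comment 'orders, one partition per day'
partitioned by (dt string comment 'order date, e.g. 20230313')
clustered by (order_id) sorted by (order_id asc) into 4 buckets
row format delimited
  fields terminated by '\t'
  collection items terminated by ','
  map keys terminated by ':'
  lines terminated by '\n'
stored as textfile;

-- AS: create a table and fill it from a query result.
create table if not exists phone_order_buyers as
select order_id, buyer from phone_order;

-- LIKE: copy only the table structure, not the data.
create table if not exists phone_order_copy like phone_order;
```

With these delimiters, a line of the data file would hold the two scalar fields separated by tabs, the array elements separated by commas, and each map entry written as key:value.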

table type

Tables in Hive are divided into internal (managed) tables and external tables.

Tables created by default are managed tables, sometimes called internal tables. For this kind of table, Hive controls (more or less) the life cycle of the data. By default, Hive stores the data for these tables in subdirectories of the directory defined by the configuration item hive.metastore.warehouse.dir (for example, /hive/warehouse). When a managed table is dropped, Hive also deletes its data. Managed tables are not well suited to sharing data with other tools.

For external tables, Hive does not assume that it fully owns the data. Dropping the table does not delete the stored data; only the metadata describing the table is removed.

When creating an external table, add the EXTERNAL keyword:

create external table if not exists computer(
no int,
name string,
price double
)
row format delimited fields terminated by '\t';
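
External tables are typically pointed at data that already exists on HDFS. The following sketch assumes a made-up table name computer_ext and a made-up directory /data/computer containing tab-separated files:

```sql
-- Hypothetical table name and HDFS path.
create external table if not exists computer_ext(
  no int,
  name string,
  price double
)
row format delimited fields terminated by '\t'
location '/data/computer';

-- Dropping the table removes only its metadata;
-- the files under /data/computer stay on HDFS.
drop table computer_ext;
```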

Conversion between managed tables and external tables

check the table type

desc formatted phone; 

Convert the internal table phone to an external table

alter table phone set tblproperties('EXTERNAL'='TRUE');

Convert the external table computer to an internal table

alter table computer set tblproperties('EXTERNAL'='FALSE');
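
A conversion can be verified by describing the table again. In the desc formatted output, the Table Type row should read MANAGED_TABLE for an internal table and EXTERNAL_TABLE for an external one. A short sketch using the phone table from above:

```sql
alter table phone set tblproperties('EXTERNAL'='TRUE');
-- Look for the "Table Type:" row; it should now show EXTERNAL_TABLE.
desc formatted phone;
```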

modify table

rename table

syntax

ALTER TABLE table_name RENAME TO new_table_name

example

alter table car rename to big_car;

Add/modify/replace column information

syntax

update column

ALTER TABLE table_name CHANGE [COLUMN] col_old_name col_new_name column_type [COMMENT col_comment] [FIRST|AFTER column_name]

Add and replace columns

ALTER TABLE table_name ADD|REPLACE COLUMNS (col_name data_type [COMMENT col_comment], ...)

Note: ADD appends new columns after all existing columns (but before the partition columns); REPLACE replaces all existing columns of the table with the given list.

example

query table structure

desc car;

add column

alter table car add columns(desc string); 

update column

alter table car change column desc desc_detail string;

replace column

alter table car replace columns(no string, name string, price string);
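
The optional COMMENT and FIRST|AFTER parts of the CHANGE syntax are not used in the examples above. A sketch, assuming the car table has the columns no, name and price (all string) left by the replace statement; the comment text is made up. Note that these statements change only the table metadata, not the data files, so the files must already match the new column order.

```sql
-- Keep the name and type of price, add a comment, and move it after no.
alter table car change column price price string comment 'list price' after no;

-- Check the resulting column order.
desc car;
```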

delete table

drop table car;


Origin blog.csdn.net/u013412066/article/details/129485054