Greenplum usage notes

1. Login
 Applications can connect to the Greenplum (GP) database with the PostgreSQL JDBC driver. To log in to GP from the command line:

 

su - gpadmin
# Example: psql -h 192.168.1.2 -d test -U user
psql -h ip -d dbname -U user
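Once connected, a few standard queries can confirm where the session landed. This is a minimal sketch, not part of the original notes; it assumes the connecting role is allowed to read the Greenplum catalog table gp_segment_configuration.

```sql
-- Confirm the server build; on Greenplum the string includes the Greenplum version.
SELECT version();

-- Confirm the database and role for this session.
SELECT current_database(), current_user;

-- List master and segment instances (the master has content = -1).
SELECT content, role, status, hostname, port
FROM gp_segment_configuration
ORDER BY content, role;
```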

 

2. Create a table
The CREATE TABLE template is shown below. The clauses to focus on are WITH, DISTRIBUTED, and PARTITION:
WITH: sets the table's storage parameters (row or column orientation, compression, etc.)
DISTRIBUTED: sets how rows are distributed across segments, either by hash of the listed columns or randomly
PARTITION: partitions the table's data (by range or by a list of column values)

CREATE [[GLOBAL | LOCAL] {TEMPORARY | TEMP}] TABLE table_name(
[ { column_name data_type[ DEFAULT default_expr]
[column_constraint[ ... ]
[ ENCODING ( storage_directive[,...] ) ]
]
| table_constraint
| LIKE other_table[{INCLUDING | EXCLUDING}
{DEFAULTS | CONSTRAINTS}] ...}
[, ... ] ]
)
[ INHERITS ( parent_table[, ... ] ) ]
[ WITH ( storage_parameter=value[, ... ] ) ]
[ ON COMMIT {PRESERVE ROWS | DELETE ROWS | DROP} ]
[ TABLESPACE tablespace]
[ DISTRIBUTED BY (column, [ ... ] ) | DISTRIBUTED RANDOMLY ]
[ PARTITION BY partition_type(column)
[ SUBPARTITION BY partition_type(column) ]
[ SUBPARTITION TEMPLATE ( template_spec ) ]
[...]
( partition_spec)
| [ SUBPARTITION BY partition_type(column) ]
[...]
( partition_spec
[ ( subpartition_spec
[(...)]
) ]
)

 

/** Storage parameters accepted by WITH */
// true = append-optimized storage (required for column orientation and compression), false = heap table
APPENDONLY={TRUE|FALSE}
// data block size in bytes
BLOCKSIZE={8192-2097152}
// column = column storage, row = row storage
ORIENTATION={COLUMN|ROW}
CHECKSUM={TRUE|FALSE}
// data compression type
COMPRESSTYPE={ZLIB|QUICKLZ|RLE_TYPE|NONE}
// compression level; higher levels compress more and reduce I/O when querying
COMPRESSLEVEL={1-9}
FILLFACTOR={10-100}
// defaults to false when creating a table
OIDS[=TRUE|FALSE]

 

CREATE TABLE "public"."test" (
"user" varchar(15),
"namd" varchar(2),
"create" date,
"zt" NUMERIC(2)
)
WITH (orientation=column,appendonly=true,compresslevel=5)
distributed randomly
partition by range("create")
(
	partition p201501 start ('2015-01-01'::date) end ('2015-02-01'::date),
	partition p201502 start ('2015-02-01'::date) end ('2015-03-01'::date)
);
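
As a sketch of routine follow-up work on the table above, the statements below add the next monthly partition and check how rows are spread across segments. The partition name p201503 is an assumption that simply follows the existing naming pattern; gp_segment_id is Greenplum's built-in system column.

```sql
-- Add the next monthly partition (the name p201503 is illustrative).
ALTER TABLE public.test
    ADD PARTITION p201503
    START ('2015-03-01'::date) END ('2015-04-01'::date);

-- Count rows per segment; large differences between segments indicate skew.
SELECT gp_segment_id, count(*) AS row_count
FROM public.test
GROUP BY gp_segment_id
ORDER BY gp_segment_id;
```

With DISTRIBUTED RANDOMLY the counts should come out roughly even; with DISTRIBUTED BY (column) they depend on how evenly that column's values hash.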

 

3. Data import
  Row-by-row real-time writes are slow, so data is generally loaded into tables from files through external tables. Populating a table from a query with CREATE TABLE AS or INSERT INTO ... SELECT is also very fast.
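
A minimal sketch of the bulk-load paths, assuming a gpfdist file server is already running. The host etl-host, port 8081, the file pattern, and the table names ext_test and test_copy are all hypothetical.

```sql
-- Readable external table over CSV files served by gpfdist.
CREATE EXTERNAL TABLE public.ext_test (
    "user"   varchar(15),
    "namd"   varchar(2),
    "create" date,
    "zt"     numeric(2)
)
LOCATION ('gpfdist://etl-host:8081/test_*.csv')
FORMAT 'CSV' (DELIMITER ',');

-- Bulk load: segments read from gpfdist in parallel.
INSERT INTO public.test
SELECT * FROM public.ext_test;

-- Build a new table directly from a query result.
CREATE TABLE public.test_copy
WITH (appendonly=true, orientation=column)
AS SELECT * FROM public.test
DISTRIBUTED RANDOMLY;
```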
  GP supports B-tree and bitmap index types.
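
For illustration, both index types can be created on the test table from section 2. The index names are hypothetical, and treating zt as a low-cardinality code column is an assumption.

```sql
-- B-tree index: suits high-cardinality columns and selective lookups.
CREATE INDEX idx_test_user ON public.test USING btree ("user");

-- Bitmap index: suits low-cardinality columns such as a small set of codes.
CREATE INDEX idx_test_zt ON public.test USING bitmap ("zt");
```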

 

4. Functions
  GP is built on PostgreSQL, so most PostgreSQL functions, including analytic and window functions, are available in GP.
Commonly used date functions: to_date, to_char, date_part, date_trunc
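
A short sketch of the four date functions, followed by a simple window aggregate on the test table from section 2. The literal values and column aliases are illustrative only.

```sql
SELECT
    to_date('2015-01-15', 'YYYY-MM-DD')                       AS parsed_date,
    to_char(TIMESTAMP '2015-01-15 10:30:00', 'YYYY/MM/DD')    AS formatted,
    date_part('month', TIMESTAMP '2015-01-15 10:30:00')       AS month_number,  -- 1
    date_trunc('month', TIMESTAMP '2015-01-15 10:30:00')      AS month_start;   -- 2015-01-01 00:00:00

-- Window aggregate: each row carries the row count of its own user.
SELECT "user", "create", "zt",
       count(*) OVER (PARTITION BY "user") AS rows_for_user
FROM public.test;
```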

    

5. Range queries over tens of billions of rows, and optimizing value retrieval with grouping, sorting, and window functions.
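
The notes give only the topic here, so the following is one common pattern rather than necessarily the method the author had in mind: keep the latest row per group with row_number(), and filter on the partition key so that only the needed partitions are scanned. It uses the test table from section 2.

```sql
-- Latest row per user: rank rows inside each group, then keep rank 1.
SELECT "user", "namd", "create", "zt"
FROM (
    SELECT t.*,
           row_number() OVER (PARTITION BY "user" ORDER BY "create" DESC) AS rn
    FROM public.test t
    -- Filtering on the partition key lets the planner skip other partitions.
    WHERE "create" >= DATE '2015-01-01'
      AND "create" <  DATE '2015-02-01'
) ranked
WHERE rn = 1;
```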

 

6. A recursive-query optimization for distinct xx and count(distinct xx)
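
A hedged sketch of the usual recursive form of this trick (often called a loose index scan), written in PostgreSQL syntax. It walks the distinct values one at a time through a B-tree index instead of scanning every row. The column zt stands in for xx, an index on zt is assumed to exist, and support for WITH RECURSIVE depends on the Greenplum version, so check it before relying on this.

```sql
WITH RECURSIVE skip AS (
    -- Start from the smallest value.
    (SELECT min("zt") AS zt FROM public.test)
    UNION ALL
    -- Each step jumps to the next larger value; yields NULL when none is left.
    SELECT (SELECT min(t."zt") FROM public.test t WHERE t."zt" > s.zt)
    FROM skip s
    WHERE s.zt IS NOT NULL
)
-- Equivalent of SELECT DISTINCT zt; use count(*) here for count(DISTINCT zt).
SELECT zt FROM skip WHERE zt IS NOT NULL;
```

The approach pays off when the column has few distinct values relative to the row count; with many distinct values an ordinary DISTINCT is likely to be the better choice.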

  

7. PostgreSQL reference material

 

Notes
1) After creating a partitioned table and its indexes, are the indexes not being used for statistical analysis or queries?
Run ANALYZE on the table first.
2) Oracle-style precompiled (prepared) SQL does not help here and makes queries slower; it is more efficient to issue the SQL directly and let GP execute it in parallel.
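
For note 1, a minimal sketch of refreshing planner statistics and then checking the plan. The filter value 'u001' is hypothetical.

```sql
-- Refresh planner statistics for the table and its partitions.
ANALYZE public.test;

-- Inspect the plan to see whether an index scan is chosen.
EXPLAIN
SELECT count(*)
FROM public.test
WHERE "user" = 'u001';
```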

 
