1. Login
Applications can connect to the Greenplum (GP) database using the PostgreSQL JDBC driver. To log in to GP from the command line:
su - gpadmin
# Example: psql -h 192.168.1.2 -d test -U user
psql -h ip -d dbname -U user
2. Create a table
The table-creation template is shown below. Pay particular attention to the following clauses:
With: Specifies the storage parameters for the table (column or row storage, compression, etc.)
Distribute: Specifies how rows are distributed across segments: by hash of the specified column(s), or randomly
Partition: Partitions the data on each node (by range or by list of column values)
CREATE [[GLOBAL | LOCAL] {TEMPORARY | TEMP}] TABLE table_name(
[ { column_name data_type[ DEFAULT default_expr]
[column_constraint[ ... ]
[ ENCODING ( storage_directive[,...] ) ]
]
| table_constraint
| LIKE other_table[{INCLUDING | EXCLUDING}
{DEFAULTS | CONSTRAINTS}] ...}
[, ... ] ]
)
[ INHERITS ( parent_table[, ... ] ) ]
[ WITH ( storage_parameter=value[, ... ] ) ]
[ ON COMMIT {PRESERVE ROWS | DELETE ROWS | DROP} ]
[ TABLESPACE tablespace]
[ DISTRIBUTED BY (column, [ ... ] ) | DISTRIBUTED RANDOMLY ]
[ PARTITION BY partition_type(column)
[ SUBPARTITION BY partition_type(column) ]
[ SUBPARTITION TEMPLATE ( template_spec ) ]
[...]
( partition_spec)
| [ SUBPARTITION BY partition_type(column) ]
[...]
( partition_spec
[ ( subpartition_spec
[(...)]
) ]
)
/** The storage parameters accepted by WITH are as follows */
// TRUE = append-optimized storage, FALSE = heap storage
APPENDONLY={TRUE|FALSE}
// data block size in bytes
BLOCKSIZE={8192-2097152}
// COLUMN = column storage, ROW = row storage (COLUMN requires APPENDONLY=TRUE)
ORIENTATION={COLUMN|ROW}
CHECKSUM={TRUE|FALSE}
// data compression type
COMPRESSTYPE={ZLIB|QUICKLZ|RLE_TYPE|NONE}
// data compression level; compressed storage reduces I/O when querying
COMPRESSLEVEL={1-9}
FILLFACTOR={10-100}
// defaults to FALSE when creating a table
OIDS[=TRUE|FALSE]
CREATE TABLE "public"."test" (
    "user"   varchar(15),
    "namd"   varchar(2),
    "create" date,
    "zt"     NUMERIC(2)
)
WITH (orientation=column, appendonly=true, compresslevel=5)
DISTRIBUTED RANDOMLY
PARTITION BY RANGE ("create")
(
    PARTITION p201501 START ('2015-01-01'::date) END ('2015-02-01'::date),
    PARTITION p201502 START ('2015-02-01'::date) END ('2015-03-01'::date)
);
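The example above distributes rows randomly. For comparison, here is a minimal sketch of hash distribution. The table name test_hash and its columns are hypothetical and used only for illustration.

```sql
-- Hypothetical table: rows are assigned to segments by hashing user_id.
CREATE TABLE public.test_hash (
    user_id varchar(15),
    zt      numeric(2)
)
DISTRIBUTED BY (user_id);

-- Check how evenly the rows are spread across segments.
-- gp_segment_id is a system column available on every Greenplum table.
SELECT gp_segment_id, count(*) AS row_count
FROM public.test_hash
GROUP BY gp_segment_id
ORDER BY gp_segment_id;
```

A distribution column with many distinct values generally spreads rows more evenly; a heavily skewed column concentrates work on a few segments.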
3. Data import
Real-time (row-by-row) writes are slow, so data is generally loaded into tables from files through external tables. Populating a table from a query with CREATE TABLE AS or INSERT INTO ... SELECT is also very fast.
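As a rough sketch of that loading path, assume a gpfdist file server is already running on a host named etlhost, port 8081, serving CSV files whose columns match public.test. The host, port, file pattern, and the table names ext_test and test_201501 are all assumptions made for illustration.

```sql
-- Readable external table that exposes the CSV files as rows.
-- Assumption: gpfdist is serving test_*.csv at etlhost:8081.
CREATE EXTERNAL TABLE public.ext_test (LIKE public.test)
LOCATION ('gpfdist://etlhost:8081/test_*.csv')
FORMAT 'CSV' (DELIMITER ',');

-- Bulk load: all segments read from the external table in parallel.
INSERT INTO public.test
SELECT * FROM public.ext_test;

-- Alternatively, materialize a query result directly into a new table.
CREATE TABLE public.test_201501
WITH (appendonly=true, orientation=column)
AS
SELECT *
FROM public.test
WHERE "create" >= date '2015-01-01'
  AND "create" <  date '2015-02-01'
DISTRIBUTED RANDOMLY;
```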
Index types supported by GP include B-tree and Bitmap.
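A minimal sketch of creating each index type on the example table from section 2. The index names are hypothetical, and treating zt as a low-cardinality column is an assumption.

```sql
-- B-tree index: the usual choice for high-cardinality columns.
CREATE INDEX idx_test_user ON public.test USING btree ("user");

-- Bitmap index: generally suited to columns with few distinct values.
-- Assumption: zt holds only a small set of status codes.
CREATE INDEX idx_test_zt ON public.test USING bitmap (zt);
```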
4. Functions
GP is built on PostgreSQL, so most PostgreSQL functions, including analytic functions and window functions, are supported in GP.
Commonly used date functions: to_date, to_char, date_part, date_trunc
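A short illustration of these date functions, using literal values chosen only for the example:

```sql
-- Parse text into a date using a format pattern.
SELECT to_date('2015-01-15', 'YYYY-MM-DD');

-- Format a date as text; this yields '2015-01'.
SELECT to_char(date '2015-01-15', 'YYYY-MM');

-- Extract a single field; this yields 1 (the month).
SELECT date_part('month', date '2015-01-15');

-- Truncate to the start of the month; this yields 2015-01-01 00:00:00.
SELECT date_trunc('month', timestamp '2015-01-15 10:30:00');
```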
5. Range queries over tens of billions of rows, and optimizing value retrieval with grouped, sorted window functions
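The source gives no detail for this topic, so the following is only a generic sketch of the pattern the heading appears to describe, not the author's specific optimization. It takes the latest row per "user" within a date range of the example table from section 2.

```sql
-- Restrict the scan to one date range first (partition elimination can
-- then skip the other partitions), then rank rows within each group.
SELECT "user", "create", zt
FROM (
    SELECT "user", "create", zt,
           row_number() OVER (PARTITION BY "user"
                              ORDER BY "create" DESC) AS rn
    FROM public.test
    WHERE "create" >= date '2015-01-01'
      AND "create" <  date '2015-02-01'
) ranked
WHERE rn = 1;
```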
6. An unusual recursive optimization method for distinct xx and count(distinct xx)
7. PostgreSQL reference material
Notes
1) After creating a partitioned table and its indexes, are the indexes not being used for statistical analysis or queries?
Run ANALYZE on the table first.
2) Oracle-style SQL precompilation does not help here (queries are slower); it is more efficient to run the SQL directly and let it execute in parallel.
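For note 1), refreshing the planner statistics and then inspecting the plan might look like the sketch below. It uses the example table from section 2, and the query itself is only illustrative.

```sql
-- Collect statistics so the planner can cost index scans realistically.
ANALYZE public.test;

-- Inspect the chosen plan to see whether an index is used.
EXPLAIN
SELECT count(*)
FROM public.test
WHERE "create" >= date '2015-01-01'
  AND "create" <  date '2015-02-01';
```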