Greenplum writable external tables in practice

Table of Contents

1. What is a gp writable external table?

2. Examples 

3. Does the external table have transactions?


 


1. What is a gp writable external table?

To create a writable external table, you must declare it with the WRITABLE keyword. Data can be written through gpfdist or to an executable program; writing directly to local files is not supported.

  1. For the execute type, the external data source is declared as: EXECUTE '/var/load_scripts/get_log_data.sh'
  2. EXECUTE selects the protocol used to read and write the data, and /var/load_scripts/get_log_data.sh is the executable program to run.
  3. The execute type supports both reading external data and writing external data.
  4. Greenplum implements execute-type reads and writes by running the defined external program (for example, get_log_data.sh) and connecting to it through a pipe. When reading external data, the program's standard output is used as the data source. When writing data out, the rows of the table are passed to the program's standard input.
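
The two directions described above can be sketched as a pair of table definitions. This is only an illustration under stated assumptions: the table names log_in and log_out, the column list, the '|' delimiter and the `cat > /tmp/...` command are all hypothetical, and only the script path comes from the text above.

```sql
-- Readable: each segment runs the script and parses its standard output as rows.
-- (Column list and delimiter are assumptions; the script must emit matching text.)
CREATE EXTERNAL WEB TABLE log_in (id int, msg text)
EXECUTE '/var/load_scripts/get_log_data.sh'
FORMAT 'TEXT' (DELIMITER '|');

-- Writable: each segment starts the command and pipes the inserted rows
-- to its standard input. Here a hypothetical command simply saves them
-- to one file per segment.
CREATE WRITABLE EXTERNAL WEB TABLE log_out (id int, msg text)
EXECUTE 'cat > /tmp/log_out_$GP_SEGMENT_ID.txt'
FORMAT 'TEXT' (DELIMITER '|')
DISTRIBUTED BY (id);

-- Unload rows: a writable external table accepts INSERT only, not SELECT.
INSERT INTO log_out SELECT id, msg FROM log_in;
```

Because the command runs once per segment, the output here would end up as several files on the segment hosts rather than as a single file.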

 

2. Examples 

2.1 View all environment variables that are available when a command is executed on each segment. Create the external table as follows:

=# create external web table exec_example(id int, name varchar(100), value text) EXECUTE 'env|xargs -I {} echo $GP_SEGMENT_ID={}' format 'TEXT' (DELIMITER '=') LOG ERRORS SEGMENT REJECT LIMIT 10 ROWS;
=# select * from exec_example limit 10;
 id |         name          |                   value                    
----+-----------------------+--------------------------------------------
  3 | GP_USER               | gpadmin
  3 | GP_HADOOP_CONN_JARDIR | lib//hadoop
  3 | LC_MONETARY           | C
  3 | GP_CID                | 0
  3 | GPERA                 | 09877cd46d8003f1_201030102232
  3 | GP_SEG_PG_CONF        | /datap4/gpseg3/postgresql.conf
  3 | SHELL                 | /bin/bash
  3 | GPPERFMONHOME         | /usr/local/greenplum-cc-web-2.0.0-build-32
  3 | SSH_CLIENT            | 10.5***3 28799 22
  3 | LC_NUMERIC            | C

3. Does the external table have transactions?

The READABLE and WRITABLE keywords indicate whether an external table can be read or written. Greenplum external tables are therefore either read-only or write-only; reading from and writing to the same external table is currently not supported. If the external table protocol supports both directions, as S3 and GPFDIST do, you can use the same URL to create one readable and one writable external table.

Note that the writable external table framework itself does not provide transactional guarantees. Whether data can be rolled back after an error depends on the implementation of the specific protocol. For example, the S3 protocol rolls back data that has been uploaded but not committed when a write to the external table is interrupted. The GPFDIST protocol does not support rollback: after an insert into a writable external table is interrupted, GPFDIST still keeps the data file that was already uploaded.

Whether an external table is being read or written, all operations run on the primary segments, so data is loaded and unloaded in parallel, segment by segment.
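
The GPFDIST behavior can be tried with a readable and a writable table on the same URL. This is a minimal sketch, not a definitive test: the host etl1, port 8081, file name sales.out, table names and columns are all hypothetical, and a gpfdist process is assumed to be running on that host and port.

```sql
-- Writable side: rows inserted here are sent to gpfdist, which appends
-- them to sales.out in its working directory.
CREATE WRITABLE EXTERNAL TABLE sales_out (txn_id int, amount numeric)
LOCATION ('gpfdist://etl1:8081/sales.out')
FORMAT 'TEXT' (DELIMITER '|')
DISTRIBUTED BY (txn_id);

-- Readable side: a separate table on the same URL.
CREATE EXTERNAL TABLE sales_in (txn_id int, amount numeric)
LOCATION ('gpfdist://etl1:8081/sales.out')
FORMAT 'TEXT' (DELIMITER '|');

-- Write inside a transaction, then abort it.
BEGIN;
INSERT INTO sales_out SELECT 1, 9.99;
ROLLBACK;

-- Since gpfdist does not roll back, the row written above is expected
-- to still be present in the file and visible here.
SELECT * FROM sales_in;
```

If the unloaded file has to be consistent, one option is to write to a fresh file name for each run and delete it yourself when the load fails.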

 

Origin blog.csdn.net/MyySophia/article/details/113643857