Getting Started with Greenplum

Greenplum is a distributed database. Underneath, it is a set of PostgreSQL instances across which databases and tables are sharded. It has the following characteristics:

  • Supports standard SQL: almost any SQL that PostgreSQL supports, Greenplum supports as well
  • Supports ACID and distributed transactions
  • Supports clusters of hundreds of machines (this is a weak point; Hadoop clusters can reach ten thousand machines or more)

System architecture

[Image: Greenplum system architecture diagram]

Master Host

  • Handles user requests, generates the execution plan, and performs any final aggregation (e.g. avg) or sorting that the plan requires
  • Contains its own PostgreSQL database, which stores all metadata and index information
  • Monitors the status of all segments
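
To make the "final aggregation on the master" point concrete, here is a minimal sketch of how an avg could be computed in two phases. It is an illustration only, not Greenplum code: the function names and the three in-memory "segments" are hypothetical, and the assumption is that each segment returns a partial (sum, count) which the master then combines.

```python
from typing import List, Optional, Tuple


def segment_partial_avg(values: List[float]) -> Tuple[float, int]:
    """Runs on each (hypothetical) segment: partial sum and row count."""
    return sum(values), len(values)


def master_final_avg(partials: List[Tuple[float, int]]) -> Optional[float]:
    """Runs on the (hypothetical) master: combine the partials into one avg."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    if count == 0:
        return None  # no rows at all, so there is no average
    return total / count


# Three made-up segments, each holding a different share of the rows.
segments = [[10.0, 20.0], [30.0], [40.0, 50.0, 60.0]]
partials = [segment_partial_avg(rows) for rows in segments]
result = master_final_avg(partials)
```

The master cannot simply average the per-segment averages, because the segments hold different numbers of rows. That is why the sketch ships a sum and a count from each segment.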

Segment Host

  • Each segment host runs multiple segments; the number of segments generally equals the number of CPU cores
  • Each segment is a PostgreSQL database and is responsible for storing the actual data

Internal Network

Greenplum's internal network (the interconnect) uses UDP, but Greenplum verifies the packets itself, so the reliability is equivalent to TCP. If TCP is used instead, at most 1000 segments are supported.
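
The text above only says that Greenplum checks packets on top of UDP. It does not say how. As a rough sketch of what such a check can look like, the code below frames each payload with a sequence number and a CRC32 and rejects anything damaged or out of order. The header layout and function names are assumptions made for illustration, not Greenplum's actual wire format.

```python
import struct
import zlib
from typing import Optional

# Hypothetical header: 4-byte sequence number + 4-byte CRC32, network byte order.
HEADER = struct.Struct("!II")


def make_packet(seq: int, payload: bytes) -> bytes:
    """Sender side: prepend the sequence number and a checksum of the payload."""
    return HEADER.pack(seq, zlib.crc32(payload)) + payload


def check_packet(packet: bytes, expected_seq: int) -> Optional[bytes]:
    """Receiver side: return the payload if intact and in order, else None."""
    if len(packet) < HEADER.size:
        return None
    seq, crc = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:]
    if seq != expected_seq or zlib.crc32(payload) != crc:
        return None  # a real receiver would drop it and wait for a resend
    return payload


good = make_packet(0, b"row-batch")
# Flip every bit of the last byte to simulate corruption in transit.
corrupt = good[:-1] + bytes([good[-1] ^ 0xFF])
```

A receiver that drops bad packets and a sender that resends unacknowledged ones together give TCP-like reliability over UDP. The resend side is left out here to keep the sketch short.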

Execution plan

When the master receives a SQL statement, it parses the statement into an execution plan in the form of a DAG. The parts of the DAG that need no data exchange are grouped into slices. Multi-table joins, aggregates, sorts and similar operations cut across slices, and a motion task performs the redistribution of data between them. Each slice is dispatched to the segments involved.

I think a slice is similar to the concept of a stage in Spark: there is no data shuffle inside it.
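
The slicing described above can be sketched as a walk over the plan tree that starts a new slice every time it crosses a motion. This is a toy model under my own assumptions: PlanNode, the operator names and the rule "the subtree under a motion becomes a new slice" are hypothetical and only illustrate the idea, they are not Greenplum's planner.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PlanNode:
    op: str
    children: List["PlanNode"] = field(default_factory=list)


def is_motion(node: PlanNode) -> bool:
    """In this toy model, a motion is any operator whose name ends in 'motion'."""
    return node.op.endswith("motion")


def split_into_slices(root: PlanNode) -> Dict[int, List[str]]:
    """Group operators into slices; each motion's input starts a new slice."""
    slices: Dict[int, List[str]] = {}
    next_id = 0

    def visit(node: PlanNode, slice_id: int) -> None:
        nonlocal next_id
        slices.setdefault(slice_id, []).append(node.op)
        for child in node.children:
            if is_motion(node):
                next_id += 1          # data is exchanged here: new slice below
                visit(child, next_id)
            else:
                visit(child, slice_id)  # no exchange: stay in the same slice

    visit(root, 0)
    return slices


# A made-up plan for a two-table join whose result is gathered on the master.
plan = PlanNode("gather motion", [
    PlanNode("hash join", [
        PlanNode("seq scan orders"),
        PlanNode("hash", [
            PlanNode("redistribute motion", [PlanNode("seq scan customers")]),
        ]),
    ]),
])
slices = split_into_slices(plan)
```

Here the plan falls into three slices: the gather on the master, the join with its local scan, and the scan that feeds the redistribute motion. Within each slice no data crosses the network, which is the same property a Spark stage has.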

Motion types

  • gather motion (N->1): gathers the data of all segments onto the master node; typically for sort, sort group, and sort join
  • broadcast motion (N->N): each segment broadcasts its data to all of the other segments
  • redistribute motion (N->N): each segment redistributes its data by hash
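
The three motions can be simulated on plain Python lists to see what ends up where. This is only a sketch under assumptions: each inner list stands for one segment, the rows are made up, and zlib.crc32 is a stand-in hash chosen because it is deterministic. It is not the hash function Greenplum uses.

```python
import zlib
from typing import List, Tuple

Row = Tuple[str, int]  # hypothetical row: (distribution key, value)


def gather(segments: List[List[Row]]) -> List[Row]:
    """N->1: collect the rows of every segment in one place (the master)."""
    return [row for seg in segments for row in seg]


def broadcast(segments: List[List[Row]]) -> List[List[Row]]:
    """N->N: every segment ends up holding a full copy of all rows."""
    everything = gather(segments)
    return [list(everything) for _ in segments]


def redistribute(segments: List[List[Row]]) -> List[List[Row]]:
    """N->N: each row moves to the segment chosen by hashing its key."""
    n = len(segments)
    out: List[List[Row]] = [[] for _ in range(n)]
    for seg in segments:
        for row in seg:
            target = zlib.crc32(row[0].encode("utf-8")) % n  # stand-in hash
            out[target].append(row)
    return out


segments = [[("a", 1), ("b", 2)], [("a", 3)], [("c", 4), ("b", 5)]]
gathered = gather(segments)
broadcasted = broadcast(segments)
redistributed = redistribute(segments)
```

After redistribute, all rows with the same key sit on the same segment, which is what lets a join or a group by on that key run locally. Broadcast reaches the same goal by copying everything everywhere, so it only pays off for small tables.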

Author: Liangqiu _ but not the arrival of spring
Link: https://www.jianshu.com/p/9be1439f5bd3
Source: Jianshu
Copyright belongs to the author. For reproduction in any form, please contact the author for authorization and cite the source.

Origin blog.csdn.net/oZuoLuo123/article/details/88370307