数据存储---GreenPlum实践

       GreenPlum作为一个很好的开源的数据仓库,一直想体验一把,最近装了一个.关于安装,网上资料介绍也多了不少,至少越来越多,不过真正装一把才知道还是有各种坑的,不是那么一帆风顺的.

       Greenplum在线安装文档 http://gpdb.docs.pivotal.io/500/install_guide/prep_os_install_gpdb.html#topic8

       整个文档的确是相当长,而且还需要一点E文和Linux.我也没有完全按照上面说的来操作.下面记录下基本的操作和遇到的一些问题.

       1, 调整系统参数,具体的不贴了,文档上面也有,Centos6.5和7上面的操作还是有些区别的

  vi /etc/sysctl.conf
  vi /etc/security/limits.conf

       2, 防火墙和HOSTS配置,这些操作看似简单,但是最容易出错,一定要注意检查.

       3, 下载,解压,安装,具体步骤省略......

      gpadmin用户下

      vi ~/.bash_profile 

      source /usr/local/greenplum-db/greenplum_path.sh
      export MASTER_DATA_DIRECTORY=/data/master/gpseg-1

  或者 source /usr/local/greenplum-db/greenplum_path.sh

      指定hostfile_exkeys文件,里面包括所有的主机,这个文件放的位置可以随意,只要下面执行的能找到文件或者参数里面把路径加上就可以

  gpseginstall -f hostfile_exkeys

     确认安装情况

gpssh -f hostfile_exkeys -e ls -l $GPHOME

gpssh-exkeys -f hostfile_exkeys


MASTER上创建数据存储目录
mkdir /data/master
chown gpadmin /data/master

Segment上创建数据存储目录
gpssh -f hostfile_gpssh_segonly -e 'mkdir /data/primary'
gpssh -f hostfile_gpssh_segonly -e 'mkdir /data/mirror'
gpssh -f hostfile_gpssh_segonly -e 'chown gpadmin /data/primary'
gpssh -f hostfile_gpssh_segonly -e 'chown gpadmin /data/mirror'

     4, 修改配置文件

     cd /home/gpadmin/gpconfigs (没有就创建目录)

    vi gpinitsystem_config

ARRAY_NAME="EMC Greenplum DW"
SEG_PREFIX=gpseg
PORT_BASE=40000

declare -a DATA_DIRECTORY=(/data/primary)

MASTER_HOSTNAME=vm11
MASTER_DIRECTORY=/data/master
MASTER_PORT=5432

TRUSTED_SHELL=ssh
CHECK_POINT_SEGMENTS=8
ENCODING=UNICODEdeclare -a MIRROR_DATA_DIRECTORY=(/data/mirror)

DATABASE_NAME=gpexmp
MACHINE_LIST_FILE=/home/gpadmin/seg_hosts

    我把所有的注释都去掉了,这里面几个是一定要配置的,需要注意!

      5, 开始初始化GP.

     gpinitsystem -c /home/gpadmin/gpconfigs/gpinitsystem_config  执行之后的日志是这样的:



[gpadmin@vm11 ~]$ gpinitsystem -c /home/gpadmin/gpconfigs/gpinitsystem_config
20171006:06:17:43:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Checking configuration parameters, please wait...
20171006:06:17:43:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Reading Greenplum configuration file /home/gpadmin/gpconfigs/gpinitsystem_config
20171006:06:17:43:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Locale has not been set in /home/gpadmin/gpconfigs/gpinitsystem_config, will set to default value
20171006:06:17:44:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Locale set to en_US.utf8
20171006:06:17:44:109203 gpinitsystem:vm11:gpadmin-[INFO]:-MASTER_MAX_CONNECT not set, will set to default value 250
20171006:06:17:44:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Checking configuration parameters, Completed
20171006:06:17:44:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Commencing multi-home checks, please wait...
...
20171006:06:17:45:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Configuring build for standard array
20171006:06:17:45:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Commencing multi-home checks, Completed
20171006:06:17:45:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Building primary segment instance array, please wait...
...
20171006:06:17:47:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Checking Master host
20171006:06:17:47:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Checking new segment hosts, please wait...
..20171006:06:17:52:109203 gpinitsystem:vm11:gpadmin-[WARN]:-----------------------------------------------------------
20171006:06:17:52:109203 gpinitsystem:vm11:gpadmin-[WARN]:-Host vm22 is assigned as localhost in /etc/hosts
20171006:06:17:52:109203 gpinitsystem:vm11:gpadmin-[WARN]:-This will cause segment->master communication failures
20171006:06:17:52:109203 gpinitsystem:vm11:gpadmin-[WARN]:-Remove vm22 from local host line in /etc/hosts
20171006:06:17:52:109203 gpinitsystem:vm11:gpadmin-[WARN]:-----------------------------------------------------------
.
20171006:06:17:54:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Checking new segment hosts, Completed
20171006:06:17:54:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Greenplum Database Creation Parameters
20171006:06:17:54:109203 gpinitsystem:vm11:gpadmin-[INFO]:---------------------------------------
20171006:06:17:54:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master Configuration
20171006:06:17:54:109203 gpinitsystem:vm11:gpadmin-[INFO]:---------------------------------------
20171006:06:17:54:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master instance name       = EMC Greenplum DW
20171006:06:17:54:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master hostname            = vm11
20171006:06:17:54:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master port                = 5432
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master instance dir        = /data/master/gpseg-1
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master LOCALE              = en_US.utf8
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Greenplum segment prefix   = gpseg
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master Database            = gpexmp
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master connections         = 250
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master buffers             = 128000kB
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Segment connections        = 750
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Segment buffers            = 128000kB
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Checkpoint segments        = 8
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Encoding                   = UNICODE
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Postgres param file        = Off
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Initdb to be used          = /usr/local/greenplum-db/./bin/initdb
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-GP_LIBRARY_PATH is         = /usr/local/greenplum-db/./lib
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Ulimit check               = Passed
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Array host connect type    = Single hostname per node
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master IP address [1]      = 192.168.122.1
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master IP address [2]      = 192.168.37.128
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master IP address [3]      = fe80::20c:29ff:feb1:4c90
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Standby Master             = Not Configured
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Primary segment #          = 1
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Total Database segments    = 3
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Trusted shell              = ssh
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Number segment hosts       = 3
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Mirroring config           = OFF
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:----------------------------------------
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Greenplum Primary Segment Configuration
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:----------------------------------------
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-vm11         /data/primary/gpseg0    40000   2       0
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-vm22         /data/primary/gpseg1    40000   3       1
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-vm33         /data/primary/gpseg2    40000   4       2
Continue with Greenplum creation Yy/Nn>
Y
20171006:06:18:22:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Building the Master instance database, please wait...
20171006:06:18:30:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Starting the Master in admin mode
20171006:06:18:56:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Commencing parallel build of primary segment instances
20171006:06:18:57:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Spawning parallel processes    batch [1], please wait...
...
20171006:06:18:57:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Waiting for parallel processes batch [1], please wait...
............................
20171006:06:19:26:109203 gpinitsystem:vm11:gpadmin-[INFO]:------------------------------------------------
20171006:06:19:26:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Parallel process exit status
20171006:06:19:26:109203 gpinitsystem:vm11:gpadmin-[INFO]:------------------------------------------------
20171006:06:19:26:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Total processes marked as completed           = 3
20171006:06:19:26:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Total processes marked as killed              = 0
20171006:06:19:26:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Total processes marked as failed              = 0
20171006:06:19:26:109203 gpinitsystem:vm11:gpadmin-[INFO]:------------------------------------------------
20171006:06:19:27:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Deleting distributed backout files
20171006:06:19:27:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Removing back out file
20171006:06:19:27:109203 gpinitsystem:vm11:gpadmin-[INFO]:-No errors generated from parallel processes
20171006:06:19:27:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Restarting the Greenplum instance in production mode
20171006:06:19:27:122973 gpstop:vm11:gpadmin-[INFO]:-Starting gpstop with args: -a -l /home/gpadmin/gpAdminLogs -i -m -d /data/master/gpseg-1
20171006:06:19:27:122973 gpstop:vm11:gpadmin-[INFO]:-Gathering information and validating the environment...
20171006:06:19:27:122973 gpstop:vm11:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20171006:06:19:27:122973 gpstop:vm11:gpadmin-[INFO]:-Obtaining Segment details from master...
20171006:06:19:27:122973 gpstop:vm11:gpadmin-[INFO]:-Greenplum Version: 'postgres (Greenplum Database) 4.3.16.1 build 1'
20171006:06:19:27:122973 gpstop:vm11:gpadmin-[INFO]:-There are 0 connections to the database
20171006:06:19:27:122973 gpstop:vm11:gpadmin-[INFO]:-Commencing Master instance shutdown with mode='immediate'
20171006:06:19:27:122973 gpstop:vm11:gpadmin-[INFO]:-Master host=vm11
20171006:06:19:27:122973 gpstop:vm11:gpadmin-[INFO]:-Commencing Master instance shutdown with mode=immediate
20171006:06:19:27:122973 gpstop:vm11:gpadmin-[INFO]:-Master segment instance directory=/data/master/gpseg-1
20171006:06:19:28:122973 gpstop:vm11:gpadmin-[INFO]:-Attempting forceful termination of any leftover master process
20171006:06:19:28:122973 gpstop:vm11:gpadmin-[INFO]:-Terminating processes for segment /data/master/gpseg-1
20171006:06:19:29:123060 gpstart:vm11:gpadmin-[INFO]:-Starting gpstart with args: -a -l /home/gpadmin/gpAdminLogs -d /data/master/gpseg-1
20171006:06:19:29:123060 gpstart:vm11:gpadmin-[INFO]:-Gathering information and validating the environment...
20171006:06:19:29:123060 gpstart:vm11:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 4.3.16.1 build 1'
20171006:06:19:29:123060 gpstart:vm11:gpadmin-[INFO]:-Greenplum Catalog Version: '201310150'
20171006:06:19:29:123060 gpstart:vm11:gpadmin-[INFO]:-Starting Master instance in admin mode
20171006:06:19:30:123060 gpstart:vm11:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20171006:06:19:30:123060 gpstart:vm11:gpadmin-[INFO]:-Obtaining Segment details from master...
20171006:06:19:30:123060 gpstart:vm11:gpadmin-[INFO]:-Setting new master era
20171006:06:19:30:123060 gpstart:vm11:gpadmin-[INFO]:-Master Started...
20171006:06:19:30:123060 gpstart:vm11:gpadmin-[INFO]:-Shutting down master
20171006:06:19:31:123060 gpstart:vm11:gpadmin-[INFO]:-Commencing parallel segment instance startup, please wait...
........ 
20171006:06:19:39:123060 gpstart:vm11:gpadmin-[INFO]:-Process results...
20171006:06:19:39:123060 gpstart:vm11:gpadmin-[INFO]:-----------------------------------------------------
20171006:06:19:39:123060 gpstart:vm11:gpadmin-[INFO]:-   Successful segment starts                                            = 3
20171006:06:19:39:123060 gpstart:vm11:gpadmin-[INFO]:-   Failed segment starts                                                = 0
20171006:06:19:39:123060 gpstart:vm11:gpadmin-[INFO]:-   Skipped segment starts (segments are marked down in configuration)   = 0
20171006:06:19:39:123060 gpstart:vm11:gpadmin-[INFO]:-----------------------------------------------------
20171006:06:19:39:123060 gpstart:vm11:gpadmin-[INFO]:-
20171006:06:19:39:123060 gpstart:vm11:gpadmin-[INFO]:-Successfully started 3 of 3 segment instances 
20171006:06:19:39:123060 gpstart:vm11:gpadmin-[INFO]:-----------------------------------------------------
20171006:06:19:39:123060 gpstart:vm11:gpadmin-[INFO]:-Starting Master instance vm11 directory /data/master/gpseg-1 
20171006:06:19:51:123060 gpstart:vm11:gpadmin-[INFO]:-Command pg_ctl reports Master vm11 instance active
20171006:06:19:53:123060 gpstart:vm11:gpadmin-[INFO]:-No standby master configured.  skipping...
20171006:06:19:53:123060 gpstart:vm11:gpadmin-[INFO]:-Database successfully started
20171006:06:19:58:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Completed restart of Greenplum instance in production mode
20171006:06:19:58:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Loading gp_toolkit...
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Scanning utility log file for any warning messages
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[WARN]:-*******************************************************
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[WARN]:-Scan of log file indicates that some warnings or errors
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[WARN]:-were generated during the array creation
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Please review contents of log file
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-/home/gpadmin/gpAdminLogs/gpinitsystem_20171006.log
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-To determine level of criticality
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-These messages could be from a previous run of the utility
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-that was called today!
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[WARN]:-*******************************************************
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Greenplum Database instance successfully created
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-------------------------------------------------------
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-To complete the environment configuration, please 
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-update gpadmin .bashrc file with the following
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-1. Ensure that the greenplum_path.sh file is sourced
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-2. Add "export MASTER_DATA_DIRECTORY=/data/master/gpseg-1"
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-   to access the Greenplum scripts for this instance:
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-   or, use -d /data/master/gpseg-1 option for the Greenplum scripts
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-   Example gpstate -d /data/master/gpseg-1
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Script log file = /home/gpadmin/gpAdminLogs/gpinitsystem_20171006.log
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-To remove instance, run gpdeletesystem utility
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-To initialize a Standby Master Segment for this Greenplum instance
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Review options for gpinitstandby
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-------------------------------------------------------
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-The Master /data/master/gpseg-1/pg_hba.conf post gpinitsystem
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-has been configured to allow all hosts within this new
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-array to intercommunicate. Any hosts external to this
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-new array must be explicitly added to this file
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Refer to the Greenplum Admin support guide which is
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-located in the /usr/local/greenplum-db/./docs directory
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-------------------------------------------------------

    出错过好几次,总算搞定了. gpstop出错,目录存在,系统已经在运行...等等.

     ps -ef | grep gpadmin

[gpadmin@vm11 ~]$ kill -9 101840  
[gpadmin@vm11 ~]$  rm /tmp/.s.PGSQL.5432.lock


rm -rf /data/master/gpseg-1
rm -rf /data/primary/gpseg1
rm -rf /data/primary/gpseg0
rm -rf /data/primary/gpseg1
rm -rf /data/primary/gpseg2

    6, 运行SQL

[gpadmin@vm11 conf]$ psql -p 5432 -d gpexmp
psql (8.2.15)
Type "help" for help.


create table t1(id int primary key,cn int,name varchar(40)) distributed by (id);
insert into t1 select generate_series(1,100),generate_series(1,100),generate_series(1,100);


select gp_segment_id,count(1) from t1 group by 1;
 gp_segment_id | count 
---------------+-------
             2 |    37
             1 |    32
             0 |    31

看到效果了,成功了

猜你喜欢

转载自blog.csdn.net/hugolyl/article/details/78167450