Greenplum 5.21.1 cluster installation deployment detail

In simple terms GPDB is a distributed database software, which can manage and process massive amounts of data spread across multiple different hosts. For GPDB, the example is actually a DB by a plurality of independent instances PostgreSQL, which are distributed on different physical host, work together, is presented to the user in a DB effect. Master is GPDB access entry system, which is responsible for handling client connections and SQL commands, coordinate systems other Instance (Segment) work, Segment is responsible for managing and processing of user data.

First, the user operates the Master Host Root

1. Modify the / etc / hosts file, add the following contents (Note: the same configuration servers 4)

vim /etc/hosts

192.168.18.130 gp-master
192.168.18.131 gp-standby
192.168.18.132 gp-node1
192.168.18.133 gp-node2

2. The server closed selinux, firewall 4 servers open to each other, the test environment can directly turn off the firewall. (Note: the same configuration servers 4)

Close Firewalld

systemctl stop firewalld
systemctl disable firewalld

Permanently closed Selinux

vim /etc/selinux/conf

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of three two values:
#     targeted - Targeted processes are protected,
#     minimum - Modification of targeted policy. Only selected processes are protected. 
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted 

NOTE: See Selinux running state: getenforce, CLI interface is provided a non-permanent SeLinux: setenforce 0 (0-1 corresponding to the closed and open)

3. Operating system parameter settings

 vim /etc/sysctl.conf (Note: the same configuration servers 4)

kernel.shmmax = 500000000
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sem = 250 512000 100 2048
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
net.ipv4.tcp_syncookies = 1
net.ipv4.ip_forward = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.conf.default.arp_filter = 1
net.core.netdev_max_backlog = 10000
vm.overcommit_memory = 2
kernel.msgmni = 2048
net.ipv4.ip_local_port_range = 1025 65535

 vim /etc/security/limits.conf (Note: the same configuration servers 4)

* Soft nofile 65536 
* hard nofile 65536 
* soft nproc 131072 
* hard nproc 131072

Disk read-ahead algorithm parameters and deadline modification (Note: the same configuration servers 4)

blockdev --setra 65536 /dev/sda
echo deadline > /sys/block/sda/queue/scheduler

Note: The drive letter sda need to be configured according to their own situation

软件下载地址:https://network.pivotal.io/products/pivotal-gpdb,下载:greenplum-db-5.21.1-rhel7-x86_64.rpm

在Master主机上安装GP二进制文件,也就是主机名是mdw的服务器。(注:在master上安装即可,后面通过批量的方法安装剩下的服务器)

rpm -ivh greenplum-db-5.21.1-rhel7-x86_64.rpm

注:默认安装目录:/usr/local

 在Master上添加gpadmin用户

adduser gpadmin
echo gpadmin | passwd --stdin gpadmin

 注:设置密码为了后面gpssh-exkeys -f hostfile_allhosts 使用

 在Master上给gpadmin用户提权

[root@gp-master ~]# visudo  
gpadmin    ALL=(ALL)       ALL
gpadmin    ALL=(ALL)       NOPASSWD:ALL

在Master主机上赋予gpadmin用户Greenplum文件夹的的权限

chown -R gpadmin.gpadmin /usr/local/greenplum-db*

 二、Master 主机 Gpadmin用户上操作

准备用于批量安装软件以及后续集群的初始化文件,hostfile_allhosts,hostfile_segments,hostfile_mshosts,存放到/home/gpadmin

su - gpadmin

vim hostfile_allhosts

gp-master
gp-standby
gp-node1
gp-node2

vim hostfile_segments

gp-node1
gp-node2

vim hostfile_mshosts

gp-master
gp-standby

设置各主机之间免密登录

gpssh-exkeys -f hostfile_allhosts

注:需输入gpadmin用户的密码,此处为:gpadmin

设置用于安装Greenplum的文件夹权限

gpssh -f hostfile_allhosts
=> sudo chown gpadmin.gpadmin /usr/local
=> exit

创建及赋权master/standby主机元数据存储目录

gpssh -f hostfile_mshosts
=>sudo mkdir /data/greenplum_data/gpmaster
=>sudo chown -R gpadmin.gpadmin /data
=>exit

创建及赋权Segments主机数据存储目录

gpssh -f hostfile_segments
=>sudo mkdir /data/greenplum_data/{primary,mirror}
=>sudo chown -R gpadmin.gpadmin /data
=>exit

批量安装软件(GP)

cd /home/gpadmin/
source /usr/local/greenplum-db/greenplum_path.sh
gpseginstall -f hostfile_allhosts -u gpadmin -p gpadmin

设置NTP同步


 Yum下载安装NTP服务器,已安装的可以略过

sudo yum install ntp -y

若出现如下报错,可看下一步解决方法

There was a problem importing one of the  modules 
required to run yum. The error leading to this problem was: 
  
  No module named yum 
  
Please install a package which provides this module, or 
verify that the module is installed correctly. 
  
It's possible that the above module doesn't match the 
current version of Python, which is: 
2.7.13 (r266:84292, Jan 22 2014, 09:37:14)  
[GCC 4.4.7 20120313 ( 4.4.7-4)] 
  
If you cannot solve this problem yourself, please go to  
the yum faq at: 
  http://yum.baseurl.org/wiki/Faq
View Code

解决方法:

unset PYTHONHOME
unset PYTHONPATH
unset LD_LIBRARY_PATH

 再进行yum安装之后,再修改回来,使得GP能正常使用

source /usr/local/greenplum-db/greenplum_path.sh

注:报错原因:在安装GP集群之后,会在master节点中的环境变量中会增加 PYTHONHOME,PYTHONPATH,LD_LIBRARY_PATH几项,并且会修改原本的path。

补充:LD_LIBRARY_PATH 该环境变量主要用于指定查找共享库(动态链接库)时除了默认路径之外的其他路径。


在每个Segment主机,编辑/etc/ntp.conf文件。设置第一个server参数指向Master主机,第二个server参数指向Standby主机。如下面:

sudo vim /etc/ntp.conf

server gp-master prefer
server gp-standby

在Standby主机,编辑/etc/ntp.conf文件。设置第一个server参数指向Master主机,第二个参数指向数据中心的时间服务器。

sudo vim /etc/ntp.conf

server gp-master prefer

在Master主机,使用NTP守护进程同步所有Segment主机的系统时钟。例如,使用gpssh来完成:

gpssh -f hostfile_allhosts -v -e 'ntpd'

 输出如下代表成功:

[root@gp-master gpadmin]# gpssh -f all_hosts -v -e 'ntpd'
[WARN] Reference default values as $MASTER_DATA_DIRECTORY/gpssh.conf could not be found
Using delaybeforesend 0.05 and prompt_validation_timeout 1.0

[Reset ...]
[INFO] login mdw
[INFO] login smdw
[INFO] login sdw1
[INFO] login sdw2
[ mdw] ntpd
[smdw] ntpd
[sdw1] ntpd
[sdw2] ntpd
[INFO] completed successfully

[Cleanup...]

配置Greenplum初始化文件

cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config   /home/gpadmin/gpinitsystem_config
chmod 775 gpinitsystem_config

 相关配置如下:

[gpadmin@gp-master ~]$ cat gpinitsystem_config 
# FILE NAME: gpinitsystem_config

# Configuration file needed by the gpinitsystem

################################################
#### REQUIRED PARAMETERS
################################################

#### Name of this Greenplum system enclosed in quotes.
ARRAY_NAME="Greenplum Data Platform"

#### Naming convention for utility-generated data directories.
SEG_PREFIX=gpseg

#### Base number by which primary segment port numbers 
#### are calculated.
PORT_BASE=40000

#### File system location(s) where primary segment data directories 
#### will be created. The number of locations in the list dictate
#### the number of primary segments that will get created per
#### physical host (if multiple addresses for a host are listed in 
#### the hostfile, the number of segments will be spread evenly across
#### the specified interface addresses).
declare -a DATA_DIRECTORY=(/data/greenplum_data/primary)

#### OS-configured hostname or IP address of the master host.
MASTER_HOSTNAME=k8s-master

#### File system location where the master data directory 
#### will be created.
MASTER_DIRECTORY=/data/greenplum_data/gpmaster

#### Port number for the master instance. 
MASTER_PORT=5432

#### Shell utility used to connect to remote hosts.
TRUSTED_SHELL=ssh

#### Maximum log file segments between automatic WAL checkpoints.
CHECK_POINT_SEGMENTS=8

#### Default server-side character set encoding.
ENCODING=UTF-8

################################################
#### OPTIONAL MIRROR PARAMETERS
################################################

#### Base number by which mirror segment port numbers 
#### are calculated.
MIRROR_PORT_BASE=43000

#### Base number by which primary file replication port 
#### numbers are calculated.
REPLICATION_PORT_BASE=34000

#### Base number by which mirror file replication port 
#### numbers are calculated. 
MIRROR_REPLICATION_PORT_BASE=44000

#### File system location(s) where mirror segment data directories 
#### will be created. The number of mirror locations must equal the
#### number of primary locations as specified in the 
#### DATA_DIRECTORY parameter.
declare -a MIRROR_DATA_DIRECTORY=(/data/greenplum_data/mirror)


################################################
#### OTHER OPTIONAL PARAMETERS
################################################

#### Create a database of this name after initialization.
DATABASE_NAME=testDB

#### Specify the location of the host address file here instead of
#### with the the -h option of gpinitsystem.
MACHINE_LIST_FILE=/home/gpadmin/hostfile_segments
View Code

 运行初始化工具初始化数据库

source /usr/local/greenplum-db/greenplum_path.sh 
gpinitsystem -c gpinitsystem_config

初始化日志:

20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-Review options for gpinitstandby
20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-------------------------------------------------------
20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-The Master /data/master/gpseg-1/pg_hba.conf post gpinitsystem
20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-has been configured to allow all hosts within this new
20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-array to intercommunicate. Any hosts external to this
20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-new array must be explicitly added to this file
20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-Refer to the Greenplum Admin support guide which is
20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-located in the /usr/local/greenplum-db/./docs directory
20160827:16:23:11:002458 gpinitsystem:mdw:gpadmin-[INFO]:-------------------------------------------------------

现在只有1个master,2个segment,没有standby,那么接下来把standby加入集群。

在Master服务器上执行

gpinitstandby -s gp-standby

输出如下:

[gpadmin@mdw ~]$ gpinitstandby -s smdw
20160827:16:59:24:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Validating environment and parameters for standby initialization...
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Checking for filespace directory /data/master/gpseg-1 on smdw
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:------------------------------------------------------
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master initialization parameters
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:------------------------------------------------------
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum master hostname               = mdw
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum master data directory         = /data/master/gpseg-1
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum master port                   = 5432
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master hostname       = smdw
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master port           = 5432
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum standby master data directory = /data/master/gpseg-1
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Greenplum update system catalog         = On
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:------------------------------------------------------
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:- Filespace locations
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:------------------------------------------------------
20160827:16:59:25:023346 gpinitstandby:mdw:gpadmin-[INFO]:-pg_system -> /data/master/gpseg-1
Do you want to continue with standby master initialization? Yy|Nn (default=N):
> y
20160827:16:59:31:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Syncing Greenplum Database extensions to standby
20160827:16:59:31:023346 gpinitstandby:mdw:gpadmin-[INFO]:-The packages on smdw are consistent.
20160827:16:59:31:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Adding standby master to catalog...
20160827:16:59:31:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Database catalog updated successfully.
20160827:16:59:31:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Updating pg_hba.conf file...
20160827:16:59:37:023346 gpinitstandby:mdw:gpadmin-[INFO]:-pg_hba.conf files updated successfully.
20160827:16:59:39:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Updating filespace flat files...
20160827:16:59:39:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Filespace flat file updated successfully.
20160827:16:59:39:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Starting standby master
20160827:16:59:39:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Checking if standby master is running on host: smdw  in directory: /data/master/gpseg-1
20160827:16:59:40:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Cleaning up pg_hba.conf backup files...
20160827:16:59:46:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Backup files of pg_hba.conf cleaned up successfully.
20160827:16:59:46:023346 gpinitstandby:mdw:gpadmin-[INFO]:-Successfully created standby master on gp-standby
View Code

查看启动进程:

[gpadmin@gp-master ~]$ ps -ef | grep postgres
gpadmin  10975     1  0 00:57 ?        00:00:00 /usr/local/greenplum-db-5.21.1/bin/postgres -D /data/greenplum_data/gpmaster/gpseg-1 -p 5432 --gp_dbid=1 --gp_num_contents_in_cluster=2 --silent-mode=true -i -M master --gp_contentid=-1 -x 0 -E
gpadmin  10976 10975  0 00:57 ?        00:00:00 postgres:  5432, master logger process   
gpadmin  10979 10975  0 00:57 ?        00:00:00 postgres:  5432, stats collector process   
gpadmin  10980 10975  0 00:57 ?        00:00:01 postgres:  5432, writer process   
gpadmin  10981 10975  0 00:57 ?        00:00:00 postgres:  5432, checkpointer process   
gpadmin  10982 10975  0 00:57 ?        00:00:00 postgres:  5432, seqserver process   
gpadmin  10983 10975  0 00:57 ?        00:00:00 postgres:  5432, ftsprobe process   
gpadmin  10984 10975  0 00:57 ?        00:00:00 postgres:  5432, sweeper process   
gpadmin  10985 10975  0 00:57 ?        00:00:05 postgres:  5432, stats sender process   
gpadmin  10986 10975  0 00:57 ?        00:00:01 postgres:  5432, wal writer process   
gpadmin  11279 10975  0 00:59 ?        00:00:00 postgres:  5432, wal sender process gpadmin 192.168.18.131(53573) streaming 0/C05A028
gpadmin  16800 16608  0 04:15 pts/0    00:00:00 grep --color=auto postgres

设置gpadmin用户环境变量,Master,Standby都需设置。 

vim /home/gpadmin/.bashrc

[gpadmin@gp-master ~]$ cat .bashrc 
# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi

# Uncomment the following line if you don't like systemctl's auto-paging feature:
# export SYSTEMD_PAGER=

# User specific aliases and functions

source /usr/local/greenplum-db/greenplum_path.sh 

export MASTER_DATA_DIRECTORY=/data/greenplum_data/gpmaster/gpseg-1
export PGPRORT=5432
export PGDATABASE=testDB
[gpadmin@gp-master ~]$ scp .bashrc gp-standby:`pwd`

启动和停止数据库测试是否能正常启动和关闭,命令如下

gpstart
gpstop 

到此 Greenplum 就部署完成了。下面进行一些简单的测试。

登录数据库:psql -d postgres

建表,插入,查询

postgres=# create table student ( no int primary key,student_name varchar(40),age int);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "student_pkey" for table "student"
CREATE TABLE
postgres=# insert into student values(1,'yayun',18);
INSERT 0 1
postgres=# select * from student;
 no | student_name | age 
----+--------------+-----
  1 | yayun        |  18
(1 row)

Guess you like

Origin www.linuxidc.com/Linux/2019-08/160086.htm