Deploying the Postgres-XL Distributed Database on CentOS 7.2

Copyright notice: this is an original post by the author. If you repost it, please credit [reposted from 皓月如我's CSDN blog] and include the original link: https://blog.csdn.net/fm0517/article/details/81329309

1. Download the installation package

Download it from here: https://download.csdn.net/download/fm0517/10577703

2. Node types and their roles

[GTM]
The global transaction manager. It guarantees consistency of data across the cluster and is in constant communication with the Coordinator and Datanode nodes, which makes it the core of the whole cluster. Only one GTM exists, but a GTM Standby node can be deployed to back it up in real time. If the GTM fails, the whole cluster immediately becomes inaccessible, at which point you can fail over to the GTM Standby. If you deploy a GTM Standby, you should also deploy GTM Proxy, which is usually placed on the same servers as the Coordinators and Datanodes. GTM Proxy proxies the Coordinators' and Datanodes' access to the GTM, which reduces the load on the GTM. Its other important role is to help with GTM failover: when the GTM fails and the GTM Standby becomes the new GTM, the Coordinators and Datanodes do not need to be re-pointed at a new GTM address; the GTM Proxy simply reconnects to the new GTM.

[Coordinator]
The nodes that receive data access requests; essentially they are made up of PostgreSQL backend processes. When a query arrives, the Coordinator builds the execution plan and dispatches the query to the Datanodes that hold the data involved. When writing data, it likewise routes the rows to the appropriate nodes according to the table's distribution strategy. In other words, the Coordinator holds the cluster's global view of where data is located. Coordinator nodes can be scaled out at will; apart from their addresses they are completely equivalent, and data updated through one Coordinator is immediately visible through any other. Each Coordinator can be given a corresponding standby node to avoid a single point of failure.

[Datanode]
The nodes that actually store and retrieve data. They receive requests from the Coordinators, execute the SQL to read or write data, and also communicate with each other. In general, a single Datanode does not hold the complete data set, and Datanodes do not serve client access directly. A table's data can be laid out on the Datanodes in one of two modes, replication or distribution. In replication mode a table has a full copy on each of the specified nodes; in distribution mode a table's rows are spread over multiple Datanodes according to some rule, and those nodes together hold one complete copy of the data. The mode is chosen when the table is created, in the CREATE TABLE statement, with the following syntax:

CREATE TABLE table_name(...)
DISTRIBUTE BY 
HASH(col)|MODULO(col)|ROUNDROBIN|REPLICATION
TO NODE(nodename1,nodename2...)

As you can see, if DISTRIBUTE BY is followed by REPLICATION the table is replicated; otherwise it is distributed. HASH distributes rows by the hash of the specified column, MODULO distributes rows by the modulo of the specified column, and ROUNDROBIN hands rows to the nodes in turn. TO NODE restricts which nodes take part in the distribution; if it is omitted, all Datanodes participate by default. If no distribution clause is given at all, i.e. a plain CREATE TABLE is used, Postgres-XL defaults to distributing the data across all Datanodes.
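
For instance, here is a minimal sketch (the table and column names are illustrative; the node names match the cluster configured later in this post):

CREATE TABLE t_replicated (id int, val text)
    DISTRIBUTE BY REPLICATION
    TO NODE (datanode1, datanode2);     -- every listed datanode holds a full copy

CREATE TABLE t_hashed (id int, val text)
    DISTRIBUTE BY HASH (id);            -- rows are spread across all datanodes by hash of id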

3. Host planning

GTM: server0: 192.168.51.140
Coordinator: server1: 192.168.51.141
Datanode1 Master: server2: 192.168.51.142
Datanode2 Master: server3: 192.168.51.143
Datanode1 Slave: server4: 192.168.51.144
Datanode2 Slave: server5: 192.168.51.145

----- Datanode3 will be used later to demonstrate adding and removing nodes dynamically -----
Datanode3 Master: server6: 192.168.51.146

[Steps 4 through 13 below must be performed on every node]

4. Edit the hosts file

# vi /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.51.140    server0
192.168.51.141    server1
192.168.51.142    server2
192.168.51.143    server3
192.168.51.144    server4
192.168.51.145    server5
192.168.51.146    server6
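
You can verify that each hostname resolves to the expected address, for example:

# getent hosts server3
192.168.51.143  server3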

5. Install the required packages

openssh-clients, flex, bison, readline-devel, zlib-devel, openjade, docbook-style-dsssl, gcc
Use rpm to check whether each one is already installed (replace xxx below with the package name):

# rpm -qa |grep xxx

If any of them are missing, install them with yum.
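
For example, they can all be installed in one step with the package names listed above:

# yum install -y openssh-clients flex bison readline-devel zlib-devel openjade docbook-style-dsssl gcc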

6. Remove the PostgreSQL that ships with the OS

# rpm -qa |grep postgresql
# rpm -qa |grep postgresql | xargs rpm -e --nodeps
# rpm -qa |grep postgresql

7. Create the postgres user

Create the group:
# groupadd postgres
Create the user:
# useradd -m -d /home/postgres postgres -g postgres
Set the password:
# passwd postgres
Enter a password, e.g. 12345678
Note: if you ever need to remove the postgres user, run as root:
# userdel -r postgres
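
To confirm the account exists, you can run (the numeric uid/gid will vary on your system):

# id postgres
uid=1001(postgres) gid=1001(postgres) groups=1001(postgres)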

8. Set up passwordless SSH

As the root user:

# ssh-keygen
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@server0
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@server1
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@server2
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@server3
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@server4
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@server5
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@server6

As the postgres user:

$ ssh-keygen
$ ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@server0
$ ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@server1
$ ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@server2
$ ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@server3
$ ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@server4
$ ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@server5
$ ssh-copy-id -i ~/.ssh/id_rsa.pub postgres@server6

If SSH still prompts for a password after this, fix the permissions as follows:

chmod 755 /home/postgres
chmod 700 /home/postgres/.ssh
chmod 644 /home/postgres/.ssh/authorized_keys
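
To confirm that passwordless login works, a quick check from any node as the postgres user is to loop over all hosts; each command should print the remote hostname without asking for a password:

$ for h in server0 server1 server2 server3 server4 server5 server6; do ssh postgres@$h hostname; done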

9. Adjust kernel parameters

Add the kernel.sem parameter to /etc/sysctl.conf. The four values are, in order, SEMMSL (max semaphores per set), SEMMNS (max semaphores system-wide), SEMOPM (max operations per semop call) and SEMMNI (max number of semaphore sets):

# vi /etc/sysctl.conf
kernel.sem = 50100 128256000 50100 2560

Apply the setting:

# sysctl -p

Verify that it took effect:

# ipcs -ls

------ Semaphore Limits --------
max number of arrays = 2560
max semaphores per array = 50100
max semaphores system wide = 128256000
max ops per semop call = 50100
semaphore max value = 32767

10. Configure the firewall

Disable the firewall:

# systemctl stop firewalld
# systemctl disable firewalld

Alternatively, open the required ports in the firewall.
Use this approach when the firewall cannot be turned off.
The ports to open follow from the host plan and the cluster configuration in section 14. Taking 5432 as an example, the following commands open a port permanently:

# sudo firewall-cmd --zone=public --add-port=5432/tcp --permanent
# sudo firewall-cmd --reload
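
For this deployment the remaining ports are the GTM port (6666), the coordinator pooler port (20004), the datanode ports (5433, 15433) and the datanode pooler ports (20005, 20015), as set in the configuration in section 14. They can be opened the same way on the nodes that need them, for example:

# sudo firewall-cmd --zone=public --add-port=6666/tcp --permanent
# sudo firewall-cmd --zone=public --add-port=20004/tcp --permanent
# sudo firewall-cmd --zone=public --add-port=5433/tcp --permanent
# sudo firewall-cmd --zone=public --add-port=15433/tcp --permanent
# sudo firewall-cmd --zone=public --add-port=20005/tcp --permanent
# sudo firewall-cmd --zone=public --add-port=20015/tcp --permanent
# sudo firewall-cmd --reload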

Note: check the firewall status with: systemctl status firewalld
Note: start the firewall with: systemctl start firewalld
Note: stop the firewall with: systemctl stop firewalld

11. Disable SELinux

Check the current SELinux status:

# /usr/sbin/sestatus -v  

If the SELinux status field shows enabled, SELinux is on.
To disable SELinux permanently, edit its configuration file:

# vi /etc/selinux/config

Change SELINUX=enforcing to SELINUX=disabled.
The change takes effect after a reboot.
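
To put SELinux into permissive mode immediately, without waiting for the reboot, you can also run:

# setenforce 0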

12. Install the Postgres-XL software

Unpack the archive and give ownership to the postgres user:

# tar -jxvf postgres-xl-9.5r1.6.tar.bz2
# chown -R postgres:postgres postgres-xl-9.5r1.6

Switch to the postgres user:

# su - postgres

Configure, build and install:

$ cd postgres-xl-9.5r1.6
$ ./configure --prefix=/home/postgres/pgxl9.5
$ make
$ make install

Build and install the contrib modules (these include the pgxc_ctl utility used below):

$ cd contrib
$ make
$ make install

13. Configure environment variables

Add the following to the postgres user's .bash_profile:

# su - postgres
$ vi .bash_profile

export PGHOME=/home/postgres/pgxl9.5
export PGUSER=postgres
export LD_LIBRARY_PATH=$PGHOME/lib:$LD_LIBRARY_PATH
export PATH=$PGHOME/bin:$PATH

Reload the environment:

$ source ~/.bash_profile

Verify that the environment is in effect:

$ pg_ctl --version

Output: pg_ctl (PostgreSQL) 9.5.8 (Postgres-XL 9.5r1.6)

14. Configure the Postgres-XL cluster

On the GTM node, run the following as the postgres user:

$ pgxc_ctl              --- on the first run this reports an Error about a missing configuration file; ignore it
PGXC prepare            --- this command generates a configuration file template
PGXC exit               --- quit the pgxc_ctl interactive shell

After it exits, a pgxc_ctl directory will have been created under the postgres user's home directory. Edit the pgxc_ctl.conf file inside it:

$ vi pgxc_ctl.conf

Edit it so that it reads:

#!/usr/bin/env bash

# The pgxcInstallDir variable is needed if you invoke the "deploy" command from the pgxc_ctl utility.
# If you don't, you don't need this variable.
pgxcInstallDir=$PGHOME
pgxlDATA=$PGHOME/data
#---- OVERALL -----------------------------------------------------------------------------
#
pgxcOwner=postgres      # owner of the Postgres-XC database cluster.  Here, we use this
                        # both as the linux user and the database user.  This must be
                        # the superuser of each coordinator and datanode.
pgxcUser=$pgxcOwner        # OS user of the Postgres-XC owner

tmpDir=/tmp                 # temporary dir used in XC servers
localTmpDir=$tmpDir            # temporary dir used here locally

configBackup=n                  # If you want config file backup, specify y to this value.
configBackupHost=pgxc-linker    # host to backup config file
configBackupDir=$HOME/pgxc     # Backup directory
configBackupFile=pgxc_ctl.bak   # Backup file name --> Need to synchronize when original changed.

#---- GTM ------------------------------------------------------------------------------------

#---- GTM Master -----------------------------------------------

#---- Overall ----
gtmName=gtm
gtmMasterServer=server0
gtmMasterPort=6666
gtmMasterDir=$pgxlDATA/nodes/gtm

#---- Configuration ---
gtmExtraConfig=none         # Will be added to gtm.conf for both Master and Slave (done at initialization only)
gtmMasterSpecificExtraConfig=none   # Will be added to Master's gtm.conf (done at initialization only)

#---- Coordinators ----------------------------------------------------------------------------------------------------

#---- shortcuts ----------
coordMasterDir=$pgxlDATA/nodes/coord
coordSlaveDir=$pgxlDATA/nodes/coord_slave
coordArchLogDir=$pgxlDATA/nodes/coord_archlog

#---- Overall ------------
coordNames=(coord1)     # Master and slave use the same name
coordPorts=(5432)           # Master ports
poolerPorts=(20004)         # Master pooler ports
coordPgHbaEntries=(0.0.0.0/0)               # Assumes that all the coordinator (master/slave) accepts
                                                # the same connection
                                                # This entry allows only $pgxcOwner to connect.
                                                # If you'd like to setup another connection, you should
                                                # supply these entries through files specified below.

#---- Master -------------
coordMasterServers=(server1)        # none means this master is not available
coordMasterDirs=($coordMasterDir)
coordMaxWALSender=10    # max_wal_senders: needed to configure slaves. If zero is specified,
                        # this parameter is expected to be supplied explicitly through the external
                        # files specified below.  If you don't configure slaves, leave this value at zero.
coordMaxWALSenders=($coordMaxWALSender)
                        # max_wal_senders configuration for each coordinator.

#---- Configuration files---
coordExtraConfig=coordExtraConfig   # Extra configuration file for coordinators.  
                        # This file will be added to all the coordinators'
                        # postgresql.conf
# Please note that the following sets up the minimum parameters, which you may want to change.
# You can put your postgresql.conf lines here.
cat > $coordExtraConfig <<EOF
#================================================
# Added to all the coordinator postgresql.conf
# Original: $coordExtraConfig
log_destination = 'stderr'
logging_collector = on
log_directory = 'pg_log'
listen_addresses = '*'
max_connections = 512
EOF

# Additional Configuration file for specific coordinator master.
# You can define each setting by similar means as above.
coordSpecificExtraConfig=(none none)
coordExtraPgHba=none    # Extra entry for pg_hba.conf.  This file will be added to all the coordinators' pg_hba.conf
coordSpecificExtraPgHba=(none none)

#---- Datanodes -------------------------------------------------------------------------------------------------------

#---- Shortcuts --------------
datanodeMasterDir=$pgxlDATA/nodes/dn_master
datanodeSlaveDir=$pgxlDATA/nodes/dn_slave
datanodeArchLogDir=$pgxlDATA/nodes/datanode_archlog

#---- Overall ---------------
#primaryDatanode=datanode1              # Primary Node.
# At present, XC has a problem issuing ALTER NODE against the primary node.  Until it is fixed, the test will be done
# without this feature.
primaryDatanode=datanode1               # Primary Node.
datanodeNames=(datanode1 datanode2)
datanodePorts=(5433 5433)   # Master ports
datanodePoolerPorts=(20005 20005)   # Master pooler ports
datanodePgHbaEntries=(0.0.0.0/0)    # Assumes that all the datanodes (master/slave) accept
                                        # the same connection
                                        # This list sets up pg_hba.conf for $pgxcOwner user.
                                        # If you'd like to setup other entries, supply them
                                        # through extra configuration files specified below.
# Note: The above parameter is extracted as "host all all 0.0.0.0/0 trust".   If you don't want
# such setups, specify the value () for this variable and supply what you want using datanodeExtraPgHba
# and/or datanodeSpecificExtraPgHba variables.
#datanodePgHbaEntries=(::1/128) # Same as above but for IPv6 addresses

#---- Master ----------------
datanodeMasterServers=(server2 server3) # none means this master is not available.
                                                    # This means the master should exist but is currently down.
                                                    # The cluster is not operational until the master is
                                                    # recovered and ready to run.   
datanodeMasterDirs=($datanodeMasterDir/dn1 $datanodeMasterDir/dn2)
datanodeMaxWalSender=10                             # max_wal_senders: needed to configure slave. If zero value is 
                                                    # specified, it is expected this parameter is explicitly supplied
                                                    # by external configuration files.
                                                    # If you don't configure slaves, leave this value zero.
datanodeMaxWALSenders=($datanodeMaxWalSender $datanodeMaxWalSender)
                        # max_wal_senders configuration for each datanode

#---- Slave -----------------
datanodeSlave=y         # Specify y if you configure at least one datanode slave.  Otherwise, the following
                        # configuration parameters will be set to empty values.
                        # If no effective server names are found (that is, all servers are specified as none),
                        # then datanodeSlave value will be set to n and all the following values will be set to
                        # empty values.
datanodeSlaveServers=(server4 server5)  # value none means this slave is not available
datanodeSlavePorts=(15433 15433)    # value none means this slave is not available
datanodeSlavePoolerPorts=(20015 20015)  # value none means this slave is not available
datanodeSlaveSync=y     # If datanode slave is connected in synchronized mode
datanodeSlaveDirs=($datanodeSlaveDir $datanodeSlaveDir)
datanodeArchLogDirs=( $datanodeArchLogDir $datanodeArchLogDir)

# ---- Configuration files ---
# You may supply your bash script to setup extra config lines and extra pg_hba.conf entries here.
# These files will go to corresponding files for the master.
# Or you may supply these files manually.
datanodeExtraConfig=none    # Extra configuration file for datanodes.  This file will be added to all the 
                            # datanodes' postgresql.conf
datanodeSpecificExtraConfig=(none none)
datanodeExtraPgHba=none     # Extra entry for pg_hba.conf.  This file will be added to all the datanodes' pg_hba.conf
datanodeSpecificExtraPgHba=(none none)
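
Note: because the software was built and installed on every node in steps 4-13, the pgxc_ctl "deploy" step is not needed here. If you had installed the binaries only on the GTM node, they could be pushed out to the other servers with something like the following (this relies on pgxcInstallDir being set, as above):

$ pgxc_ctl -c /home/postgres/pgxc_ctl/pgxc_ctl.conf deploy all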

15. Initialize the cluster

On the GTM node, as the postgres user, run the following commands.
Initialize the cluster:

$ pgxc_ctl  -c  ~/pgxc_ctl/pgxc_ctl.conf init all

Start the cluster:

$ pgxc_ctl  -c /home/postgres/pgxc_ctl/pgxc_ctl.conf start all
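
Once the cluster is up, a quick sanity check (the host, port and user names follow the configuration above) is to monitor every component with pgxc_ctl and then connect to the coordinator and list the cluster nodes:

$ pgxc_ctl -c /home/postgres/pgxc_ctl/pgxc_ctl.conf monitor all
$ psql -h server1 -p 5432 -U postgres postgres -c "SELECT node_name, node_type, node_host, node_port FROM pgxc_node;"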

Stop the cluster:

$ pgxc_ctl -c /home/postgres/pgxc_ctl/pgxc_ctl.conf stop all

16. Delete the cluster

On every node, as the postgres user:

$ rm -rf /home/postgres/pgxl9.5/data/nodes
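
Alternatively, pgxc_ctl can stop the components and remove their data directories for you from the GTM node; a sketch assuming the same configuration file as above:

$ pgxc_ctl -c /home/postgres/pgxc_ctl/pgxc_ctl.conf clean all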
