【Offline Data Warehouse Project from 0】— Building the Data Warehouse Environment (1)

Table of contents

1. Server environment preparation

1.2 Write the cluster distribution script xsync

1.3 SSH passwordless login configuration

1.4 JDK preparation

1.5 Environment Variable Configuration Instructions

2. A script to view all processes in the cluster

3. Zookeeper installation

3.1 Distributed installation and deployment

3.2 ZK cluster start and stop script

3.3 Client command line operation


1. Server environment preparation

Switch to root privileges:

sudo su -

  

Prepare three virtual machines. The virtual machine configuration requirements are as follows:

(1) Per virtual machine: 4 GB memory, 50 GB hard disk

(2) Modify the static IP of the cloned virtual machine

[root@hadoop102 ~]# vim /etc/sysconfig/network-scripts/ifcfg-ens33

Change the contents to:

DEVICE=ens33

TYPE=Ethernet

ONBOOT=yes

BOOTPROTO=static

NAME="ens33"

PREFIX=24

IPADDR=192.168.10.102

GATEWAY=192.168.10.2

DNS1=192.168.10.2

(3) Open the Linux virtual machine's virtual network editor: Edit -> Virtual Network Editor -> VMnet8

(4) Check the IP address of the Windows adapter "VMware Network Adapter VMnet8"

(5) Make sure the IP address in the Linux configuration file, the subnet in the virtual network editor, and the Windows VMnet8 adapter are all on the same network segment.

2) Modify the host name

(1) Modify the host name

[root@hadoop102 ~]# hostnamectl --static set-hostname hadoop102

(2) Configure host name mapping by opening /etc/hosts

[root@hadoop102 ~]# vim /etc/hosts

Add the following content

192.168.10.102 hadoop102

192.168.10.103 hadoop103

192.168.10.104 hadoop104

(3) Modify the Windows host mapping file (hosts file)

(a) Enter the C:\Windows\System32\drivers\etc path

(b) Open the hosts file and add the following content

192.168.10.102 hadoop102

192.168.10.103 hadoop103

192.168.10.104 hadoop104

3) Turn off and disable the firewall

[root@hadoop102 ~]# systemctl stop firewalld

[root@hadoop102 ~]# systemctl disable firewalld

4) Give the ordinary user (atguigu) root privileges

[root@hadoop102 ~]# vim /etc/sudoers

In the /etc/sudoers file, find the %wheel line (around line 102) and add a line beneath it:

## Allow root to run any commands anywhere

root    ALL=(ALL)     ALL

%wheel  ALL=(ALL)       ALL

atguigu   ALL=(ALL)     NOPASSWD: ALL

5) Create folders under the /opt directory

(1) Create module and software folders in the /opt directory

[root@hadoop102 opt]# mkdir /opt/module /opt/software

(2) Modify the owner of the module and software folders

[root@hadoop102 opt]# chown atguigu:atguigu /opt/module /opt/software

6) Reboot

[root@hadoop102 module]# reboot

1.2 Write the cluster distribution script xsync

1) xsync cluster distribution script

(1) Requirement: Copy files to the same directory of all nodes in a loop

(2) Requirement analysis

① The raw rsync command:

rsync -av /opt/module root@hadoop103:/opt/

② Expected script usage:

xsync <name of the file or directory to synchronize>

③ Note: scripts placed in /home/atguigu/bin can be run by the atguigu user from anywhere on the system, because ~/bin is on the user's PATH by default on CentOS.

(3) Script implementation

① Create a bin folder under the home directory /home/atguigu

[atguigu@hadoop102 ~]$ mkdir bin

② Create an xsync file in the /home/atguigu/bin directory so it can be called from anywhere

[atguigu@hadoop102 ~]$ cd /home/atguigu/bin

[atguigu@hadoop102 bin]$ vim xsync

Write the following code in this file

#!/bin/bash
#1. Check the number of arguments
if [ $# -lt 1 ]
then
    echo "Not Enough Arguments!"
    exit 1
fi
#2. Loop over every machine in the cluster
for host in hadoop102 hadoop103 hadoop104
do
    echo ====================  $host  ====================
    #3. Loop over all files/directories and send them one by one
    for file in "$@"
    do
        #4. Check whether the file exists
        if [ -e "$file" ]
        then
            #5. Get the parent directory (absolute, symlinks resolved)
            pdir=$(cd -P "$(dirname "$file")"; pwd)
            #6. Get the file name
            fname=$(basename "$file")
            ssh "$host" "mkdir -p $pdir"
            rsync -av "$pdir/$fname" "$host:$pdir"
        else
            echo "$file does not exist!"
        fi
    done
done

③ Give the xsync script execute permission

[atguigu@hadoop102 bin]$ chmod +x xsync

④ Test script

[atguigu@hadoop102 bin]$ xsync xsync
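The least obvious part of xsync is how it resolves each argument into an absolute parent directory plus a bare file name. A standalone sketch of that logic, run against a throwaway file under /tmp (hypothetical path, for illustration only):

```shell
# Standalone demo of the pdir/fname logic used by xsync.
file=$(mktemp -d)/sub/demo.txt           # e.g. /tmp/tmp.XXXX/sub/demo.txt
mkdir -p "$(dirname "$file")" && touch "$file"
pdir=$(cd -P "$(dirname "$file")"; pwd)  # absolute parent dir, symlinks resolved
fname=$(basename "$file")                # bare file name
echo "$fname"                            # prints: demo.txt
```

rsync then receives "$pdir/$fname" on the local side and "$host:$pdir" on the remote side, which is why the target directory layout mirrors the source exactly.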

1.3 SSH passwordless login configuration

Explanation: only hadoop102 and hadoop103 are configured here for passwordless login to the other hosts, because hadoop102 runs the NameNode and hadoop103 runs the ResourceManager, and both need passwordless access to the other nodes.

(1) Generate public and private keys on hadoop102:

[atguigu@hadoop102 .ssh]$ ssh-keygen -t rsa

Then press Enter three times; two files will be generated: id_rsa (private key) and id_rsa.pub (public key)
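For scripting, the same key pair can be generated non-interactively; a sketch that writes to a temporary directory so it does not touch the real ~/.ssh:

```shell
# Non-interactive equivalent of the three-Enter ssh-keygen run above,
# writing to a temporary directory instead of ~/.ssh (for illustration).
tmp=$(mktemp -d)
ssh-keygen -t rsa -q -N '' -f "$tmp/id_rsa"   # -N '' = empty passphrase, -q = quiet
ls "$tmp"                                      # id_rsa  id_rsa.pub
```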

(2) Copy the hadoop102 public key to the target machines to enable passwordless login

[atguigu@hadoop102 .ssh]$ ssh-copy-id hadoop102

[atguigu@hadoop102 .ssh]$ ssh-copy-id hadoop103

[atguigu@hadoop102 .ssh]$ ssh-copy-id hadoop104

(3) Generate public and private keys on hadoop103:

[atguigu@hadoop103 .ssh]$ ssh-keygen -t rsa

Then press Enter three times; two files will be generated: id_rsa (private key) and id_rsa.pub (public key)

(4) Copy the hadoop103 public key to the target machines to enable passwordless login

[atguigu@hadoop103 .ssh]$ ssh-copy-id hadoop102

[atguigu@hadoop103 .ssh]$ ssh-copy-id hadoop103

[atguigu@hadoop103 .ssh]$ ssh-copy-id hadoop104

1.4 JDK preparation

1) Uninstall the existing JDK (3 nodes)

[atguigu@hadoop102 opt]$ sudo rpm -qa | grep -i java | xargs -n1 sudo rpm -e --nodeps

[atguigu@hadoop103 opt]$ sudo rpm -qa | grep -i java | xargs -n1 sudo rpm -e --nodeps

[atguigu@hadoop104 opt]$ sudo rpm -qa | grep -i java | xargs -n1 sudo rpm -e --nodeps

(1) rpm -qa: Indicates to query all installed software packages

(2) grep -i: Indicates case insensitive when filtering

(3) xargs -n1: passes the previous command's output to the next command one item at a time

(4) rpm -e --nodeps: means to uninstall the software
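The effect of -n1 is easy to see in isolation: without it, xargs hands all items to a single invocation; with it, the command runs once per item:

```shell
# xargs -n1 splits its input into one command invocation per item.
printf 'a b c' | xargs echo        # one call:    prints "a b c"
printf 'a b c' | xargs -n1 echo    # three calls: prints a, b, c on separate lines
```

This is why each installed Java package above gets its own `rpm -e --nodeps` invocation.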

2) Use the Xftp tool to import the JDK to the /opt/software folder of hadoop102

Click the icon to open Xftp

The left window corresponds to the Windows file system and the right window to the Linux file system. Find the target directory and drag the JDK package into the right window to complete the upload.

3) Check that the package was imported successfully into /opt/software on the Linux system

[atguigu@hadoop102 software]$ ls /opt/software/

See the following results:

jdk-8u212-linux-x64.tar.gz

4) Unzip the JDK to the /opt/module directory

[atguigu@hadoop102 software]$ tar -zxvf jdk-8u212-linux-x64.tar.gz -C /opt/module/

5) Configure JDK environment variables

(1) Create a new /etc/profile.d/my_env.sh file

[atguigu@hadoop102 module]$ sudo vim /etc/profile.d/my_env.sh

Add the following content, then save and exit (:wq)

#JAVA_HOME

export JAVA_HOME=/opt/module/jdk1.8.0_212

export PATH=$PATH:$JAVA_HOME/bin

(2) Let environment variables take effect

[atguigu@hadoop102 software]$ source /etc/profile.d/my_env.sh

6) Test whether the JDK is installed successfully

[atguigu@hadoop102 module]$ java -version

If you see the following output, the JDK is installed correctly:

java version "1.8.0_212"

7) Distribute the JDK

[atguigu@hadoop102 module]$ xsync /opt/module/jdk1.8.0_212/

8) Distribution environment variable configuration file

[atguigu@hadoop102 module]$ sudo /home/atguigu/bin/xsync /etc/profile.d/my_env.sh

9) Execute source on hadoop103 and hadoop104 respectively

[atguigu@hadoop103 module]$ source /etc/profile.d/my_env.sh

[atguigu@hadoop104 module]$ source /etc/profile.d/my_env.sh

1.5 Environment Variable Configuration Instructions

Linux environment variables can be configured in multiple files, such as /etc/profile, /etc/profile.d/*.sh, ~/.bashrc, and ~/.bash_profile. The relationships among these files, and the differences between them, are described below.

Bash can run as a login shell or a non-login shell.

For example, logging in to the system by entering a user name and password at a terminal gives a login shell, while a command executed remotely via ssh hadoop103 <command> runs on hadoop103 in a non-login shell.

The main difference between the two is which configuration files they load at startup: a login shell loads /etc/profile, ~/.bash_profile, and ~/.bashrc, while a non-login shell loads only ~/.bashrc.

When ~/.bashrc is loaded (it in turn loads /etc/bashrc), or when /etc/profile is loaded, a loop is executed that sources every file matching /etc/profile.d/*.sh. So whether the shell is a login shell or a non-login shell, the environment variables in /etc/profile.d/*.sh are loaded at startup.
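On CentOS 7, the relevant fragment of /etc/profile (and /etc/bashrc) looks roughly like the loop below — paraphrased here, and demonstrated against a temporary directory rather than the real /etc/profile.d so it can run anywhere:

```shell
# Paraphrase of the loop in /etc/profile (and /etc/bashrc) on CentOS 7: it
# sources every *.sh file under /etc/profile.d, which is why my_env.sh is
# picked up by login and non-login shells alike. Demonstrated against a
# temporary directory instead of the real /etc/profile.d.
profdir=$(mktemp -d)
echo 'export DEMO_VAR=hello' > "$profdir/my_env.sh"
for i in "$profdir"/*.sh ; do
    if [ -r "$i" ]; then
        . "$i"
    fi
done
echo "$DEMO_VAR"    # prints: hello
```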

2. A script to view all processes in the cluster

1) Create the script xcall.sh in the /home/atguigu/bin directory

[atguigu@hadoop102 bin]$ vim xcall.sh

2) Write the following in the script

#!/bin/bash
for i in hadoop102 hadoop103 hadoop104
do
    echo --------- $i ----------
    ssh $i "$*"
done

3) Give the script execute permission

[atguigu@hadoop102 bin]$ chmod 777 xcall.sh

4) Run the script

[atguigu@hadoop102 bin]$ xcall.sh jps
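Note that xcall.sh forwards its arguments with "$*", which joins them into a single string so the remote side receives the whole command line as one command; "$@" would instead expand to separate words. A small local sketch of the difference (the functions and arguments are illustrative):

```shell
# "$*" joins all arguments into a single word; "$@" keeps them separate.
show_star() { printf '[%s]\n' "$*"; }
show_at()   { printf '[%s]\n' "$@"; }
show_star jps -l    # prints: [jps -l]
show_at jps -l      # prints: [jps] then [-l] on separate lines
```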

3. Zookeeper installation

3.1 Distributed installation and deployment

1) Cluster planning

Deploy Zookeeper on three nodes of hadoop102, hadoop103 and hadoop104.

            server hadoop102    server hadoop103    server hadoop104
Zookeeper   Zookeeper           Zookeeper           Zookeeper

2) Unzip and install

(1) Unzip the Zookeeper installation package to the /opt/module/ directory

[atguigu@hadoop102 software]$ tar -zxvf apache-zookeeper-3.7.1-bin.tar.gz -C /opt/module/

(2) Modify the name of /opt/module/apache-zookeeper-3.7.1-bin to zookeeper

[atguigu@hadoop102 module]$ mv apache-zookeeper-3.7.1-bin/ zookeeper

3) Configure the server number

(1) Create zkData in the /opt/module/zookeeper/ directory

[atguigu@hadoop102 zookeeper]$ mkdir zkData

(2) Create a myid file in the /opt/module/zookeeper/zkData directory

[atguigu@hadoop102 zkData]$ vim myid

Note: create the myid file directly in Linux; creating it in Notepad++ is likely to introduce encoding problems (garbled content)

Add the number corresponding to the server in the file:

2

4) Configure the zoo.cfg file

(1) Rename zoo_sample.cfg in the /opt/module/zookeeper/conf directory to zoo.cfg

[atguigu@hadoop102 conf]$ mv zoo_sample.cfg zoo.cfg

(2) Open the zoo.cfg file

[atguigu@hadoop102 conf]$ vim zoo.cfg

Modify data storage path configuration

dataDir=/opt/module/zookeeper/zkData

Add the following configuration

#######################cluster##########################

server.2=hadoop102:2888:3888

server.3=hadoop103:2888:3888

server.4=hadoop104:2888:3888

(3) Synchronize the contents of the /opt/module/zookeeper directory to hadoop103, hadoop104

[atguigu@hadoop102 module]$ xsync zookeeper/

(4) Modify the contents of the myid file on hadoop103 and hadoop104 to 3 and 4 respectively

(5) Interpretation of zoo.cfg configuration parameters

server.A=B:C:D

A is a number indicating the server's ID.

In cluster mode, each server has a myid file in its dataDir directory containing the value of A. When Zookeeper starts, it reads this file and compares the value with the configuration in zoo.cfg to determine which server it is.

B is the address of the server.

C is the port this server's Follower uses to exchange information with the cluster's Leader.

D is the election port: if the cluster's Leader goes down, the servers use this port to communicate with each other and elect a new Leader.

5) Cluster operation

(1) Start Zookeeper separately

[atguigu@hadoop102 zookeeper]$ bin/zkServer.sh start

[atguigu@hadoop103 zookeeper]$ bin/zkServer.sh start

[atguigu@hadoop104 zookeeper]$ bin/zkServer.sh start

(2) View status

[atguigu@hadoop102 zookeeper]$ bin/zkServer.sh status

JMX enabled by default

Using config: /opt/module/zookeeper/bin/../conf/zoo.cfg

Mode: follower

[atguigu@hadoop103 zookeeper]$ bin/zkServer.sh status

JMX enabled by default

Using config: /opt/module/zookeeper/bin/../conf/zoo.cfg

Mode: leader

[atguigu@hadoop104 zookeeper]$ bin/zkServer.sh status

JMX enabled by default

Using config: /opt/module/zookeeper/bin/../conf/zoo.cfg

Mode: follower

3.2 ZK cluster start and stop script

1) Create a script in the /home/atguigu/bin directory of hadoop102

[atguigu@hadoop102 bin]$ vim zk.sh

Write the following in the script:

#!/bin/bash

case $1 in
"start"){
    for i in hadoop102 hadoop103 hadoop104
    do
        echo ---------- zookeeper $i start ------------
        ssh $i "/opt/module/zookeeper/bin/zkServer.sh start"
    done
};;
"stop"){
    for i in hadoop102 hadoop103 hadoop104
    do
        echo ---------- zookeeper $i stop ------------
        ssh $i "/opt/module/zookeeper/bin/zkServer.sh stop"
    done
};;
"status"){
    for i in hadoop102 hadoop103 hadoop104
    do
        echo ---------- zookeeper $i status ------------
        ssh $i "/opt/module/zookeeper/bin/zkServer.sh status"
    done
};;
esac

2) Give the script execute permission

[atguigu@hadoop102 bin]$ chmod 777 zk.sh

3) Start the Zookeeper cluster

[atguigu@hadoop102 module]$ zk.sh start

4) Stop the Zookeeper cluster

[atguigu@hadoop102 module]$ zk.sh stop

3.3 Client command line operation

Command basic syntax    Function description
help                    show all operation commands
ls path                 list the child nodes of the current znode
                          -w  watch for child node changes
                          -s  append secondary (stat) information
create                  create a node
                          -s  create a sequential node
                          -e  create an ephemeral node (removed on restart or session timeout)
get path                get the value of a node
                          -w  watch for node content changes
                          -s  append secondary (stat) information
set                     set the value of a node
stat                    view node status
delete                  delete a node
deleteall               delete a node recursively

1) Start the client

[atguigu@hadoop103 zookeeper]$ bin/zkCli.sh
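A short example session (illustrative only; the znode name /test and its value are hypothetical) exercising the commands from the table above:

```
[zk: localhost:2181(CONNECTED) 0] ls /
[zk: localhost:2181(CONNECTED) 1] create /test "hello"
[zk: localhost:2181(CONNECTED) 2] get /test
[zk: localhost:2181(CONNECTED) 3] stat /test
[zk: localhost:2181(CONNECTED) 4] delete /test
```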


Origin blog.csdn.net/lxwssjszsdnr_/article/details/131593643