CDH 5.13.1 Installation Documentation


Table of contents

CDH 5.13.1 Installation Documentation

1. Installation introduction

2. Download the required installation files

3. Confirm the host installation environment

4. Planning cluster deployment

5. Modify the hostname

6. Change the host HOSTS mapping file

7. Check host services

8. Check host NTP service configuration

9. Check the host parameter configuration

10. Confirm the Python environment

11. Install dependencies

12. Install the configuration database required by the cluster and create users and databases

13. Create required system users

14. Create directory

15. Install Cloudera Manager

16. Install JDK

17. Initialize the configuration database of CM (executed only on CM Server)

18. Configure Cloudera Manager Agent

19. Move the Parcel files to the target directory (executed only on CM Server)

20. Start CM Server/Agent

21. Log in to CM Server to complete the cluster installation


1.  Installation introduction

This article describes the installation of CDH, using the latest version, CDH 5.13.1, as an example.

CDH offers both automatic online installation and manual offline installation. The method described in this article is completely offline, and is suitable for both test and production environments. The corresponding chapter of the official installation documentation is:

Installation Path C - Manual Installation Using Cloudera Manager Tarballs

2.  Download the required installation files

The offline installation of CDH requires the following files:

The Cloudera Manager tarball (cloudera-manager*.tar.gz) and the CDH Parcel files (the .parcel file, its .sha1 checksum file, and manifest.json)

The JDK installation package: jdk-7u80-linux-x64.rpm

The MySQL installation bundle: MySQL-5.6.29-1.el6.x86_64.rpm-bundle.tar

Python, which CDH depends on: https://www.python.org/ftp/python/2.7.11/Python-2.7.11.tgz

The MySQL JDBC driver: http://central.maven.org/maven2/mysql/mysql-connector-java/5.1.38/mysql-connecto

An ISO image of the installed system release, for installing missing dependencies.

3.  Confirm the host installation environment

1) Confirm whether the hardware of each server meets the requirements:

Space Requirements :

/var: 5GB

/usr : 500MB

CDH  installation directory : 2GB

Check command : df -h

Memory Requirements : 4GB

Check command : free -m

2) Check the system version:

Check command:  cat /etc/issue , required version: Red Hat Enterprise Linux Server release 6.x

3) Check the server data storage space:

CDH recommends that every server use the same data storage paths; there can be more than one path per server.

4.  Planning cluster deployment

CDH recommends at least 3 servers for a cluster deployment. This installation uses 6 servers, planned as follows:

Host name | Role                               | IP
Master    | Hadoop Master, CM Server, DataNode | xxxxx
Slave1    | DataNode, MySQL, CM Agent          | xxxxx
Slave2    | DataNode, CM Agent                 | xxxxx
Slave3    | DataNode, CM Agent                 | xxxxx
Slave4    | DataNode, CM Agent                 | xxxxx
Slave5    | DataNode, CM Agent                 | xxxxx

5.  Modify the host name

Change the host name of each host as planned. The change method is:

1) Use the command to change the host name immediately: hostname <new name>, for example: hostname master

2) Change the system file so that the change also survives a restart:

Change the file: /etc/sysconfig/network

Change the line HOSTNAME=*** to: HOSTNAME=master and save.

Change the host name of all the other hosts in the same way.

6.  Change the host  HOSTS  mapping file

Write the IP address and host name of every cluster server into the /etc/hosts file, because in the installation steps below the servers are always selected by host name. For example, add the following to /etc/hosts on each server:
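For example, assuming placeholder addresses of the form 192.168.1.x (substitute the real IPs of your cluster), each server's /etc/hosts would gain lines like:

```
192.168.1.101 master
192.168.1.102 slave1
192.168.1.103 slave2
192.168.1.104 slave3
192.168.1.105 slave4
192.168.1.106 slave5
```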

7.  Check host services

1) Stop the system firewall service:

# service iptables stop

# chkconfig iptables off

2) Stop the system SELinux

Use the following command to check whether SELinux is enabled:

# getenforce

If the output is Enforcing, SELinux is enabled; disable it with the following operations:

a) Change the system configuration file: /etc/sysconfig/selinux

Change the line SELINUX=enforcing to: SELINUX=disabled

b) Restart the system: reboot

8.  Check the host  NTP  service configuration

1) A distributed cluster requires the clocks of all servers to be synchronized. Check whether the ntpd service has been started:

a) # service ntpd status

b) # ntpq -p : use this command to check whether NTP has a correctly configured time server

2) If the ntpd service is not started or configured, add the time synchronization service in the following ways:

a) Add the line "server <time-server IP>" to /etc/ntp.conf, for example:
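For example, assuming an internal time server at the placeholder address 192.168.1.1, the added line would be:

```
server 192.168.1.1
```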

b) Restart the ntp  service :

# service ntpd restart

9.  Check the host parameter configuration:

1) Set the vm.swappiness kernel parameter:

Append the following line to the /etc/sysctl.conf file:

vm.swappiness = 0

To make the parameters take effect, execute the command:

# sysctl -p

2) To set the transparent hugepage parameter, execute the following command:

# echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag

And add this command to the  /etc/rc.local  file to make it take effect when the system restarts.

10.  Confirm  the Python  environment

1) Determine whether python is installed, and its version, with the following command:

# python -V

The Python version requirements are: 2.6.x or 2.7.x. If it is not installed, or the version is wrong, install the package downloaded in step 2 as follows:

a) Create the installation directory :

# mkdir /usr/local/python27

b) Unzip  the Python-2.7.11.tgz  installation file  downloaded in step  2 :

# tar -xvf Python-2.7.11.tgz

c) Enter the decompression directory, and then execute the following command to compile and install:

# cd Python-2.7.11

# ./configure --prefix=/usr/local/python27

Before running make, change the following line in the Modules/Setup file:

#zlib zlibmodule.c -I$(prefix)/include -L$(exec_prefix)/lib -lz

to:

zlib zlibmodule.c -I$(prefix)/include -L$(exec_prefix)/lib -lz

Uncommenting this line adds the zlib module to the build; it is needed later by the impala-shell command.

# make

# make install

d) Point the system python command at the new version:

# mv /usr/bin/python /usr/bin/python_old

# ln -s /usr/local/python27/bin/python /usr/bin/python

e) Run the check command from step 1) again to confirm that the installation succeeded.

11.  Install dependent packages

Use the following command to check whether the dependent system packages are installed:

# rpm -q chkconfig python bind-utils psmisc libxslt zlib sqlite cyrus-sasl-plain cyrus-sasl-gssapi fuse fuse-libs redhat-lsb

Packages that are not installed can be installed with yum, or manually with rpm. For example, the following uses a local yum repository:

a) Mount the system iso  package into a virtual device :

# mount -o loop rhel-server-6.3-x86_64-dvd.iso /iso

b) Use the iso as a local YUM repository.

Remove the other YUM sources:

# rm -rf /etc/yum.repos.d/*.repo

To add the source, add the following content to the file /etc/yum.repos.d/rhel-source.repo :

[rhel-source]

name=Red Hat Enterprise Linux $releasever - $basearch - Source

baseurl=file:///iso

enabled=1

gpgcheck=0

Install the missing dependencies, for example:
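For example, if rpm -q reported that fuse, fuse-libs, and redhat-lsb were missing (the package names here are only illustrative), they could be installed from the local source with:

```
# yum install -y fuse fuse-libs redhat-lsb
```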

12.  Install the configuration database required by the cluster and create users and databases

The cluster installation requires a relational database to store configuration data. CDH supports Oracle, MySQL, PostgreSQL, and a built-in database. This installation uses MySQL for storage. The installation process is:

a) Unzip the downloaded  MYSQL  installation file :

# tar -xvf MySQL-5.6.29-1.el6.x86_64.rpm-bundle.tar

b) Uninstall any MySQL software already installed:

c) Install MySQL:
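The uninstall and install commands for steps b) and c) are not spelled out above; a typical sequence on RHEL 6, assuming the package names shipped in the 5.6.29 bundle (verify the exact names against your extracted files), is:

```
# rpm -qa | grep -i mysql
# rpm -e --nodeps mysql-libs-5.1.71-1.el6.x86_64
# rpm -ivh MySQL-server-5.6.29-1.el6.x86_64.rpm
# rpm -ivh MySQL-client-5.6.29-1.el6.x86_64.rpm
```

The mysql-libs package name is an example: remove whatever the rpm -qa query actually reports on your system.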

d)  The initial password of the MySQL root user is stored in the file  /root/.mysql_secret  .

e) Start MySQL for the first time with the following command:

# /usr/bin/mysqld_safe &

f) Change the initial password of the root user:

The password you are prompted to enter is the one in the /root/.mysql_secret file.
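One common way to set the new password is with mysqladmin; the new password shown is only a placeholder, and you will be prompted for the initial password from /root/.mysql_secret:

```
# mysqladmin -u root -p password 'NewRootPass123'
```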

g) Create the required databases in MySQL:

mysql> CREATE DATABASE scm;

mysql> CREATE DATABASE hive;

mysql> CREATE DATABASE rm;

mysql> CREATE DATABASE oozie;

h) In MySQL, authorize a single login user for the cluster:

mysql> CREATE USER 'cdh'@'%' IDENTIFIED BY '123456';

Query OK, 0 rows affected (0.00 sec)

mysql> GRANT ALL ON *.* TO 'cdh'@'%';

Query OK, 0 rows affected (0.00 sec)

13.  Create the required system users

Create the required users in each server system with the following command:

# useradd --system --home=/opt/cloudera-manager/cm-5.13.1/run/cloudera-scm-server --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm

* Note the version number (cm-5.13.1) in the path; it must be changed to match your version.

14.  Create directory

Use the following commands to create the required directories in each server system and authorize them:

# mkdir /opt/cloudera-manager

# mkdir -p /opt/cloudera/parcel-repo

# mkdir -p /opt/cloudera/parcels

# mkdir -p /var/log/cloudera-scm-headlamp

# mkdir -p /var/log/cloudera-scm-firehose

# mkdir -p /var/log/cloudera-scm-alertpublisher

# mkdir -p /var/log/cloudera-scm-eventserver

# mkdir -p /var/lib/cloudera-scm-headlamp

# mkdir -p /var/lib/cloudera-scm-firehose

# mkdir -p /var/lib/cloudera-scm-alertpublisher

# mkdir -p /var/lib/cloudera-scm-eventserver

# mkdir -p /var/lib/cloudera-scm-server

Authorization :

# chown cloudera-scm:cloudera-scm /opt/cloudera-manager

# chown -R cloudera-scm:cloudera-scm /var/log/cloudera-*

# chown -R cloudera-scm:cloudera-scm /var/lib/cloudera-*

# chown cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo

# chown cloudera-scm:cloudera-scm /opt/cloudera/parcels

15.  Install Cloudera Manager

a) Simply extract the downloaded Cloudera Manager tarball directly:

# tar xzf cloudera-manager*.tar.gz -C /opt/cloudera-manager

# chown cloudera-scm:cloudera-scm /opt/cloudera-manager

b) Copy the MySQL JDBC driver to the corresponding directories:

# mkdir -p /usr/share/java

# cp mysql-connector-java-5.1.38.jar /usr/share/java/mysql-connector-java.jar

# mkdir -p /usr/lib/hive/lib/

# cp mysql-connector-java-5.1.38.jar /usr/lib/hive/lib/mysql-connector-java.jar

16.  Install  JDK

Install the JDK downloaded above using the following command :

# rpm -ivh jdk-7u80-linux-x64.rpm

Configure the JAVA environment variables by adding the following to /etc/profile :

export JAVA_HOME=/usr/java/default

export CLASSPATH=.:$JAVA_HOME/lib

export PATH=$JAVA_HOME/bin:$PATH

Then use the following command to check whether the JDK was installed successfully:

# java -version

17.  Initialize the configuration database of CM (executed only on CM Server)

Go to the schema directory under the cm directory extracted in the previous step:

# cd /opt/cloudera-manager/cm-5.13.1/share/cmf/schema/

Execute the script scm_prepare_database.sh to initialize the database. The command format is:

./scm_prepare_database.sh -h MysqlHost -P MysqlPort dbType dbName dbUser dbPasswd

For example:
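An invocation consistent with the format above, assuming MySQL on slave1 (per the planning in step 4) and the cdh/123456 account created in step 12:

```
# ./scm_prepare_database.sh -h slave1 -P 3306 mysql scm cdh 123456
```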

18.  Configure Cloudera Manager Agent

Configure the CM Agent configuration file on each server:

1) Go to the directory where the configuration file is located:

# cd /opt/cloudera-manager/cm-5.13.1/etc/cloudera-scm-agent

2) Change the following line in the config.ini file:

Change server_host=localhost to the host where the CM Server is located, for example:
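For example, if the CM Server runs on the host master, the changed line in config.ini would read:

```
server_host=master
```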

19.  Move the Parcel files to the target directory (executed only on CM Server)

Move the 3 parcel files into parcel-repo with the following commands (the file names shown below are from a 5.5.0 download; substitute the names of your own CDH 5.13.1 parcel files):

# cp CDH-5.5.0-1.cdh5.5.0.p0.8-el6.parcel /opt/cloudera/parcel-repo/

# cp manifest.json /opt/cloudera/parcel-repo/

# cp CDH-5.5.0-1.cdh5.5.0.p0.8-el6.parcel.sha1 /opt/cloudera/parcel-repo/CDH-5.5.0-1.cdh5.5.0.p0.8-el6.parcel.sha

Note: the last file is renamed from .sha1 to .sha
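The effect of the copy and rename can be sketched with dummy files under /tmp (the file names here are placeholders, not real parcel names):

```shell
set -e
# stage dummy files standing in for the parcel, manifest, and checksum
mkdir -p /tmp/parcel-demo/parcel-repo
cd /tmp/parcel-demo
touch demo.parcel manifest.json demo.parcel.sha1
# the parcel and manifest are copied with their names unchanged
cp demo.parcel manifest.json parcel-repo/
# the checksum file must arrive in parcel-repo with a .sha extension, not .sha1
cp demo.parcel.sha1 parcel-repo/demo.parcel.sha
ls parcel-repo
```

If the checksum keeps its .sha1 name, CM will not recognize it and may try to re-download the parcel, which fails in an offline installation.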

20.  Start CM Server/Agent

Start  CM Server (  only needs to be executed on the  CM Server  server )

# /opt/cloudera-manager/cm-5.13.1/etc/init.d/cloudera-scm-server start

Start CM Agent (needs to be executed on every machine, including the CM Server):

# /opt/cloudera-manager/cm-5.13.1/etc/init.d/cloudera-scm-agent start

Startup takes about 10 minutes. After the startup is complete, you can begin the following operations.

The program startup and running logs can be found in the directories:

/opt/cloudera-manager/cm-5.13.1/log/cloudera-scm-server

/opt/cloudera-manager/cm-5.13.1/log/cloudera-scm-agent

21.  Log in to  CM Server  to complete the cluster installation

Note: on your local computer, open the hosts file in C:\Windows\System32\drivers\etc and add the IP mapping for master, so that the master address below can be reached.

Log in to the CM management interface, use the address :  http://master:7180/

The default user name and password are admin/admin

After logging in, scroll down and continue.

On the next page, choose the cm version to install (keep the default) and continue.

Choose to continue.

Select [Currently Managed Hosts]: it lists all the servers whose CM Agent has started and successfully connected to the CM Server. Select the servers to be installed and click Continue.

Select the corresponding CDH parcel version and click Continue.

If there is no Parcel to choose from, check step 19 and restart CM Server.

In this step, CM decompresses the parcel, distributes it to, and installs it on each selected server.

Check the host correctness results; if there is an error, fix it following the prompts, and click Finish.

Select the components to be installed, and you can customize the selection according to your needs :

Choose the role assignments:

According to the role planning in step 4: HDFS DataNode selects all hosts.

ZooKeeper follows the suggestion (an odd number of hosts, at least 3, should be selected): select all hosts.

Change the others as needed, paying attention to role balance and performance.

Cluster database settings:

Fill in the correct database, user, and password according to the databases created in step 12, then test the connection.

If a database does not exist, create it as described in step 12.

Set the cluster data storage directories: set the data storage directory of each component as required.

The cluster then installs, deploys, and starts the services.

Complete.


Origin blog.csdn.net/chunzhi128/article/details/124455241