CDH 5.13.1 Installation Documentation
Table of contents
1. Installation introduction
2. Download the required installation files
3. Confirm the host installation environment
4. Planning cluster deployment
5. Modify the host name
6. Change the host HOSTS mapping file
7. Check host services
8. Check the host NTP service configuration
9. Check the host parameter configuration
10. Confirm the Python environment
11. Install dependent packages
12. Install the configuration database required by the cluster and create users and databases
13. Create the required system users
14. Create directories
15. Install Cloudera Manager
16. Install JDK
17. Initialize the configuration database of CM (executed only on CM Server)
18. Configure the Cloudera Manager Agent
19. Move the Parcel files to the target directory (executed only on CM Server)
20. Start CM Server/Agent
21. Log in to CM Server to complete the cluster installation
1. Installation introduction
This article describes how to install CDH, using the latest release, CDH 5.13.1, as an example.
CDH supports both automatic online installation and manual offline installation. The method described in this article is completely offline and is suitable for both test and production environments. The corresponding chapter of the official installation documentation is:
Installation Path C - Manual Installation Using Cloudera Manager Tarballs
2. Download the required installation files
The offline installation of CDH requires the following files :
CDH depends on Python: https://www.python.org/ftp/python/2.7.11/Python-2.7.11.tgz
JDK: http://download.oracle.com/otn/java/jdk/7u80-b15/jdk-7u80-linux-x64.rpm
MySQL 5.6:
http://dev.mysql.com/get/Downloads/MySQL-5.6/MySQL-5.6.29-1.el6.x86_64.rp
CM:
http://archive.cloudera.com/cm5/cm/5/cloudera-manager-el6-cm5.5.0_x86_64.tar.gz
Parcel:
https://archive.cloudera.com/cdh5/parcels/5.5.0/CDH-5.5.0-1.cdh5.5.0.p0.8-el6.parcel
https://archive.cloudera.com/cdh5/parcels/5.5.0/CDH-5.5.0-1.cdh5.5.0.p0.8-el6.parcel.
https://archive.cloudera.com/cdh5/parcels/5.13.1/manifest.json
Mysql JDBC:
http://central.maven.org/maven2/mysql/mysql-connector-java/5.1.38/mysql-connecto
An ISO image of the operating system being installed, used to resolve missing dependencies.
3. Confirm the host installation environment
1) Confirm that the hardware of each server meets the requirements:
Disk space: /var: 5 GB; /usr: 500 MB; CDH installation directory: 2 GB (check with: df -h)
Memory: 4 GB (check with: free -m)
2) Check the system version:
Check command: cat /etc/issue , required version: Red Hat Enterprise Linux Server release 6.x
3) Check the server data storage space:
CDH recommends that every server use the same data storage path(s); there can be more than one per server.
4. Planning cluster deployment
CDH recommends at least 3 servers for a cluster deployment. This installation uses 6 servers, planned as follows:

Host name | Role | IP
Master | Hadoop Master, CM Server, Data Node | xxxxx
Slave1 | Data Node, MySQL, CM Agent | xxxxx
Slave2 | Data Node, CM Agent | xxxxx
Slave3 | Data Node, CM Agent | xxxxx
Slave4 | Data Node, CM Agent | xxxxx
Slave5 | Data Node, CM Agent | xxxxx
5. Modify the host name
Change the host name of each host as required. The method is:
1) Change the host name dynamically with the command: hostname <new name>, e.g.: hostname master
2) Change the system file so the new name survives a reboot:
Edit the file /etc/sysconfig/network,
change the line HOSTNAME=*** to HOSTNAME=master, and save.
Change the host names of all other hosts in the same way.
6. Change the host HOSTS mapping file
Write the IP addresses and host names of all cluster servers into the /etc/hosts file on every server, because hosts are selected by host name throughout the installation below. For example, add entries like the following to /etc/hosts on each server:
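The example entries did not survive extraction; a sketch of what each server's /etc/hosts might contain, using the host names from the planning table in step 4 and hypothetical private IP addresses:

```
192.168.1.10  master
192.168.1.11  slave1
192.168.1.12  slave2
192.168.1.13  slave3
192.168.1.14  slave4
192.168.1.15  slave5
```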
7. Check host services
1) Stop the system firewall service command:
# service iptables stop
# chkconfig iptables off
2) Disable SELinux
Check whether SELinux is enabled with the following command:
# getenforce
If the output is Enforcing, SELinux is enabled; disable it with the following operations:
a) Change the system configuration file: /etc/sysconfig/selinux
Change the line SELINUX=enforcing to: SELINUX=disabled
b) Restart the system: reboot
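As a sketch, the check and the permanent change can also be done from the command line (run as root; getenforce and setenforce are the standard SELinux utilities on RHEL 6):

```
# getenforce
# setenforce 0
# sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/sysconfig/selinux
```

setenforce 0 turns enforcement off immediately for the running system, and the sed edit is equivalent to the manual file change above, so the reboot is only needed if setenforce is skipped.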
8. Check the host NTP service configuration
1) A distributed cluster requires the servers' clocks to be synchronized; check whether the ntpd service is running:
a)# service ntpd status
b)# ntpq -p Use this command to check whether NTP has a correctly configured time server
2) If the ntpd service is not running or not configured, set up time synchronization as follows:
a) Add the line "server <time server IP>" to /etc/ntp.conf, e.g.:
b) Restart the ntp service :
# service ntpd restart
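The example line is missing from the text; assuming the site's time server sits at a hypothetical address 192.168.1.1, the line added to /etc/ntp.conf would look like:

```
server 192.168.1.1
```

After restarting ntpd, ntpq -p should list that server.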
9. Check the host parameter configuration:
1) Set the core parameters of vm.swappiness :
Append line in /etc/sysctl.conf file :
vm.swappiness = 0
To make the parameters take effect, execute the command:
# sysctl -p
2) To set hugepage related parameters, execute the following command:
# echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
And add this command to the /etc/rc.local file to make it take effect when the system restarts.
10. Confirm the Python environment
1) Check whether Python is installed, and which version, with the following command:
# python -V
Python 2.6.x or 2.7.x is required. If Python is missing or the version is wrong, install the package downloaded in step 2:
a) Create the installation directory :
# mkdir /usr/local/python27
b) Unzip the Python-2.7.11.tgz installation file downloaded in step 2 :
# tar -xvf Python-2.7.11.tgz
c) Enter the decompression directory, and then execute the following command to compile and install:
# cd Python-2.7.11
# ./configure --prefix=/usr/local/python27
Edit the Modules/Setup file, changing the line
#zlib zlibmodule.c -I$(prefix)/include -L$(exec_prefix)/lib -lz
to:
zlib zlibmodule.c -I$(prefix)/include -L$(exec_prefix)/lib -lz
Removing the leading comment marker adds the zlib module to the build; it is needed by the impala-shell command.
# make
# make install
d) Repoint the system python to the new version:
# mv /usr/bin/python /usr/bin/python_old
# ln -s /usr/local/python27/bin/python /usr/bin/python
e) Run python -V again to verify that the installation succeeded
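The version check in step 1) can also be scripted. The helper below is not part of the original document, just a sketch that matches the output of python -V against the accepted 2.6.x/2.7.x versions:

```shell
# Return "ok" when the given "python -V" output names an accepted
# version (2.6.x or 2.7.x, per the requirement above), else "bad".
check_py() {
  case "$1" in
    "Python 2.6."*|"Python 2.7."*) echo ok ;;
    *) echo bad ;;
  esac
}

check_py "Python 2.7.11"   # prints: ok
check_py "Python 3.4.1"    # prints: bad
```

Note that Python 2 prints its version on stderr, so a real check would call check_py "$(python -V 2>&1)".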
11. Install dependent packages
Use the following command to check whether the dependent system packages are installed:
# rpm -q chkconfig python bind-utils psmisc libxslt zlib sqlite cyrus-sasl-plain cyrus-sasl-gssapi
fuse fuse-libs redhat-lsb
Any packages that are not installed can be added with yum or manually with rpm. For example, installing from a local yum repository:
a) Mount the system iso package into a virtual device :
# mount -o loop rhel-server-6.3-x86_64-dvd.iso /iso
b) Make the iso as a local YUM source
Remove other sources of YUM :
# rm -rf /etc/yum.repos.d/*.repo
To add the source, put the following content into the file /etc/yum.repos.d/rhel-source.repo (the prompt line below merely displays the file and is not part of it):
[root@localhost Packages]# cat /etc/yum.repos.d/rhel-source.repo
[rhel-source]
name=Red Hat Enterprise Linux $releasever - $basearch - Source
baseurl=file:///iso
enabled=1
gpgcheck=0
Install the missing dependencies, for example:
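The original example command did not survive extraction; a sketch, substituting whichever packages rpm -q reported as not installed:

```
# yum install -y psmisc libxslt cyrus-sasl-plain cyrus-sasl-gssapi fuse fuse-libs redhat-lsb
```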
12. Install the configuration database required by the cluster and create users and databases
Cluster installation requires a relational database to store configuration data. CDH supports Oracle, MySQL,
PostgreSQL, and an embedded database. This installation uses MySQL. The process is:
a) Unzip the downloaded MYSQL installation file :
# tar -xvf MySQL-5.6.29-1.el6.x86_64.rpm-bundle.tar
b) Uninstall any previously installed MySQL packages:
c) Install MySQL:
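The uninstall and install commands are missing from the text; a sketch for RHEL 6, assuming the usual package names from the MySQL 5.6 RPM bundle unpacked in the current directory:

```
# rpm -qa | grep -i mysql
# rpm -e --nodeps <each package found above>
# rpm -ivh MySQL-server-5.6.29-1.el6.x86_64.rpm MySQL-client-5.6.29-1.el6.x86_64.rpm
```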
d) The initial password of the MySQL root user is stored in the file /root/.mysql_secret .
e) Use the following command to start MYSQL first:
# /usr/bin/mysqld_safe &
f) Change the initial password of the root user :
The password you are prompted for is the one stored in /root/.mysql_secret
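The command itself is missing from the text; one way that works in MySQL 5.6 is the SET PASSWORD statement, shown here with a placeholder new password:

```
# mysql -u root -p
mysql> SET PASSWORD = PASSWORD('123456');
```

Enter the password from /root/.mysql_secret at the login prompt; '123456' is only a placeholder.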
g) Create the required database in MYSQL :
mysql> CREATE DATABASE scm;
mysql> CREATE DATABASE hive;
mysql> CREATE DATABASE rm ;
mysql> CREATE DATABASE oozie;
h) In MySQL, create and authorize a unified login user for the cluster:
mysql> CREATE USER 'cdh'@'%' IDENTIFIED BY '123456';
Query OK, 0 rows affected (0.00 sec)
mysql> GRANT ALL ON *.* TO 'cdh'@'%';
Query OK, 0 rows affected (0.00 sec)
13. Create the required system users
Create the required users in each server system with the following command:
# useradd --system --home=/opt/cloudera-manager/cm-5.13.1/run/cloudera-scm-server
--no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm
* Note the version number (cm-5.13.1) in the home directory path; change it to match the version actually being installed
14. Create directories
Use the following commands to create the required directories in each server system and authorize them:
# mkdir /opt/cloudera-manager
# mkdir -p /opt/cloudera/parcel-repo
# mkdir -p /opt/cloudera/parcels
# mkdir -p /var/log/cloudera-scm-headlamp
# mkdir -p /var/log/cloudera-scm-firehose
# mkdir -p /var/log/cloudera-scm-alertpublisher
# mkdir -p /var/log/cloudera-scm-eventserver
# mkdir -p /var/lib/cloudera-scm-headlamp
# mkdir -p /var/lib/cloudera-scm-firehose
# mkdir -p /var/lib/cloudera-scm-alertpublisher
# mkdir -p /var/lib/cloudera-scm-eventserver
# mkdir -p /var/lib/cloudera-scm-server
Authorization :
# chown cloudera-scm:cloudera-scm /opt/cloudera-manager
# chown -R cloudera-scm:cloudera-scm /var/log/cloudera-*
# chown -R cloudera-scm:cloudera-scm /var/lib/cloudera-*
# chown cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo
# chown cloudera-scm:cloudera-scm /opt/cloudera/parcels
15. Install Cloudera Manager
a) Simply extract the downloaded Cloudera Manager tarball:
# tar xzf cloudera-manager*.tar.gz -C /opt/cloudera-manager
# chown cloudera-scm:cloudera-scm /opt/cloudera-manager
b) Copy the MySQL JDBC driver to the corresponding directories:
# mkdir -p /usr/share/java
# cp mysql-connector-java-5.1.38.jar /usr/share/java/mysql-connector-java.jar
# mkdir -p /usr/lib/hive/lib/
# cp mysql-connector-java-5.1.38.jar /usr/lib/hive/lib/mysql-connector-java.jar
16. Install JDK
Install the JDK downloaded above using the following command :
# rpm -ivh jdk-7u80-linux-x64.rpm
Configure the JAVA environment variables by adding the following to /etc/profile:
export JAVA_HOME=/usr/java/default
export CLASSPATH=.:$JAVA_HOME/lib
export PATH=$JAVA_HOME/bin:$PATH
Then check whether the JDK was installed successfully with the following command:
# java -version
17. Initialize the configuration database of CM (executed only on CM Server)
Go to the schema directory under the cm directory extracted in the previous step:
# cd /opt/cloudera-manager/cm-5.5.0/share/cmf/schema/
(Adjust the version number in the path to the version actually installed, e.g. cm-5.13.1.)
Execute the script scm_prepare_database.sh to initialize the database. Command format:
./scm_prepare_database.sh -h MysqlHost -P MysqlPort dbType dbName dbUser dbPasswd
For example:
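The original example is missing; a sketch using the scm database and the cdh user created in step 12, and assuming MySQL runs on slave1 (per the planning table) on the default port 3306:

```
# ./scm_prepare_database.sh -h slave1 -P 3306 mysql scm cdh 123456
```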
18. Configure the Cloudera Manager Agent
Configure the CM Agent configuration file for each server :
1) Go to the directory where the configuration file is located:
# cd /opt/cloudera-manager/cm-5.5.0/etc/cloudera-scm-agent
2) Change the following line in the config.ini file:
server_host=localhost
replacing localhost with the host name of the machine running CM Server, e.g.: server_host=master
19. Move the Parcel files to the target directory (executed only on CM Server)
Move the 3 parcel files into parcel-repo with the following commands :
# cp CDH-5.5.0-1.cdh5.5.0.p0.8-el6.parcel /opt/cloudera/parcel-repo/
# cp manifest.json /opt/cloudera/parcel-repo/
# cp CDH-5.5.0-1.cdh5.5.0.p0.8-el6.parcel.sha1 /opt/cloudera/parcel-repo/CDH-5.5.0-1.cdh5.5.0.p0.8-el6.parcel.sha
Note: the third copy renames the file from .sha1 to .sha
20. Start CM Server/Agent
Start CM Server ( only needs to be executed on the CM Server server )
# /opt/cloudera-manager/cm-5.5.0/etc/init.d/cloudera-scm-server start
Start CM Agent ( need to be executed on each machine, including CM Server) :
# /opt/cloudera-manager/cm-5.5.0/etc/init.d/cloudera-scm-agent start
Startup takes about 10 minutes. Once it completes, the following operations can begin.
The startup and runtime logs can be found in the directories:
/opt/cloudera-manager/cm-5.5.0/log/cloudera-scm-server
/opt/cloudera-manager/cm-5.5.0/log/cloudera-scm-agent
21. Log in to CM Server to complete the cluster installation
Note: on the local computer, open the hosts file in C:\Windows\System32\drivers\etc and add the IP-to-hostname entry for master, so the master host name can be used for access.
Log in to the CM management interface at: http://master:7180/
The default username and password are admin/admin
After logging in, scroll down and continue:
On the next page, choose the CM edition to install (keep the default) and continue:
Click Continue again:
Select 【Currently Managed Hosts】; it lists all servers whose CM Agent has started and successfully connected to the CM Server. Select the servers to install on and click Continue:
Select the corresponding CDH parcel version and click Continue :
If there is no Parcel to choose from, check step 19 and restart CM Server
In this step, CM decompresses the parcel, distributes it to each selected server, and installs it:
Check host correctness; fix any reported errors as prompted, then click Finish:
Select the components to be installed, and you can customize the selection according to your needs :
Choose a role assignment :
According to the role planning in step 4: for HDFS DataNode, select all hosts
For ZooKeeper, follow the suggestion (at least 3 hosts, and an odd number, should be selected): select all hosts
Change the other roles as needed, keeping role balance and performance in mind:
Cluster database settings :
Based on the databases created in step 12, fill in the correct database, user, and password, then test the connection:
If a database does not exist, create it as described in step 12
Cluster data storage directory setting, set the data storage directory of each component as required:
The cluster starts to install, deploy, and start services:
Complete.