CDH installation part one: Cloudera Manager installation and deployment

Overview of Cloudera Manager

Cloudera Manager is Cloudera's enterprise-grade big data management platform, used to monitor and manage the entire Hadoop (CDH) cluster environment.

Cloudera Manager is divided into:

  • Cloudera Manager Server: provides monitoring and management for the entire cluster. It manages the cluster through the Cloudera Manager Agents deployed on the individual hosts, and is itself deployed on a single host.
  • Cloudera Manager Agent: deployed on every host to be monitored and managed; collects operating data and executes the management commands issued by the Server.
  • Database: a relational database that stores the cluster state data produced by Cloudera Manager's management operations.

Installation requirements

  • The operating system used in this example is CentOS 7 x64. Select the installation packages that match your own operating system version.
  • Install as the root user. If you install as a non-root user, make sure that user has sudo privileges and that all related files are owned by that user.
  • Ensure that all hosts can reach each other over the network; in other words, the firewall and SELinux policies must allow full connectivity (both are disabled later in this guide).
  • All commands to be executed in this example are shown in command-line blocks.
  • Ensure that the JDK version is 1.8 or later.
  • Ensure that each device has at least 8 GB of memory.
  • Ensure that each device has more than 30 GB of free space in both /var and /usr, and that data storage disks are mounted as required (a quick pre-flight check is sketched after this list).
  • This example was tested with a minimal installation of CentOS 7 x64.
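The following snippet is a minimal pre-flight check for the requirements above (illustrative only; the thresholds mirror this guide's assumptions):

#!/bin/bash
# Check for at least 8 GB of RAM
mem_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
[ "$mem_kb" -ge $((8 * 1024 * 1024)) ] || echo "WARN: less than 8G of RAM"

# Check for more than 30 GB free under /var and /usr
for d in /var /usr; do
  free_gb=$(df -BG --output=avail "$d" | tail -1 | tr -dc '0-9')
  [ "$free_gb" -gt 30 ] || echo "WARN: $d has only ${free_gb}G free (need >30G)"
done

# Check the JDK version (the JDK is installed later in this guide)
java -version 2>&1 | head -1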

Installation example introduction

Server basic parameters

IP               Hostname   CPU   RAM    Disk   Role     Operating system
192.168.174.111  hdfs01     8C    128G   3T     server   CentOS 7 x64
192.168.174.112  hdfs02     8C    128G   3T     agent    CentOS 7 x64
192.168.174.113  hdfs03     8C    128G   3T     agent    CentOS 7 x64
192.168.174.114  hdfs04     8C    128G   3T     agent    CentOS 7 x64
192.168.174.115  hdfs05     8C    128G   3T     agent    CentOS 7 x64
192.168.174.116  hdfs06     8C    128G   3T     agent    CentOS 7 x64
192.168.174.117  hdfs07     8C    128G   3T     agent    CentOS 7 x64
192.168.174.118  hdfs08     8C    128G   3T     agent    CentOS 7 x64

For the CDH installation it is best to use the root user, to avoid all kinds of trouble caused by directory and file permissions. If you do want to use a non-root user, that user must have NOPASSWD sudo privileges, and the owner of all related files and folders must be set to that user, except for those modified individually.

The latest stable CDH parcel packages can be downloaded from the official archive: http://archive.cloudera.com/cdh5/parcels/latest/

The following three files need to be downloaded at this address:

        CDH-5.14.0-1.cdh5.14.0.p0.24-el7.parcel                  

        CDH-5.14.0-1.cdh5.14.0.p0.24-el7.parcel.sha1

        manifest.json
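Note: Cloudera Manager expects the parcel checksum file to end in .sha, so the downloaded .sha1 file is commonly renamed before it is deployed, for example:

        mv CDH-5.14.0-1.cdh5.14.0.p0.24-el7.parcel.sha1 CDH-5.14.0-1.cdh5.14.0.p0.24-el7.parcel.sha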

Cloudera Manager official download address: http://archive.cloudera.com/cm5/cm/5/

Download only one file at this address (the file needs to be selected according to your own system):

        cloudera-manager-centos7-cm5.14.1_x86_64.tar.gz

Official installation reference documentation: https://www.cloudera.com/documentation/enterprise/5-14-x/topics/installation_installation.html

Operating system related configuration

Turn off the firewall

The cluster nodes need to communicate with each other, and every such connection would otherwise require a matching firewall rule, so for simplicity we disable the firewall entirely. The commands to turn off the firewall (run as the root user):

# Check the firewall status
Centos6: [root@localhost ~]# service iptables status
Centos7: [root@localhost ~]# systemctl status firewalld.service
If the status shown is not "iptables: Firewall is not running.", the firewall needs to be stopped.

# Stop the firewall
Centos6:[root@localhost ~]# service iptables stop
Centos7:[root@localhost ~]# systemctl stop firewalld

# Disable the firewall permanently
Centos6:chkconfig iptables off
Centos7:systemctl disable firewalld.service

# Check the firewall status again
Centos6: [root@localhost ~]# service iptables status
Centos7:[root@localhost ~]# systemctl status firewalld.service
iptables: Firewall is not running.

Close SELinux

Access control on CentOS is enforced by SELinux. To avoid installation failures caused by permission issues, we disable it for now and re-enable it later as needed. The commands to turn off SELinux (run as the root user):

# Check the SELinux status
[root@localhost ~]# /usr/sbin/sestatus -v
SELinux status:                 enabled
If the SELinux status parameter is enabled, SELinux is on and needs to be disabled as follows.

# Disable SELinux
[root@localhost ~]# vim /etc/selinux/config
Find SELINUX in the file and set its value to disabled, i.e.:
SELINUX=disabled

# Disable SELinux in memory (for the current boot)
[root@localhost ~]# setenforce 0

# Check the in-memory status
[root@localhost ~]# getenforce
If the result is disabled or permissive, the operation succeeded.
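The config-file edit can also be done in one line (a sketch, assuming the stock file still reads SELINUX=enforcing):

[root@localhost ~]# sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config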

Configure yum source

This step sets up the operating system installation image (ISO) as a YUM repository so that additional components can be installed. Execute on all hosts (first upload the CentOS ISO installation file to the /opt folder).

1. Mount the operating system ISO file to the specified directory

[root@localhost ~]# mkdir /mnt/iso
[root@localhost ~]# mount -o loop /opt/CentOS-7-x86_64-DVD-1511.iso /mnt/iso

where CentOS-7-x86_64-DVD-1511.iso is the ISO file of CentOS 7.2.

2. Set up the yum source repo file

[root@localhost ~]# cd /etc/yum.repos.d
[root@localhost ~]# mkdir /opt/repo_bak;mv *.repo /opt/repo_bak
[root@localhost ~]# vi base.repo

Add the following code to the newly created base.repo file:

[base]
name=CentOS 7
baseurl=file:///mnt/iso
gpgcheck=0

3. Refresh yum

[root@localhost ~]# yum clean all
[root@localhost ~]# yum makecache
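A quick check that the local repository is now active (illustrative):

[root@localhost ~]# yum repolist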

Install related dependencies

[root@localhost ~]# yum -y install chkconfig bind-utils psmisc libxslt zlib sqlite cyrus-sasl-plain cyrus-sasl-gssapi fuse portmap fuse-libs redhat-lsb httpd httpd-tools unzip ntp

Start the httpd service

[root@localhost ~]# systemctl start httpd.service
[root@localhost ~]# systemctl enable httpd.service # start at boot

Configure NTP clock synchronization

Set up a unified clock synchronization service on all devices where the CDH environment will be installed. If a dedicated clock server exists, configure the NTP client on each device; if not, use the server host as the clock server: configure the NTP server on it and have the other hosts synchronize their clocks to it.

In this example the server host is configured as the NTP server and the other hosts as NTP clients. With a dedicated clock server this would be even simpler: every host would just be configured as an NTP client.

NTP server configuration (performed on the server host; if a dedicated clock server exists, the server host is configured as a client too)

Modify /etc/ntp.conf

Make the following modifications to the contents of the file (a sketch of the resulting file follows this list):

1. Comment out all configurations starting with restrict

2. Find restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap, uncomment it, and change the IP and mask to the real environment IP and mask. This line is configured to allow ntp client connections

3. Find server 0.centos.pool.ntp.org iburst and comment all server configurations

4. Add the following two lines

        server 127.127.1.0

        fudge 127.127.1.0 stratum 10
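After these edits, the core of the server's /etc/ntp.conf looks roughly like this (a sketch; adjust the subnet to your environment):

driftfile /var/lib/ntp/drift
# allow NTP clients from the cluster subnet
restrict 192.168.174.0 mask 255.255.255.0 nomodify notrap
# use the local clock as the time source
server 127.127.1.0
fudge 127.127.1.0 stratum 10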

Start the NTP service

Execute the following command to start the ntp service

[root@localhost ~]# systemctl restart ntpd

View service status

After the service starts, use ntpq -p to check its status. Once the reach column shows a reasonably large value (usually 17 or more; the field is octal and grows with each successful poll), configure the NTP clients.
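For reference, the output might look along these lines once the local clock source is in use (values are illustrative):

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*LOCAL(0)        .LOCL.          10 l   11   64   17    0.000    0.000   0.000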

NTP client configuration (configured on the agent host)

Modify /etc/ntp.conf

Make the following modifications to the file:

1. Comment out all restrict and server configurations

2. Add the following line, changing the IP below to the IP of your NTP server (in this example, the server host); a sketch of the resulting client file follows the list

        server 192.168.187.5
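For reference, the client's /etc/ntp.conf then reduces to roughly the following (a sketch; the driftfile line comes from the stock file and may already be present):

driftfile /var/lib/ntp/drift
# the cluster's NTP server (the server host in this example)
server 192.168.187.5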

Synchronize time manually

To avoid a slow first synchronization, and to verify that our configuration is correct, we first synchronize manually once with the following command.

[root@localhost ~]# ntpdate 192.168.187.5

Start the NTP service

[root@localhost ~]# systemctl restart ntpd

Set the ntp service of all hosts to start automatically at boot

centos6:[root@localhost ~]# chkconfig ntpd on
centos7:[root@localhost ~]# systemctl enable ntpd.service

Modify hostname

We modify the hostname partly to make hosts easier to remember and manage, but the more important reason is that Hadoop's internal mechanisms route to a host's IP through its hostname. We must therefore ensure that every machine's hostname is unique.

This example shows the operation only on the first server; the other servers are done the same way, but note that every host's hostname must be different. (It is recommended to name the servers consistently in the format hdfs1, hdfs2, hdfs3, ....)

The following gives the operation command to modify the host name (operate under ROOT):

Centos6:
[root@localhost ~]# vi /etc/sysconfig/network
Set the HOSTNAME value to the new hostname.

Centos7:
[root@localhost ~]# hostnamectl set-hostname hdfs1
[root@localhost ~]# hostname hdfs1

After executing the above commands, log out and log back in.

Note: hostnames cannot contain underscores (_) and cannot contain uppercase characters.

Set Host Routing (HOSTS)

There are two main reasons for modifying HOSTS:

1. Hadoop's internal mechanisms access hosts by hostname, so every hostname must resolve to the correct IP.

2. It makes things more convenient and clearer to read during configuration.

Note that configuring HOSTS does not just mean mapping the local IP to the local hostname: on every machine we must configure the IP-to-hostname mappings of all machines in the plan.

How to modify HOSTS:

The following is the operation to modify HOSTS (run as the root user):

Modify the /etc/hosts file, adding the mappings between the IP addresses and hostnames of all the hosts in the plan. Configure this on every machine.

[root@hdfs1 ~]# vi /etc/hosts

Add entries in the following format, one for each host in the plan; the same content must be added to the HOSTS file of every machine. The IP and the hostname are separated by a tab.

192.168.186.101 hdfs1

192.168.186.102 hdfs2

192.168.186.103 hdfs3

……

If we want multiple names to route to the same IP, we simply keep appending them on the same line, again separated by tabs. For example:

192.168.186.101    hdfs1      master     spark       hadoop
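Since the same entries must be present on every machine, the file can be pushed out from one node with a small loop (a sketch; assumes root SSH access to all hosts and the hostnames used in this guide):

for h in hdfs1 hdfs2 hdfs3 hdfs4 hdfs5 hdfs6 hdfs7 hdfs8; do
    scp /etc/hosts root@${h}:/etc/hosts   # prompts for a password unless SSH keys are set up
done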

Installation of relational database

MySQL installation

Installation packages used

        MySQL-client-5.6.26-1.el6.x86_64.rpm

        MySQL-devel-5.6.26-1.el6.x86_64.rpm

        MySQL-server-5.6.26-1.el6.x86_64.rpm

Uninstall the MySQL and mariadb packages that ship with CentOS

Execute the following two commands to view the pre-installed MySQL or mariadb on the system

[root@hdfs1 ~]# rpm -qa | grep  MySQL
[root@hdfs1 ~]# rpm -qa | grep  mariadb

Remove all installed packages found by the commands above with the following command

[root@hdfs1 ~]# rpm -e --nodeps (all packages found by the commands above, separated by spaces)

Use the following command to query all MySQL related files

[root@hdfs1 ~]# find / -name mysql

 Delete all files found by the above command

Install MySQL

Go to the directory of our MySQL installation package and execute the following command (I put all MySQL installation packages in the root user's home directory)

[root@hdfs1 ~]# rpm -ivh MySQL*

Start MySQL

centos6:[root@hdfs1 ~]# service mysql start
centos7:[root@hdfs1 ~]# systemctl start mysql.service

Obtain the initial password

During the installation of MySQL, we will see the following printed:

A RANDOM PASSWORD HAS BEEN SET FOR THE MySQL root USER! You will find that password in '/root/.mysql_secret'.

Find the randomly generated root password in that file with the following command:

[root@hdfs1 ~]# cat /root/.mysql_secret

Log in to MySQL's command console

[root@hdfs1 ~]# mysql -uroot -p<password>  # the random password found in the previous step

Change the password of the MySQL root user

Execute the following SQL in the MySQL command console to reset the password of the root user to 123456

SET PASSWORD FOR 'root'@'localhost' = PASSWORD('123456');

Grant remote access privileges to the MySQL root user

grant all on *.* to root@"%" identified by "123456";
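To make the new grant take effect immediately, flush the privilege tables:

flush privileges;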

Set MySQL to start at boot

centos6:[root@hdfs1 ~]# chkconfig mysql on
centos7:[root@hdfs1 ~]# systemctl enable mysql.service

Edit the /etc/my.cnf file (back it up before editing) and set the following parameters:

[mysqld]
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
# symbolic-links = 0

key_buffer = 16M
key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1

max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M

#log_bin should be on a disk with enough free space.
#Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your system
#and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log

binlog_format = mixed

read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M

# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M

[mysqld_safe]
log-error=/var/log/mariadb/mariadb.log
pid-file=/var/run/mariadb/mariadb.pid

Make sure symbolic-links remains commented out.

Initialize the database

[root@hdfs1 ~]# /usr/bin/mysql_secure_installation

Follow the prompts to complete the initialization. This step sets the root user's password; remember the password chosen here.

Then create the required databases and users. Execute the following SQL in the MySQL console:

create database amon DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
create database rman DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
create database metastore DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
create database sentry DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
create database nav DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
create database navms DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
create database hive DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
create database hue DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
create database monitor DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
create database oozie DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
grant all on *.* to 'root'@'%' identified by '123456' with grant option;
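A quick sanity check that the databases were created (run in the same console):

show databases;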

Installing the JDK

CDH requires a JDK 1.8 runtime environment, so JDK 1.8 must be installed before CDH. This example installs the JDK as the root user.

1. Download and upload the JDK 1.8 installation package

Upload the archive to any directory; this article uses the root user's home directory (~) as an example.

2. Extract to the installation directory

In this example the JDK is installed under /usr/local.

[root@hdfs1 ~]# tar -zxvf jdk-8u131-linux-x64.tar.gz -C /usr/local/

3. Configure environment variables

Add the extracted JDK directory to the environment variables.

[root@hdfs1 ~]# vi /etc/profile

Append the following at the end of the file:

export JAVA_HOME=/usr/local/jdk1.8.0_131

export PATH=$JAVA_HOME/bin:$PATH

4. Reload the environment variables

[root@hdfs1 ~]# source /etc/profile

5. Verify the installation

Run the following command in any directory:

[root@hdfs1 ~]# java -version

If Java version information is printed, the installation succeeded; if not, check that the path configured in the environment variables is correct.

Installing Cloudera Manager Server

Since this is the server installation, perform the following steps only on the server host.

Upload the installation packages

For the server installation we need only the following media:

Cloudera Manager package: cloudera-manager-centos7-cm5.12.1_x86_64.tar.gz

MySQL driver: mysql-connector-java-5.1.44-bin.jar

CDH offline parcel files:   CDH-5.12.1-1.cdh5.12.1.p0.3-el7.parcel

                                 CDH-5.12.1-1.cdh5.12.1.p0.3-el7.parcel.sha

                                 manifest.json

All of the above media were covered in the [Installation example introduction] section; in this example they are uploaded to the root user's home directory.

Create the installation directory and extract the installation media

[root@hdfs1 ~]# mkdir /opt/cloudera-manager
[root@hdfs1 ~]# tar xzf cloudera-manager*.tar.gz -C /opt/cloudera-manager

Install the database driver and initialize the database

Install the database driver

[root@hdfs1 ~]# mkdir -p /usr/share/java
[root@hdfs1 ~]# cp mysql-connector-java-5.1.44-bin.jar /usr/share/java/mysql-connector-java.jar

Initialize the database

Create the system user cloudera-scm

[root@hdfs1 ~]# useradd --system --home=/opt/cloudera-manager/cm-5.12.1/run/cloudera-scm-server --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm

Create the server storage directory

[root@hdfs1 ~]# mkdir /var/lib/cloudera-scm-server
[root@hdfs1 ~]# chown cloudera-scm:cloudera-scm /var/lib/cloudera-scm-server
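The Cloudera Manager database schema itself also needs to be prepared. For a tarball install this is typically done with the bundled scm_prepare_database.sh script; the following is a sketch (verify the script path inside your extracted tarball, substitute your own MySQL root password, and note that it creates a database, user, and password all named scm):

[root@hdfs1 ~]# /opt/cloudera-manager/cm-5.12.1/share/cmf/schema/scm_prepare_database.sh mysql -hlocalhost -uroot -p123456 --scm-host localhost scm scm scm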

Create the Hadoop offline parcel storage directory

[root@hdfs1 ~]# mkdir -p /opt/cloudera/parcels;
[root@hdfs1 ~]# chown cloudera-scm:cloudera-scm /opt/cloudera/parcels

Point the agent at the server

There are two ways to modify the file (both are just ordinary shell techniques; with strong shell skills there are certainly more).

Method 1:

[root@hdfs1 ~]# vi /opt/cloudera-manager/cm-5.12.1/etc/cloudera-scm-agent/config.ini
Set server_host to the hostname of the Cloudera Manager Server; for this example, that is the server host.

Method 2:

[root@hdfs1 ~]# sed -i "s/server_host=localhost/server_host=hdfs1/" /opt/cloudera-manager/cm-5.12.1/etc/cloudera-scm-agent/config.ini

Deploy the CDH offline parcel files

[root@hdfs1 ~]# mkdir -p /opt/cloudera/parcel-repo;
[root@hdfs1 ~]# chown cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo;
[root@hdfs1 ~]# mv CDH-5.12.1-1.cdh5.12.1.p0.3-el7.parcel CDH-5.12.1-1.cdh5.12.1.p0.3-el7.parcel.sha manifest.json /opt/cloudera/parcel-repo/

Start Cloudera Manager Server

[root@hdfs1 ~]# /opt/cloudera-manager/cm-5.12.1/etc/init.d/cloudera-scm-server start
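The first start can take a few minutes while the database schema is set up. One way to watch progress (the log path below assumes the tarball layout used in this guide):

[root@hdfs1 ~]# tail -f /opt/cloudera-manager/cm-5.12.1/log/cloudera-scm-server/cloudera-scm-server.log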

Start Cloudera Manager Agent

[root@hdfs1 ~]# /opt/cloudera-manager/cm-5.12.1/etc/init.d/cloudera-scm-agent start

Installing Cloudera Manager Agent

Perform the following steps on every server other than the server host to deploy the agent.

Upload the installation packages

For the agent installation we need only the following two media:

        Cloudera Manager package: cloudera-manager-centos7-cm5.12.1_x86_64.tar.gz

        MySQL driver: mysql-connector-java-5.1.44-bin.jar

Install the database driver

[root@hdfs1 ~]# mkdir -p /usr/share/java
[root@hdfs1 ~]# cp mysql-connector-java-5.1.44-bin.jar /usr/share/java/mysql-connector-java.jar

Create the installation directory and extract the installation media

[root@hdfs1 ~]# mkdir /opt/cloudera-manager
[root@hdfs1 ~]# tar xzf cloudera-manager*.tar.gz -C /opt/cloudera-manager

Create the system user cloudera-scm

[root@hdfs1 ~]# useradd --system --home=/opt/cloudera-manager/cm-5.12.1/run/cloudera-scm-server --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm

Create the Hadoop offline parcel storage directory

[root@hdfs1 ~]# mkdir -p /opt/cloudera/parcels;
[root@hdfs1 ~]# chown cloudera-scm:cloudera-scm /opt/cloudera/parcels

Point the agent at the server

There are two ways to modify the file (both are just ordinary shell techniques; with strong shell skills there are certainly more).

Method 1:

[root@hdfs1 ~]# vi /opt/cloudera-manager/cm-5.12.1/etc/cloudera-scm-agent/config.ini
Set server_host to the hostname of the Cloudera Manager Server; for this example, that is the server host.

Method 2:

[root@hdfs1 ~]# sed -i "s/server_host=localhost/server_host=hdfs1/" /opt/cloudera-manager/cm-5.12.1/etc/cloudera-scm-agent/config.ini

Start Cloudera Manager Agent

[root@hdfs1 ~]# /opt/cloudera-manager/cm-5.12.1/etc/init.d/cloudera-scm-agent start

Installing the Cloudera Management Service

Once the CDH server and agents are deployed, all remaining operations are done in the web console. The first thing to install is the Cloudera Management Service, the service that monitors the running state of all hosts and clusters managed by CDH, so installing it is essential.

Note, however, that it runs many processes and consumes a lot of memory; in production it must not share a machine with the cluster itself. My deployment rule is: the server host carries all Cloudera Manager related components (MySQL, Cloudera Manager Server, and all Cloudera Management Service roles), while all Hadoop cluster roles are assigned to the agent hosts.

Log in to the CDH management page (by default http://hdfs1:7180; initial credentials are admin/admin).

Click the Add Cloudera Management Service button.

On the role assignment page, assign all roles to the server host; the server host runs only the CDH management tools.

Click Continue to reach the database configuration page; after filling in the parameters, click Test Connection. If it shows Successful, you can proceed.

Click Continue to reach the alert delivery configuration page; if you do not need alerts, the defaults are fine.

Click Continue to reach the installation and startup page, and wait for startup to complete.

Click Continue.

Click Finish.

Installation summary
