Building a Big Data Platform with CDH (5.10.0)

  After going back and forth for a long time, I finally spent the money on three 4-core, 8 GB Alibaba Cloud hosts (cherish them while they last), and officially started the process of building a Hadoop cluster with CDH, described below.

  The following installation process is relatively long, so be patient.

I. CDH Introduction

  The official website has a full introduction; my personal take is that CDH is essentially Hadoop packaged together with its surrounding environment.

II. Why Choose CDH?

  Cloudera publishes quarterly updates and an annual Release version, faster than the official Apache release cycle, and in actual use CDH performs very stably.

  CDH supports four installation methods: yum/apt packages, tar packages, rpm packages, and Cloudera Manager. You get the latest features and the latest bug fixes, installation and maintenance are easy, and operations save time. It also makes building a cluster simpler.

  • A clear versioning scheme
  • Fast version updates
  • Support for Kerberos security authentication
  • Clear documentation
  • Support for multiple installation methods (Cloudera Manager mode)

III. CDH Version Selection

  CDH 4.x: prefer 4.8.6.

  CDH 5.x: prefer 5.4.8, 5.8.0, or 5.12.0; 5.11.0 is not recommended because it has known pitfalls (I use 5.10.0 here).

IV. Installation Preparation

1. Node preparation

  Since this is a personal test environment, I purchased three Alibaba Cloud hosts, each with 4 cores and 8 GB of memory.

  

2. Node Planning

  hadoop001: mysql, cm-server, cm-agent, NameNode, DataNode, ResourceManager, NodeManager, ZK

  hadoop002: cm-agent, DataNode, SecondaryNameNode, NameNode, ZK

  hadoop003: cm-agent, DataNode, NodeManager, ZK

3. Download the parcel files

    Address: http://archive.cloudera.com/cdh5/parcels/

    Select 5.10.0: http://archive.cloudera.com/cdh5/parcels/5.10.0/

    Download the following three files:

      ①http://archive.cloudera.com/cdh5/parcels/5.10.0/CDH-5.10.0-1.cdh5.10.0.p0.41-el6.parcel

      ②http://archive.cloudera.com/cdh5/parcels/5.10.0/CDH-5.10.0-1.cdh5.10.0.p0.41-el6.parcel.sha1

      ③http://archive.cloudera.com/cdh5/parcels/5.10.0/manifest.json

4. Download the tarball

  Address: http://archive.cloudera.com/cm5/repo-as-tarball

  Select 5.10.0: http://archive.cloudera.com/cm5/repo-as-tarball/5.10.0/

  Download: http://archive.cloudera.com/cm5/repo-as-tarball/5.10.0/cm5.10.0-centos6.tar.gz

V. Official Installation

1. Turn off the firewall (all 3 nodes)

  Temporarily stop it: service iptables stop    Verify: service iptables status

  Permanently disable it: chkconfig iptables off    Verify: chkconfig --list | grep iptables

  

2. Configure the hostname (all 3 nodes)

  Execute the command: vim /etc/sysconfig/network

  After the modification is complete, reboot: reboot
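  For reference, a minimal sketch of what /etc/sysconfig/network might look like on the first node (the HOSTNAME value is whatever you chose for that machine; hadoop01 here is only an example):

NETWORKING=yes
HOSTNAME=hadoop01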

  

3. Modify the hosts file (all 3 nodes)

  Execute the command: vim /etc/hosts

  Add the following content (identical on all three nodes); I use the internal network IPs here.
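  For example, a sketch of the entries (the IP addresses below are placeholders; substitute each node's real internal IP):

172.16.0.101 hadoop01
172.16.0.102 hadoop02
172.16.0.103 hadoop03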

  

4. Configure passwordless SSH login (all 3 nodes)

  Run: ssh-keygen

  Then execute: ssh-copy-id root@hadoop01, ssh-copy-id root@hadoop02, ssh-copy-id root@hadoop03
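  To confirm that the passwordless login works, a quick check (assuming the hostnames in /etc/hosts resolve) is:

  ssh root@hadoop02 hostname    # should print hadoop02 without asking for a password
  ssh root@hadoop03 hostname    # should print hadoop03 without asking for a password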

5. Install the JDK (all 3 nodes)

  I had already downloaded the JDK locally and uploaded it with the rz command (install it via yum install lrzsz).

  Extract it: [root@hadoop03 java]# tar -xvf jdk-8u181-linux-x64.tar.gz

  Configure the environment variables: [root@hadoop03 java]# vim /etc/profile

  Configure the following:     

# JDK environment variable configuration
export JAVA_HOME=/usr/java/jdk1.8.0_181   # the path here must be /usr/java, otherwise CDH will fail to start!!!
export PATH=$JAVA_HOME/bin:$PATH

  Make the configuration take effect: [root@hadoop03 java]# source /etc/profile

  Copy the JDK tarball to the other nodes with scp: [root@hadoop03 java]# scp jdk-8u181-linux-x64.tar.gz root@hadoop01:/usr/java/

  Finally, run java -version to check that the JDK was installed successfully.

6. Check the Python version (all 3 nodes)

  Execute the command: python --version

  Note: 2.6.6 is recommended. If the CDH version you use is 4.x, Python 2.7.x will cause HDFS HA incompatibility.

    If the virtual machines run CentOS 7.x, use the Python 2.7.x version that ships with it.

7. Check that the time is synchronized across the servers (all 3 nodes)

  Execute the command: grep ZONE /etc/sysconfig/clock (all nodes should show the Shanghai time zone)
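  If the clocks drift between nodes, a simple way to check and force a one-off sync (a sketch; it assumes ntpdate is available, and the NTP server is only an example):

  date                       # compare the output across the three nodes
  yum install -y ntpdate     # if ntpdate is not installed yet
  ntpdate ntp.aliyun.com     # one-off synchronization against an example NTP server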

8. Install MySQL (hadoop01 node)

8.1 Upload and extract the package

  The database version used here is mysql-5.6.23-linux-glibc2.5-x86_64.tar.gz. Upload the MySQL package to the server, or download it from the official website.

  Extract the MySQL package: tar xzvf mysql-5.6.23-linux-glibc2.5-x86_64.tar.gz

  After extraction, move the extracted directory to /usr/local (a fixed location) and rename it to mysql: mv mysql-5.6.23-linux-glibc2.5-x86_64 /usr/local/mysql

8.2 Create the MySQL user and group

  Create the dba group and the mysqladmin user that will run MySQL by executing the following commands:

  [root@hadoop01 software]# cd ~

  [root@hadoop01 ~]# groupadd -g 101 dba

  [root@hadoop01 ~]# useradd -u 514 -g dba -G root -d /usr/local/mysql mysqladmin

  [root@hadoop01 ~]# id mysqladmin (check the user)

  [root@hadoop01 ~]# passwd mysqladmin (set the password)

8.3 Copy the profile template files into the mysqladmin user's home directory

  Execute the command: cp /etc/skel/.* /usr/local/mysql

8.4 Create the MySQL configuration file

  Execute the following commands:

  [root@hadoop01 ~]# cd /etc/

  [root@hadoop01 etc]# vim my.cnf

  Once inside my.cnf, delete everything in the file and paste in the following configuration:

[client]
port            = 3306
socket          = /usr/local/mysql/data/mysql.sock
 
[mysqld]
port            = 3306
socket          = /usr/local/mysql/data/mysql.sock

skip-external-locking
key_buffer_size = 256M
sort_buffer_size = 2M
read_buffer_size = 2M
read_rnd_buffer_size = 4M
query_cache_size= 32M
max_allowed_packet = 16M
myisam_sort_buffer_size=128M
tmp_table_size=32M

table_open_cache = 512
thread_cache_size = 8
wait_timeout = 86400
interactive_timeout = 86400
max_connections = 600

thread_concurrency = 32


default-storage-engine = INNODB
transaction-isolation = READ-COMMITTED

server-id  = 1
basedir     = /usr/local/mysql
datadir     = /usr/local/mysql/data
pid-file     = /usr/local/mysql/data/hostname.pid


log-warnings
sysdate-is-now

binlog_format = MIXED
log_bin_trust_function_creators=1
log-error  = /usr/local/mysql/data/hostname.err
log-bin=/usr/local/mysql/arch/mysql-bin

innodb_data_home_dir = /usr/local/mysql/data/
innodb_data_file_path = ibdata1:500M:autoextend
innodb_log_group_home_dir = /usr/local/mysql/arch
innodb_log_files_in_group = 2
innodb_log_file_size = 200M


innodb_buffer_pool_size = 1024M
innodb_additional_mem_pool_size = 50M
innodb_log_buffer_size = 16M

innodb_lock_wait_timeout = 100
innodb_flush_log_at_trx_commit = 1
innodb_locks_unsafe_for_binlog=1

performance_schema
innodb_read_io_threads=4  
innodb-write-io-threads=4
innodb-io-capacity=200
innodb_purge_threads=1
innodb_use_native_aio=on

innodb_file_per_table = 1
lower_case_table_names=1

[mysqldump]
quick
max_allowed_packet = 16M

[mysql]
no-auto-rehash

[mysqlhotcopy]
interactive-timeout

[myisamchk]
key_buffer_size = 256M
sort_buffer_size = 256M
read_buffer = 2M
write_buffer = 2M

8.5 Change the ownership and permissions of my.cnf

  Execute the following commands in order:

  [root@hadoop01 etc]# chown mysqladmin:dba /etc/my.cnf
  [root@hadoop01 etc]# chmod 640 /etc/my.cnf
  [root@hadoop01 etc]# chown -R mysqladmin:dba /usr/local/mysql
  [root@hadoop01 etc]# chmod -R 755 /usr/local/mysql
  [root@hadoop01 etc]# su - mysqladmin
  [mysqladmin@hadoop01 ~]$ pwd (check the current directory after switching users)
  /usr/local/mysql
  [mysqladmin@hadoop01 ~]$ mkdir arch backup
  [mysqladmin@hadoop01 ~]$ scripts/mysql_install_db --user=mysqladmin --basedir=/usr/local/mysql --datadir=/usr/local/mysql/data (run the initialization script; if the printed log shows no errors, it ran OK)

   

8.6 Configure the MySQL service and auto-start

  Execute the following as the root user:

  [mysqladmin@hadoop01 ~]$ su root
  Password:
  [root@hadoop01 mysql]#
  [root@hadoop01 mysql]# cd /usr/local/mysql
  [root@hadoop01 mysql]# cp /usr/local/mysql/support-files/mysql.server /etc/rc.d/init.d/mysql
  [root@hadoop01 mysql]# chmod +x /etc/rc.d/init.d/mysql
  [root@hadoop01 mysql]# chkconfig --del mysql
  [root@hadoop01 mysql]# chkconfig --add mysql
  [root@hadoop01 mysql]# chkconfig --level 345 mysql on
  [root@hadoop01 mysql]# vim /etc/rc.local (delete everything in the file and copy in the following content)

#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.

touch /var/lock/subsys/local

su - mysqladmin -c "/etc/init.d/mysql start --federated"

8.7 Start MySQL and check the process

  Execute the following commands:

  [root@hadoop01 mysql]# su - mysqladmin
  [mysqladmin@hadoop01 ~]$ mysqld_safe &
  [1] 1888

  Open a new connection and run:

  ps -ef | grep mysqld (check whether the MySQL process is running)

  service mysql status (check MySQL's running status)

  If the process is present and the service status reports running, the startup is OK.

8.8 Change the MySQL password

  Log in to the mysql client and execute the following commands:
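  If you are not already inside the mysql client, a minimal way to connect first (a sketch: it assumes the freshly initialized root account still has an empty password, and uses the socket path from the my.cnf above):

  mysql -uroot -S /usr/local/mysql/data/mysql.sock
  # after the password change and flush below, reconnect with: mysql -uroot -proot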

  mysql> use mysql

  mysql> update user set password=password('root') where user='root';

  mysql> select host,user,password from user;

  mysql> delete from user where user='';

  mysql> flush privileges;

  

8.9 Update the .bash_profile file

 Go into the mysql directory, run vim ./.bash_profile, and copy in the following content:

  [root@hadoop01 mysql]# cd /usr/local/mysql/

  [root@hadoop01 mysql]# vim .bash_profile

# .bash_profile
# Get the aliases and functions

if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs
MYSQL_BASE=/usr/local/mysql
export MYSQL_BASE
PATH=${MYSQL_BASE}/bin:$PATH
export PATH

unset USERNAME

#stty erase ^H
# set umask to 022
umask 022
PS1=`uname -n`":"'$USER'":"'$PWD'":>"; export PS1
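  After saving the file, the new PATH only takes effect in a fresh login shell; a quick way to apply and verify it (a sketch):

  su - mysqladmin
  which mysql        # should resolve to /usr/local/mysql/bin/mysql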

9. Install and start the HTTP service

9.1 Install the httpd service

  Switch to the root user:

  [root@hadoop01 mysql]# rpm -qa|grep httpd
  [root@hadoop01 mysql]# yum install -y httpd

  [root@hadoop01 mysql]# chkconfig --list|grep httpd

  The output shows: httpd           0:off   1:off   2:off   3:off   4:off   5:off   6:off

  [root@hadoop01 mysql]# chkconfig httpd on

  [root@hadoop01 mysql]# chkconfig --list|grep httpd

  The output shows: httpd           0:off   1:off   2:on    3:on    4:on    5:on    6:off

  [root@hadoop01 mysql]# service httpd start
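  To confirm that httpd is actually serving content, a quick check (a sketch; it assumes curl is installed):

  [root@hadoop01 mysql]# service httpd status
  [root@hadoop01 mysql]# curl -I http://localhost/    # any HTTP response header (200, 403, ...) means the server is up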

9.2 Create the parcels directory

  Execute the following commands:

  [root@hadoop01 mysql]# cd /var/www/html

  [root@hadoop01 html]#  mkdir parcels

  Upload the three files downloaded at the beginning into this folder:

  ①http://archive.cloudera.com/cdh5/parcels/5.10.0/CDH-5.10.0-1.cdh5.10.0.p0.41-el6.parcel

  ②http://archive.cloudera.com/cdh5/parcels/5.10.0/CDH-5.10.0-1.cdh5.10.0.p0.41-el6.parcel.sha1

  ③http://archive.cloudera.com/cdh5/parcels/5.10.0/manifest.json

  [root@hadoop01 parcels]# mkdir /opt/rpminstall

  [root@hadoop01 parcels]# cd /opt/rpminstall

  Upload the downloaded tarball cm5.10.0-centos6.tar.gz into the current directory.

  Extract it: [root@hadoop01 rpminstall]# tar -xzvf cm5.10.0-centos6.tar.gz -C /var/www/html/

  [root@hadoop01 rpminstall]# cd /var/www/html
  [root@hadoop01 html]# ll

  

  Create the same directory structure as on the official site:

  [root@hadoop01 html]# mkdir -p cm5/redhat/6/x86_64/
  [root@hadoop01 html]# mv cm cm5/redhat/6/x86_64/

9.3 Configure the local yum repository (all 3 nodes)

  [root@hadoop01 ~]# vi /etc/yum.repos.d/cloudera-manager.repo

  Paste in the following content. The IP address is the address of the machine serving the repository; if the cluster is on an internal network, the internal IP is fine. This file must be created on every server. Save and exit.

[cloudera-manager]
name = Cloudera Manager, Version 5.10.0
baseurl = http://10.9.9.27/cm5/redhat/6/x86_64/cm/5/
gpgcheck = 0
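  To verify that a node can see the new repository, a quick check (a sketch) is:

  yum clean all
  yum repolist | grep cloudera-manager    # the cloudera-manager repository should appear in the list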

  Check in a browser whether the following two URLs load; if they do, the configuration succeeded (the IPs below are public IPs):

  http://39.100.73.64/parcels/

  

  http://39.100.73.64/cm5/redhat/6/x86_64/cm/5/

  

10. Install the CM service

  Execute the following commands:

  [root@hadoop01 ~]# cd /var/www/html/cm5/redhat/6/x86_64/cm/5/RPMS/x86_64

  [root@hadoop01 x86_64]#  yum install -y cloudera-manager-daemons-5.10.0-1.cm5100.p0.85.el6.x86_64.rpm

  [root@hadoop01 x86_64]#   yum install -y cloudera-manager-server-5.10.0-1.cm5100.p0.85.el6.x86_64.rpm

  The order must not be reversed, and install only these two packages.

  [root@hadoop01 x86_64]# mkdir /usr/share/java

  [root@hadoop01 x86_64]# cd /usr/share/java/

  Upload mysql-connector-java.jar into this directory; the jar file must be named exactly mysql-connector-java.jar.
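  If the connector you downloaded carries a version number in its file name, a sketch of putting it in place (the version shown is only an example):

  [root@hadoop01 java]# cp mysql-connector-java-5.1.47-bin.jar /usr/share/java/mysql-connector-java.jar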

  Log in to MySQL and create the metadata databases.

  Execute the following commands:

  [root@hadoop01 java]# su - mysqladmin 

  hadoop01:mysqladmin:/usr/local/mysql:>cd bin

  

  After entering the database, execute the following commands:

  mysql>  create database cmf DEFAULT CHARACTER SET utf8;

  mysql> grant all on cmf.* TO 'cmf'@'%' IDENTIFIED BY 'root';

  mysql> create database amon DEFAULT CHARACTER SET utf8;

  mysql> grant all on amon.* TO 'amon'@'%' IDENTIFIED BY 'root';

  mysql> grant all privileges on *.* to 'root'@'%' identified by 'root' with grant option;

  mysql> flush privileges;  

  Switch to the root user:

  [root@hadoop01 ~]# cd /etc/cloudera-scm-server/

  [root@hadoop01 cloudera-scm-server]# vi db.properties (configure it as described below)

The items to set, from top to bottom, are: the database type, the ip:port of the host where the database lives, the database name, the database user, and the database setup type.

These values must match the configuration you created earlier.

After configuring, save and exit.
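  Since the screenshot is not reproduced here, a sketch of what /etc/cloudera-scm-server/db.properties might look like with the values created above (the property names are the standard Cloudera Manager 5 ones; the host and password are whatever you configured):

com.cloudera.cmf.db.type=mysql
com.cloudera.cmf.db.host=hadoop01:3306
com.cloudera.cmf.db.name=cmf
com.cloudera.cmf.db.user=cmf
com.cloudera.cmf.db.password=root
com.cloudera.cmf.db.setupType=EXTERNAL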

11. Start the CM service

  Execute the following command:

  [root@hadoop01 jdk1.8.0_181]# service cloudera-scm-server start

  

 Check the logs:

  [root@hadoop01 jdk1.8.0_181]# cd /var/log/cloudera-scm-server/
  [root@hadoop01 cloudera-scm-server]# tail -f cloudera-scm-server.log

  If no errors appear in the log, the startup succeeded.

  

12. Log in to the CDH configuration page

  Open http://39.100.73.64:7180 (this IP is the public IP); the username and password are both admin. The web-based configuration process now begins. (You need to go into the Alibaba Cloud console and open port 7180 on the public IP.)

12.1 Choose the free edition

12.2 Configure the CDH cluster

12.3 Click Search

  If this page appears, the cluster hosts can be reached. The "Currently Managed" column should all show No; if any show Yes, a previous installation was not cleanly removed, and you need to uninstall it completely and restart the services before returning to this page.

12.4 Configure the parcel files

12.4.1 Click More Options

12.4.2 Configure the Remote Parcel Repository URLs

  On this page, delete all entries in the Remote Parcel Repository URLs field except one, and change it to the parcel repository address set up earlier (here http://39.100.73.64/parcels/), then click Save.

12.4.3 Select the version and the custom repository

12.4.4 Do not check the JDK installation option

12.4.5 Do not check single user mode

12.4.6 Set the host login password

12.4.7 Agent installation

  Wait for the agents to install. Various problems can appear at this step; click the details of the failing host to see what went wrong, fix it, restart the service, and reinstall.

12.5 When the installation finishes, click Continue

12.6 When the distribution finishes, click Continue

12.7 Wait for the host inspection

12.8 Warnings appear here; resolve them as follows

The transparent huge pages setting and the swappiness value need to be changed.

Disable transparent huge pages on every machine.

Execute the following commands on every node:

  [root@hadoop01 cloudera-scm-server]# echo never > /sys/kernel/mm/transparent_hugepage/defrag
  [root@hadoop01 cloudera-scm-server]# echo never > /sys/kernel/mm/transparent_hugepage/enabled
  [root@hadoop01 cloudera-scm-server]# echo 'echo never > /sys/kernel/mm/transparent_hugepage/defrag'>> /etc/rc.local
  [root@hadoop01 cloudera-scm-server]# echo 'echo never > /sys/kernel/mm/transparent_hugepage/enabled'>> /etc/rc.local  

  [root@hadoop01 cloudera-scm-server]# echo 'vm.swappiness = 10' >> /etc/sysctl.conf

  [root@hadoop01 cloudera-scm-server]# sysctl -p
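  To confirm the changes took effect on a node, a quick check (a sketch):

  cat /sys/kernel/mm/transparent_hugepage/enabled    # should show: always madvise [never]
  cat /sys/kernel/mm/transparent_hugepage/defrag     # should show: always madvise [never]
  cat /proc/sys/vm/swappiness                        # should print 10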

12.9 After running the commands above, click Re-run

12.10 Validation is now complete; click the Finish button

12.11 Cluster setup

Choose Custom Services; here I install HDFS, YARN, and ZooKeeper. After checking them, click Continue.

12.12 Role assignment

Based on the node plan made earlier, choose which roles go on which nodes (this is my plan, for reference only; adjust it to your own needs), then click Continue.

12.13 Choose the database

Fill in the amon database created earlier, click Test Connection, and once the test succeeds, click Continue.

12.14 Review changes

Leave everything at the defaults and click Continue.

12.15 First Run

Here the selected services are installed and started according to our settings. Wait for this final run to finish, then click Continue.

12.16 When the following screen appears, the CDH big data platform has been built successfully!

12.17 Enter the home page

 This is the management UI after installation. From here you can install additional services through the interface, such as HBase and Spark, and view the running status of each node.

  Finally, thank you for reading this far. If you have any questions, leave a comment and we can discuss them together!
