Learning to build a CDH big data platform (5.10.0)

  It's the weekend. I had already opened WeGame to update LoL, which I hadn't touched in a long time, but I got tired of waiting and decided to write a blog post instead, since I have been learning CDH recently. Like an impulse purchase, I bought three cloud hosts without hesitating. So, without further ado, let's start building a CDH big data platform.

First, what is Hadoop?

  Hadoop is a distributed system infrastructure developed by the Apache Foundation. Users can develop distributed applications without understanding the underlying distributed details, and take full advantage of the cluster's high-speed computation and storage.

   Shortcomings: ① confusing version management

        ② cumbersome deployment and complicated upgrades

        ③ poor compatibility

        ④ weak security

Second, what are the commercial versions of Hadoop?

  Common ones come from Cloudera (CDH), Hortonworks (HDP), MapR, IBM, and Huawei; these companies all produce commercial distributions of Hadoop.

Third, an introduction to CDH

① Introduction

    My personal impression: CDH is a packaged, integrated environment for Hadoop.

② Why choose CDH?

    Cloudera publishes update versions every quarter and a release version every year, which is faster than the official Apache release pace, and CDH has also proven very stable in actual use.

    CDH supports four installation methods: yum/apt packages, tar packages, rpm packages, and Cloudera Manager. You get the latest features and bug fixes, installation and maintenance are easy, which saves operations time, and building a cluster is simpler as well.

     Advantages: 1) clear version division

         2) fast version updates

         3) supports Kerberos authentication

         4) clear documentation

         5) supports multiple installation methods (including Cloudera Manager)

③ Choosing a version

       CDH4.x---->4.8.6

      CDH5.x ----> preferred: 5.4.8, 5.8.0, 5.12.0; 5.11.0 is not recommended, it has pitfalls

      We use 5.10.0 here.

Fourth, preparation before setup

  Since this is a personal project, I bought three cloud hosts for testing; ideally each one has 8 GB of memory or more.

   

   The cluster nodes are planned as follows:

      Node ①: cm-agent, DataNode, SecondaryNameNode, NameNode, ZooKeeper

      Node ②: mysql, cm-server, cm-agent, NameNode, DataNode, ResourceManager, NodeManager, ZooKeeper

      Node ③: cm-agent, DataNode, NodeManager, ZooKeeper

Fifth, overview of the CDH installation

    As usual, let me first summarize the approach:

    ① Turn off the firewall, change the hostname, edit the hosts file, configure passwordless SSH login (all 3 hosts)

    ② Install the JDK and Python (2.6.6), check that the servers' time is synchronized (all 3 hosts)

    ③ Install MySQL: change the mysql user and group, set the my.cnf permissions, configure auto-start, start the service and check the process, change the password, edit the .bash_profile file (hadoop01)

    ④ Start the HTTP service: install the httpd service, create the parcels directory (hadoop01)

    ⑤ Install and start the CM service (hadoop01)

    ⑥ Log in to the CDH configuration interface to complete the cluster configuration.

Sixth, the actual installation

    Because these are clean systems, we install everything step by step from the very beginning.

① Turn off the firewall (all 3 hosts)
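
  The original commands for this step are not shown, so here is a minimal sketch. It assumes the hosts run CentOS 6 (suggested by the el6 packages used later in this post), where the firewall is the iptables service. The function name disable_firewall is only an illustrative wrapper.

```shell
# Sketch for CentOS 6, where the firewall is managed as the iptables service.
# disable_firewall is a hypothetical helper name, not part of any tool.
disable_firewall() {
    service iptables stop      # stop the running firewall now
    chkconfig iptables off     # keep it disabled after a reboot
}

# Usage (as root, on each of the 3 hosts):
#   disable_firewall
#   service iptables status    # should report that the firewall is not running
```

  On a test cluster this is the simplest option; on anything exposed to the internet, opening only the required ports would be the safer choice.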

  

② Change the hostname (remember to reboot after the change!!! all 3 hosts)

    vim /etc/sysconfig/network
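
    On CentOS 6 the hostname is read from the HOSTNAME= line of this file. As a rough sketch of the same edit without opening an editor, the hypothetical helper below rewrites that line; the names hadoop01 to hadoop03 are taken from the commands later in this post.

```shell
# set_hostname_line FILE NAME
# Replace the HOSTNAME= line in FILE, or append one if it is missing.
# (Hypothetical helper; FILE is normally /etc/sysconfig/network on CentOS 6.)
set_hostname_line() {
    file=$1
    name=$2
    if grep -q '^HOSTNAME=' "$file"; then
        sed -i "s/^HOSTNAME=.*/HOSTNAME=$name/" "$file"
    else
        echo "HOSTNAME=$name" >> "$file"
    fi
}

# Usage (as root), followed by a reboot:
#   set_hostname_line /etc/sysconfig/network hadoop01
```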

③ Edit the hosts file (all 3 hosts)

  vim /etc/hosts (because the internal IPs assigned to my hosts are not on the same network segment, I use the public IPs here)
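
  Every host needs the same three name-to-address mappings. The sketch below appends an entry only when the hostname is not listed yet, so it can be re-run safely. add_host_entry is a hypothetical helper, and the addresses are placeholders from the 192.0.2.0/24 documentation range, not the real cluster IPs.

```shell
# add_host_entry FILE IP NAME
# Append "IP NAME" to FILE unless NAME already has an entry.
# (Hypothetical helper; FILE is normally /etc/hosts.)
add_host_entry() {
    file=$1
    ip=$2
    name=$3
    if grep -qw "$name" "$file"; then
        echo "$name already present in $file"
    else
        echo "$ip $name" >> "$file"
    fi
}

# Usage (as root, on each host), with placeholder addresses:
#   add_host_entry /etc/hosts 192.0.2.11 hadoop01
#   add_host_entry /etc/hosts 192.0.2.12 hadoop02
#   add_host_entry /etc/hosts 192.0.2.13 hadoop03
```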

④ Configure passwordless SSH login (all 3 hosts)

  [root@hadoop01 ~]# ssh-keygen

  

 

  Then run the following commands in turn (on all 3 hosts):

  [root@hadoop01 ~]# ssh-copy-id root@hadoop01

  [root@hadoop01 ~]# ssh-copy-id root@hadoop02

  [root@hadoop01 ~]# ssh-copy-id root@hadoop03
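
  The three ssh-copy-id calls can also be written as a loop, with a quick check afterwards. This is only a sketch: the function names are made up for illustration, and the host list assumes the hostnames used above.

```shell
HOSTS="hadoop01 hadoop02 hadoop03"

# Copy this host's public key to every node, including itself.
copy_key_to_all() {
    for host in $HOSTS; do
        ssh-copy-id "root@$host"
    done
}

# Confirm that each node can be reached without a password prompt.
# BatchMode=yes makes ssh fail instead of asking for a password.
verify_passwordless() {
    for host in $HOSTS; do
        ssh -o BatchMode=yes "root@$host" hostname \
            || echo "passwordless login to $host is not working yet"
    done
}

# Usage (on each of the 3 hosts, after ssh-keygen):
#   copy_key_to_all
#   verify_passwordless
```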

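⑤ Install the JDK (all 3 hosts):

  I use JDK 1.8 here: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

  On that page, find the matching JDK version and download it.

  After the download finishes, upload the file to one of the hosts with the rz command (if the command is missing, install it with yum install lrzsz).

  Then copy it to the other two hosts with scp: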

  [root@hadoop01 software]# scp jdk-8u181-linux-x64.tar.gz root@hadoop02:/home/software/

  

  

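  Extract: [root@hadoop01 software]# tar -xvf jdk-8u181-linux-x64.tar.gz

  To make the environment variables easier to configure, rename the directory: [root@hadoop01 software]# mv jdk1.8.0_181 jdk1.8

  Configure the environment variables (remember to run source /etc/profile afterwards so the changes take effect):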

  export JAVA_HOME=/home/software/jdk1.8
  export PATH=.:$JAVA_HOME/bin:$PATH

  Use java -version to check that the JDK is configured correctly.
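
  Since the same two lines have to be added on all three hosts, a small helper can keep the edit repeatable. This is a sketch under assumptions: add_java_env is a hypothetical name, and it leaves out the leading "." that the lines above put on PATH, because having the current directory on PATH is generally considered risky.

```shell
# add_java_env FILE JAVA_HOME_DIR
# Append the JAVA_HOME and PATH exports to FILE once; do nothing if
# FILE already sets JAVA_HOME. (Hypothetical helper for illustration.)
add_java_env() {
    file=$1
    java_home=$2
    if grep -q '^export JAVA_HOME=' "$file"; then
        echo "JAVA_HOME already set in $file"
        return 0
    fi
    {
        echo "export JAVA_HOME=$java_home"
        echo 'export PATH=$JAVA_HOME/bin:$PATH'
    } >> "$file"
}

# Usage (as root), then reload the profile and check the result:
#   add_java_env /etc/profile /home/software/jdk1.8
#   source /etc/profile
#   java -version
```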

  

⑥ Check the Python version (all 3 hosts)

  python --version

  Note: 2.6.6 is recommended. If you use CDH 4.x, Python 2.7.x makes HDFS HA incompatible.

  If the virtual machine runs CentOS 7.x, use Python 2.7.x.
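
  To check all three hosts the same way, the version string can be classified with a small function. This is a sketch only; check_python_version is a hypothetical helper that encodes the 2.6.x recommendation from this step.

```shell
# check_python_version "VERSION STRING"
# Classify the text printed by `python --version` against the
# recommendation in this post (2.6.x). Hypothetical helper.
check_python_version() {
    case "$1" in
        "Python 2.6."*) echo "ok: $1" ;;
        "Python 2.7."*) echo "warning: $1 (see the CDH 4.x note above)" ;;
        *)              echo "unexpected: $1" ;;
    esac
}

# Usage on each host. Python 2 prints its version to stderr, hence 2>&1:
#   check_python_version "$(python --version 2>&1)"
```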

  

⑦ Check that the servers' time is synchronized (all 3 hosts)

  [root@hadoop01 software]# grep ZONE /etc/sysconfig/clock
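
  The command above only shows the configured time zone. To compare the actual clocks as well, one option is to query every node over SSH. This sketch relies on the passwordless login from step ④; show_cluster_time is a hypothetical helper name.

```shell
HOSTS="hadoop01 hadoop02 hadoop03"

# Print each node's time zone setting and clock (seconds since the epoch)
# on one line, so differences between nodes are easy to spot.
show_cluster_time() {
    for host in $HOSTS; do
        printf '%s: ' "$host"
        ssh "root@$host" 'grep ZONE /etc/sysconfig/clock; date +%s' | tr '\n' ' '
        echo
    done
}

# Usage (from any node):
#   show_cluster_time
```

  If the epoch values differ by more than a few seconds, the clocks should be synchronized (for example with NTP) before continuing.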

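⑧ Install MySQL (hadoop01)

(1) Install and extract

  Install the MySQL database on hadoop01. The version used here is mysql-5.6.23-linux-glibc2.5-x86_64.tar.gz; upload the MySQL package to the server, or download it from the official website.

  Extract the MySQL package: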

      tar xzvf mysql-5.6.23-linux-glibc2.5-x86_64.tar.gz

  After extraction, move the extracted directory to /usr/local (a fixed location) and rename it to mysql:

    [root@hadoop01 software]# mv mysql-5.6.23-linux-glibc2.5-x86_64 /usr/local/mysql

(2) Change the MySQL user and group

  Create a mysqladmin user in the dba group to own and run MySQL.

  Run in turn:

    cd   ~

    groupadd -g 101 dba

    useradd -u 514 -g dba -G root -d /usr/local/mysql mysqladmin

    id mysqladmin

    passwd mysqladmin (set the password for the mysqladmin user)

    cp /etc/skel/.* /usr/local/mysql (copy the shell startup files into the mysqladmin user's home directory)

  Create the MySQL configuration file

  Run:

    cd /etc/

    vim my.cnf

    Once in my.cnf, delete all of its contents and paste in the following configuration:

[client]
port            = 3306
socket          = /usr/local/mysql/data/mysql.sock
 
[mysqld]
port            = 3306
socket          = /usr/local/mysql/data/mysql.sock

skip-external-locking
key_buffer_size = 256M
sort_buffer_size = 2M
read_buffer_size = 2M
read_rnd_buffer_size = 4M
query_cache_size= 32M
max_allowed_packet = 16M
myisam_sort_buffer_size=128M
tmp_table_size=32M

table_open_cache = 512
thread_cache_size = 8
wait_timeout = 86400
interactive_timeout = 86400
max_connections = 600

thread_concurrency = 32


default-storage-engine = INNODB
transaction-isolation = READ-COMMITTED

server-id  = 1
basedir     = /usr/local/mysql
datadir     = /usr/local/mysql/data
pid-file     = /usr/local/mysql/data/hostname.pid


log-warnings
sysdate-is-now

binlog_format = MIXED
log_bin_trust_function_creators=1
log-error  = /usr/local/mysql/data/hostname.err
log-bin=/usr/local/mysql/arch/mysql-bin

innodb_data_home_dir = /usr/local/mysql/data/
innodb_data_file_path = ibdata1:500M:autoextend
innodb_log_group_home_dir = /usr/local/mysql/arch
innodb_log_files_in_group = 2
innodb_log_file_size = 200M


innodb_buffer_pool_size = 1024M
innodb_additional_mem_pool_size = 50M
innodb_log_buffer_size = 16M

innodb_lock_wait_timeout = 100
innodb_flush_log_at_trx_commit = 1
innodb_locks_unsafe_for_binlog=1

performance_schema
innodb_read_io_threads=4
innodb-write-io-threads=4
innodb-io-capacity=200
innodb_purge_threads=1
innodb_use_native_aio=on

innodb_file_per_table = 1
lower_case_table_names=1

[mysqldump]
quick
max_allowed_packet = 16M

[mysql]
no-auto-rehash

[mysqlhotcopy]
interactive-timeout

[myisamchk]
key_buffer_size = 256M
sort_buffer_size = 256M
read_buffer = 2M
write_buffer = 2M

(3) Change the ownership and permissions of my.cnf

  Run:

  chown  mysqladmin:dba /etc/my.cnf

  chmod  640 /etc/my.cnf

  chown -R mysqladmin:dba /usr/local/mysql

  chmod -R 755 /usr/local/mysql

  su - mysqladmin

  Afterwards, check the current path.

  Run: pwd

  Confirm that you are in /usr/local/mysql.

  Run: mkdir arch backup

  Run the initialization script: scripts/mysql_install_db  --user=mysqladmin --basedir=/usr/local/mysql --datadir=/usr/local/mysql/data (if the printed log shows no errors, it ran OK).

(4) Configure the MySQL service and auto-start

  As the root user, run: cd /usr/local/mysql

  cp  /usr/local/mysql/support-files/mysql.server /etc/rc.d/init.d/mysql

  chmod +x /etc/rc.d/init.d/mysql

  chkconfig --del mysql

  chkconfig --add mysql

  chkconfig --level 345 mysql on

Open the /etc/rc.local file.

Run: vim /etc/rc.local

Delete everything in it and paste in the following:

#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.

touch /var/lock/subsys/local

su - mysqladmin -c "/etc/init.d/mysql start --federated"

(5) Start MySQL and check the process

Run: su - mysqladmin

  mysqld_safe &

Press Enter once it has started.

Run: ps -ef|grep mysqld

Check that the mysql process is running.

Run: service mysql status

 

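(6) Change the MySQL password

Run: mysql

This opens the MySQL console.

Run: use mysql

update user set password=password('root') where user='root';

This sets both the MySQL account name and the password to root.

Run: select host,user,password from user;

Delete the rows with an empty user name.

Run: delete from user where user='';

Query again: select host,user,password from user;

The empty rows are gone.

Run: flush privileges;

Exit the MySQL console.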

(7) Edit the .bash_profile file

Go to the mysql directory and run vim ./.bash_profile,

then paste in the following:

# .bash_profile
# Get the aliases and functions

if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs
MYSQL_BASE=/usr/local/mysql
export MYSQL_BASE
PATH=${MYSQL_BASE}/bin:$PATH
export PATH

unset USERNAME

#stty erase ^H
# set umask to 022
umask 022
PS1=`uname -n`":"'$USER'":"'$PWD'":>"; export PS1

⑨ Install and start the HTTP service

(1) Install the httpd service

Switch to the root user.
Run: rpm -qa|grep httpd
yum install -y httpd
chkconfig --list|grep httpd
Output: httpd 0:off 1:off 2:off 3:off 4:off 5:off 6:off
Run: chkconfig httpd on
chkconfig --list|grep httpd
Output: httpd 0:off 1:off 2:on 3:on 4:on 5:on 6:off
Run: service httpd start

(2) Create the parcels directory

  Download: https://www.cloudera.com/downloads/cdh/5-10-0.html

  

Run: cd /var/www/html

  mkdir parcels

  cd parcels

Copy the following three files into the parcels directory:

CDH-5.10.0-1.cdh5.10.0.p0.41-el6.parcel.sha,

CDH-5.10.0-1.cdh5.10.0.p0.41-el6.parcel,

manifest.json
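
Before going further it can be worth confirming that the parcel was not corrupted in transit. The sketch below assumes the .sha file holds the parcel's SHA-1 hex digest as its first field (my understanding of the CDH 5 parcel layout, not something stated in this post); verify_parcel is a hypothetical helper.

```shell
# verify_parcel PARCEL_FILE
# Compare the SHA-1 of PARCEL_FILE with the hash stored in PARCEL_FILE.sha.
# Assumes the .sha file starts with the hex digest; hypothetical helper.
verify_parcel() {
    parcel=$1
    expected=$(awk '{print $1; exit}' "$parcel.sha")
    actual=$(sha1sum "$parcel" | awk '{print $1}')
    if [ "$expected" = "$actual" ]; then
        echo "OK: $parcel"
    else
        echo "MISMATCH: $parcel"
        return 1
    fi
}

# Usage:
#   cd /var/www/html/parcels
#   verify_parcel CDH-5.10.0-1.cdh5.10.0.p0.41-el6.parcel
```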

Run: mkdir /opt/rpminstall

  cd /opt/rpminstall

Copy the file cm5.10.0-centos6.tar.gz here.

Run: tar -xzvf cm5.10.0-centos6.tar.gz -C /var/www/html/

  cd /var/www/html

  ll

Output: total 8

drwxrwxr-x 3 1106  592 4096 Oct 27 10:09 cm

drwxr-xr-x 2 root root 4096 Apr  2 15:55 parcels 

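Create the same directory layout as the official site.

Run: mkdir -p cm5/redhat/6/x86_64/

  mv cm cm5/redhat/6/x86_64/

Configure the local yum repository.

Run: vi /etc/yum.repos.d/cloudera-manager.repo

Paste the following into the file. The IP address is that of the current machine; if the cluster is on an internal network, use the internal IP instead. This file must be configured on every server.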

[cloudera-manager]

name = Cloudera Manager, Version 5.10.0

baseurl = http://117.50.39.81/cm5/redhat/6/x86_64/cm/5/

gpgcheck = 0

Save and exit.
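
Because the same repo file is needed on every server, it may be easier to generate it than to paste it by hand. A minimal sketch, where write_cm_repo is a hypothetical helper and the host argument is whatever address serves the /cm5 directory:

```shell
# write_cm_repo FILE HOST
# Write a yum repo definition that points at the local HTTP mirror on HOST.
# Hypothetical helper; FILE is normally /etc/yum.repos.d/cloudera-manager.repo.
write_cm_repo() {
    file=$1
    host=$2
    cat > "$file" <<EOF
[cloudera-manager]
name = Cloudera Manager, Version 5.10.0
baseurl = http://$host/cm5/redhat/6/x86_64/cm/5/
gpgcheck = 0
EOF
}

# Usage (as root, on every server), then refresh yum's metadata:
#   write_cm_repo /etc/yum.repos.d/cloudera-manager.repo 117.50.39.81
#   yum clean all
#   yum makecache
```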

 

Open the following two URLs in a browser; if both pages load, the configuration succeeded (the IP below is the public IP):

http://117.50.39.81/parcels/

http://117.50.39.81/cm5/redhat/6/x86_64/cm/5/
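
The same check can be done from the command line, which is handy on hosts without a browser. This sketch assumes curl is installed; check_url is a hypothetical helper that treats only HTTP 200 as success.

```shell
# check_url URL
# Print the HTTP status code for URL and succeed only on 200.
check_url() {
    code=$(curl -s -o /dev/null -w '%{http_code}' "$1")
    echo "$1 -> $code"
    [ "$code" = "200" ]
}

# Usage, with the public IP from this post:
#   check_url http://117.50.39.81/parcels/
#   check_url http://117.50.39.81/cm5/redhat/6/x86_64/cm/5/
```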

 ⑩ Install the CM service

  Install the RPMs

Run: cd /var/www/html/cm5/redhat/6/x86_64/cm/5/RPMS/x86_64
yum install -y cloudera-manager-daemons-5.10.0-1.cm5100.p0.85.el6.x86_64.rpm
yum install -y cloudera-manager-server-5.10.0-1.cm5100.p0.85.el6.x86_64.rpm
The order matters, and only these two packages are installed.

 


Origin www.cnblogs.com/rmxd/p/11330325.html