Plan for this example:
Original cluster:
5 PCs
Operating System: RedHat Enterprise Server 6.5
root user and password: password
hadoop user and password: password
Nodes to be added to the cluster:
4 PCs
Operating System: RedHat Enterprise Server 6.5
root user and password: password
hadoop user and password: password
1. Add users
Add the hadoop user on each server in the cluster:
[root@hadoop006 ~]#useradd hadoop
[root@hadoop006 ~]#passwd hadoop
Note: run this on every new node.
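If you prefer to script this step, a minimal sketch run from hadoop001 (it assumes the new nodes' root passwords are known; you will be prompted for each one, since SSH trust is not yet configured; passwd --stdin is RHEL-specific):
# Create the hadoop user on every new node
for ip in 132.194.41.186 132.194.41.187 132.194.41.188 132.194.41.189; do
  ssh root@$ip 'useradd hadoop && echo password | passwd --stdin hadoop'
done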
2. Configure hostnames and the hosts file
On every node, add the new nodes to /etc/hosts:
[root@hadoop001 ~]#vim /etc/hosts
132.194.43.180 hadoop001
132.194.43.181 hadoop002
132.194.43.182 hadoop003
132.194.43.183 hadoop004
132.194.43.184 hadoop005 hivemysql
132.194.41.186 hadoop006
132.194.41.187 hadoop007
132.194.41.188 hadoop008
132.194.41.189 hadoop009
[root@hadoop001 ~]#vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=hadoop001
Note: the change takes effect after restarting the network service (service network restart). If that does not take effect, the only option is to reboot the server.
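The updated /etc/hosts must be identical on every node. A minimal sketch to push it from hadoop001 to the rest of the cluster (by IP, since the new names may not resolve everywhere yet):
# Distribute the hosts file to all other nodes
for ip in 132.194.43.181 132.194.43.182 132.194.43.183 132.194.43.184 132.194.41.186 132.194.41.187 132.194.41.188 132.194.41.189; do
  scp /etc/hosts root@$ip:/etc/hosts
done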
3. Configure SSH mutual trust
Configure SSH mutual trust among the root users of all servers, and likewise among the hadoop users. The following uses the root user as an example; the hadoop user is configured the same way.
[root@hadoop001 ~]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): [Enter key]
Enter passphrase (empty for no passphrase): [Enter key]
Enter same passphrase again: [Enter key]
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
cd:5a:2a:bb:4a:49:97:8a:2d:70:19:18:60:56:9c:78 root@hadoop001
The key's randomart image is:
(randomart image omitted)
[root@hadoop001 ~]# ssh-copy-id [email protected]
[email protected]'s password:
Now try logging into the machine, with "ssh '[email protected]'", and check in:
.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
[root@hadoop001 ~]# ssh 132.194.41.186
[root@hadoop006 ~]# exit
logout
Connection to 132.194.41.186 closed.
Note: repeat for each of the new nodes (132.194.41.186-189).
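Full mutual trust means every node can reach every other node without a password. A minimal sketch to run on each node, once as root and once as hadoop:
# Copy this node's public key to every node in the cluster (including itself)
for host in hadoop001 hadoop002 hadoop003 hadoop004 hadoop005 hadoop006 hadoop007 hadoop008 hadoop009; do
  ssh-copy-id $host
done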
4. Time synchronization
Note: do this on all new nodes.
[root@hadoop006 ~]#vim /etc/ntp.conf
[root@hadoop006 ~]#service ntpd start
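The guide does not show the ntp.conf contents. A minimal sketch, assuming hadoop001 serves as the cluster's time source (adjust to your own NTP topology); add to /etc/ntp.conf on each new node:
# Sync the new node against hadoop001 (assumed time source)
server hadoop001
Then enable the service at boot:
[root@hadoop006 ~]#chkconfig ntpd on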
5. Create data directories
This step is run as the root user. All nodes need these directories to store HDFS data and temporary data. The master nodes only need to create the directories; the data nodes must also mount an independent disk on each one. The number of directories depends on the number of mounted disks; in this example there are three.
[root@hadoop001 ~]#mkdir -p /data/{disk1,disk2,disk3}
[root@hadoop001 ~]#chown -R hadoop:hadoop /data/{disk1,disk2,disk3}
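Mounting is environment-specific. A hypothetical sketch on a data node, assuming the three data disks appear as /dev/sdb1, /dev/sdc1, and /dev/sdd1 (the device names are assumptions; check yours with fdisk -l):
# Mount one independent disk per data directory (hypothetical devices)
mount /dev/sdb1 /data/disk1
mount /dev/sdc1 /data/disk2
mount /dev/sdd1 /data/disk3
Add matching entries to /etc/fstab so the mounts survive a reboot, and run the chown again after mounting.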
6. Adjust system limits
[root@hadoop006 ~]#vim /etc/security/limits.conf, add at the end:
hadoop soft nofile 131072
hadoop hard nofile 131072
[root@hadoop006 ~]#vim /etc/security/limits.d/90-nproc.conf, add at the end:
hadoop soft nproc unlimited
hadoop hard nproc unlimited
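The new limits can be verified from a fresh session for the hadoop user; the output should show 131072 open files and unlimited processes:
[root@hadoop006 ~]#su - hadoop -c 'ulimit -n -u'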
7. Install and configure the software
7.1. Software package
Extract the beh package into the /opt directory and change the owner of /opt/beh to the hadoop user.
[root@hadoop001 ~]#scp /opt/beh.tar.gz root@hadoop006:/opt/
[root@hadoop006 ~]#tar -zxvf /opt/beh.tar.gz -C /opt
[root@hadoop006 ~]#chown -R hadoop:hadoop /opt/beh
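The remaining new nodes need the same package. A minimal loop sketch run from hadoop001, relying on the root mutual trust configured earlier:
# Distribute and unpack the package on the other new nodes
for host in hadoop007 hadoop008 hadoop009; do
  scp /opt/beh.tar.gz root@$host:/opt/
  ssh root@$host 'tar -zxf /opt/beh.tar.gz -C /opt && chown -R hadoop:hadoop /opt/beh'
done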
7.2. Modify environment variables
On each new node, add the following to the hadoop user's profile:
[root@hadoop006 ~]#vim /home/hadoop/.bash_profile, add:
source /opt/beh/conf/beh_env
source /opt/beh/conf/mysql_env
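A quick check that the profile loads, assuming beh_env exports HADOOP_HOME (implied by the start scripts used later; the variable name is an assumption):
[root@hadoop006 ~]#su - hadoop -c 'echo $HADOOP_HOME'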
7.3. Configuration files
7.3.1. Spark
## This file lists all worker nodes; generally add every datanode, one per line
[root@hadoop001 ~]#vim /opt/beh/core/spark/conf/slaves
hadoop003
hadoop004
hadoop005
hadoop006
hadoop007
hadoop008
hadoop009
7.3.2. HDFS
This step can be performed as the hadoop user. Adjust the values shown below to match your environment.
[hadoop@hadoop001 ~]$vim /opt/beh/core/hadoop/etc/hadoop/slaves
## This file lists the hostnames of all datanodes, one per line
## In this example the datanodes are hadoop003 through hadoop009
hadoop003
hadoop004
hadoop005
hadoop006
hadoop007
hadoop008
hadoop009
7.3.3. YARN
[hadoop@hadoop001 ~]$vim /opt/beh/core/hadoop/etc/hadoop/yarn-site.xml
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/data/disk1/tmp/yarn/local,/data/disk2/tmp/yarn/local,/data/disk3/tmp/yarn/local</value>
  <final>false</final>
</property>
These paths must match the disks actually mounted on each node.
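The edited slaves files and yarn-site.xml must be identical on every node. A minimal sketch to push them from hadoop001 as the hadoop user, relying on the mutual trust configured earlier:
# Sync the changed configuration files to the rest of the cluster
for host in hadoop002 hadoop003 hadoop004 hadoop005 hadoop006 hadoop007 hadoop008 hadoop009; do
  scp /opt/beh/core/hadoop/etc/hadoop/slaves /opt/beh/core/hadoop/etc/hadoop/yarn-site.xml $host:/opt/beh/core/hadoop/etc/hadoop/
  scp /opt/beh/core/spark/conf/slaves $host:/opt/beh/core/spark/conf/
done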
7.4. Network performance test
The test tool is netperf. hadoop001 acts as the server; the other nodes act as clients and measure the transfer rate to hadoop001. The test commands are as follows.
On hadoop001, start the netperf server:
[root@hadoop001 ~]#/opt/beh/mon/netperf/bin/netserver
On hadoop002-hadoop009, run the test against hadoop001's address (132.194.43.180 per /etc/hosts):
[root@hadoop002 ~]#/opt/beh/mon/netperf/bin/netperf -H 132.194.43.180 -t TCP_STREAM -l 10
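To cover every new node in one pass, the tests can be driven from hadoop001 over SSH; a minimal sketch:
# Run a 10-second TCP_STREAM test from each new node back to hadoop001
for host in hadoop006 hadoop007 hadoop008 hadoop009; do
  ssh root@$host '/opt/beh/mon/netperf/bin/netperf -H 132.194.43.180 -t TCP_STREAM -l 10'
done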
8. Start the services
[hadoop@hadoop001 ~]$cd /opt/beh/core/hadoop/bin
8.1. YARN
Execute on hadoop001:
[hadoop@hadoop001 ~]$start-yarn.sh
Start the resourcemanager on hadoop002:
[hadoop@hadoop002 ~]$yarn-daemon.sh start resourcemanager
Start the jobhistory server on hadoop001:
[hadoop@hadoop001 ~]$mr-jobhistory-daemon.sh start historyserver
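Once everything is up, the four new NodeManagers should register with the ResourceManager; a quick check (yarn is assumed to be on the hadoop user's PATH via beh_env):
[hadoop@hadoop001 ~]$yarn node -list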