1. Building a Hadoop 3.x cluster
1.1 Server configuration
One master node: centos100 (192.168.65.128)
Two child (slave) nodes: centos101 (192.168.65.129), centos102 (192.168.65.130)
1.2 Configure the master node's hostname
On 192.168.65.128, edit the network file:
vi /etc/sysconfig/network
Add:
NETWORKING=yes
HOSTNAME=centos100
1.3 Configure the two child nodes' hostnames
On 192.168.65.129:
vi /etc/sysconfig/network
Add:
NETWORKING=yes
HOSTNAME=centos101
On 192.168.65.130:
vi /etc/sysconfig/network
Add:
NETWORKING=yes
HOSTNAME=centos102
1.4 Configure hosts
Open the hosts file on the master node, comment out its first two lines (they describe only the current host), and add entries for every host in the Hadoop cluster:
vi /etc/hosts
Add:
192.168.65.128 centos100
192.168.65.129 centos101
192.168.65.130 centos102
After saving, copy the master node's hosts file to the two child nodes:
scp /etc/hosts root@192.168.65.129:/etc/
scp /etc/hosts root@192.168.65.130:/etc/
Then run /bin/hostname <hostname> on each node (alternatively, reboot the server instead of running this command).
For example, running /bin/hostname centos100 on the master makes its new hostname take effect.
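The hostname step above can be sketched as a tiny helper that derives the name to assign from the node's IP, mirroring the mapping in section 1.1. pick_hostname is a helper name invented here for illustration:

```shell
# Map a node's IP to the hostname it should be assigned (mapping from 1.1).
pick_hostname() {
  # $1: the node's IP address; prints the hostname to assign
  case "$1" in
    192.168.65.128) echo centos100 ;;
    192.168.65.129) echo centos101 ;;
    192.168.65.130) echo centos102 ;;
    *) echo unknown; return 1 ;;
  esac
}

# On each node, as root (substitute that node's actual IP):
#   /bin/hostname "$(pick_hostname 192.168.65.128)"
```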
1.5 Configure passwordless SSH access
1.5.1 Generate the key pairs
1. On every node, run:
ssh-keygen -t rsa
Press Enter at each prompt until generation finishes.
Afterwards the /root/.ssh/ directory on each node contains two new files, id_rsa and id_rsa.pub;
the former is the private key and the latter the public key.
2. On the master node, run (inside /root/.ssh/):
cp id_rsa.pub authorized_keys
1.5.2 Generate authorized_keys
There is more than one way to arrange this step; here authorized_keys is built on the master node. The end goal is that the /root/.ssh/authorized_keys file on every node contains the public keys of all the nodes.
1. Copy the public keys of the two child nodes to the master node. Run on centos101 and centos102, respectively:
scp /root/.ssh/id_rsa.pub root@centos100:/root/.ssh/id_rsa_centos101.pub
scp /root/.ssh/id_rsa.pub root@centos100:/root/.ssh/id_rsa_centos102.pub
2. Then, on the master node, merge the two copied public key files into authorized_keys:
cat id_rsa_centos101.pub >> authorized_keys
cat id_rsa_centos102.pub >> authorized_keys
3. Finally, test the configuration (logins into the child nodes will only succeed once authorized_keys has been copied to them in step 1.5.3).
On centos100, run:
ssh centos101
ssh centos102
If each command logs you straight into the child node's shell, it works. Test the same way from each child node (logging in to the master and the other child); if every node can log in to the others without a password, the configuration succeeded.
1.5.3 Copy authorized_keys to the child nodes
On the master node, use scp to copy the authorized_keys file to the corresponding location on each child node:
scp authorized_keys root@centos101:/root/.ssh/
scp authorized_keys root@centos102:/root/.ssh/
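The key merge from 1.5.1-1.5.2 can be sketched as a single step on the master, after the child keys have arrived: authorized_keys ends up holding the public keys of all three nodes. merge_keys is a helper name invented here, and the key directory is a parameter so the snippet can be tried outside /root/.ssh:

```shell
# Build authorized_keys from the master's key plus the two copied child keys.
merge_keys() {
  # $1: directory holding the key files (normally /root/.ssh)
  cat "$1/id_rsa.pub" \
      "$1/id_rsa_centos101.pub" \
      "$1/id_rsa_centos102.pub" > "$1/authorized_keys"
}

# On the master node:
#   merge_keys /root/.ssh
```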
1.6 Install the JDK
1.6.1 Uninstall the preinstalled JDK
1. List the JDK packages already installed on the system:
rpm -qa | grep jdk
2. Uninstall them (the package names will vary by system):
rpm -e --nodeps copy-jdk-configs-3.3-2.el7.noarch
rpm -e --nodeps java-1.8.0-openjdk-headless-1.8.0.161-2.b14.el7.x86_64
rpm -e --nodeps java-1.7.0-openjdk-1.7.0.171-2.6.13.2.el7.x86_64
rpm -e --nodeps java-1.8.0-openjdk-1.8.0.161-2.b14.el7.x86_64
rpm -e --nodeps java-1.7.0-openjdk-headless-1.7.0.171-2.6.13.2.el7.x86_64
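Since the package names differ from system to system, the uninstall can also be written as one pipeline instead of several hand-typed rpm -e lines. jdk_filter is a helper invented here for illustration; the commented xargs line does the actual removal and needs root:

```shell
# Keep only JDK-related package names from an `rpm -qa` listing.
jdk_filter() {
  grep -i 'jdk'
}

# On each node, as root:
#   rpm -qa | jdk_filter | xargs -r rpm -e --nodeps
```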
1.6.2 Install the JDK
All three machines need the JDK installed.
1. Create the installation directory:
cd /opt/
mkdir java
cd java
2. Download the JDK archive and upload it into the new directory (rz opens an upload dialog when lrzsz is installed):
rz
3. Extract the JDK: tar -zxvf jdk-8u73-linux-x64.gz
4. Configure the environment variables:
vi /etc/profile
Add the following at the end of the profile file:
export JAVA_HOME=/opt/java/jdk1.8.0_73
export JRE_HOME=$JAVA_HOME/jre
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib
5. Make the environment variables take effect:
source /etc/profile
6. Test whether the installation succeeded: java -version
1.7 Install Hadoop
Install Hadoop on the master host first.
The installation location is up to you; here it goes under the /usr/tools directory.
1. Download the Hadoop package and place it in /usr/tools.
2. Extract Hadoop:
tar -zxvf hadoop-3.0.0.tar.gz
This creates a hadoop-3.0.0 directory under /usr/tools.
3. Configure the environment variables:
vi /etc/profile
At the end, add:
export HADOOP_HOME=/usr/tools/hadoop-3.0.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
4. Make the environment variables take effect:
source /etc/profile
1.8 Configure Hadoop
1.8.1 Hadoop configuration files
The files to configure live in /usr/tools/hadoop-3.0.0/etc/hadoop; the ones that need changes are the following:
hadoop-env.sh
yarn-env.sh
core-site.xml
hdfs-site.xml
mapred-site.xml
yarn-site.xml
workers
Among them, hadoop-env.sh and yarn-env.sh need the JDK environment variable added:
1. In hadoop-env.sh, add:
export JAVA_HOME=/opt/java/jdk1.8.0_73
2. In yarn-env.sh (in Hadoop 3.x this file no longer needs these settings):
export HADOOP_YARN_USER=${HADOOP_YARN_USER:-yarn}
export YARN_CONF_DIR="${YARN_CONF_DIR:-$HADOOP_YARN_HOME/conf}"
export JAVA_HOME=/opt/java/jdk1.8.0_73
3. In core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://centos100:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/usr/temp</value>
</property>
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
</configuration>
4. In hdfs-site.xml:
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>centos100:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.web.ugi</name>
<value>supergroup</value>
</property>
</configuration>
5. In mapred-site.xml (if only mapred-site.xml.template exists, first run cp mapred-site.xml.template mapred-site.xml; Hadoop 3.x ships mapred-site.xml directly):
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>centos100:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>centos100:19888</value>
</property>
</configuration>
6. In yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>centos100:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>centos100:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>centos100:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>centos100:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>centos100:8088</value>
</property>
</configuration>
7. In the workers file:
centos100
centos101
centos102
1.8.2 Copy the Hadoop installation to the child nodes
On the master node, run:
scp -r /usr/tools/hadoop-3.0.0 root@centos101:/usr/tools
scp -r /usr/tools/hadoop-3.0.0 root@centos102:/usr/tools
Copy the profile file to the child nodes as well.
On the master node, run:
scp /etc/profile root@centos101:/etc/
scp /etc/profile root@centos102:/etc/
Then make the new profile take effect on each of the two child nodes:
source /etc/profile
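The copy steps of 1.8.2 can be folded into one loop over the child nodes. This is a sketch, not part of the original procedure: distribute is a helper name invented here, and setting DRY_RUN=1 just prints each command, which is handy as a sanity check before the real copies run.

```shell
# Copy the Hadoop tree and /etc/profile to every child node.
distribute() {
  local node cmd
  for node in centos101 centos102; do
    for cmd in \
        "scp -r /usr/tools/hadoop-3.0.0 root@$node:/usr/tools" \
        "scp /etc/profile root@$node:/etc/"; do
      if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$cmd"    # preview mode: print instead of executing
      else
        $cmd
      fi
    done
  done
}

# On the master node:
#   DRY_RUN=1 distribute   # preview the commands
#   distribute             # perform the copies
```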
1.8.3 Format the NameNode on the master node
Enter the hadoop-3.0.0 directory on the master node,
then run:
./bin/hadoop namenode -format
On newer versions, where the hadoop form of this command is deprecated, use:
./bin/hdfs namenode -format
A "successfully formatted" message in the output indicates that formatting succeeded.
1.8.4 Start Hadoop
In the hadoop-3.0.0 directory on the master node, run:
./sbin/start-all.sh
jps on the master node should now show 6 processes:
DataNode
Jps
SecondaryNameNode
NameNode
ResourceManager
NodeManager
jps on each child node should show 3 processes:
Jps
DataNode
NodeManager
If all of these processes are present, the Hadoop cluster has been configured successfully.
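The process lists above can be checked automatically. The sketch below verifies that every expected daemon appears in `jps` output; check_daemons is a helper name invented here for illustration:

```shell
# Report which of the expected Hadoop daemons are missing from jps output.
check_daemons() {
  # $1: output of `jps`; $2: space-separated list of expected daemon names
  local d missing=""
  for d in $2; do
    printf '%s\n' "$1" | grep -qw "$d" || missing="$missing $d"
  done
  if [ -z "$missing" ]; then
    echo "all daemons running"
  else
    echo "missing:$missing"
  fi
}

# On the master node:
#   check_daemons "$(jps)" "NameNode SecondaryNameNode ResourceManager DataNode NodeManager"
# On each child node:
#   check_daemons "$(jps)" "DataNode NodeManager"
```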