Prepare a clean virtual machine
[root@hdp-01 ~]# vi /etc/sysconfig/network-scripts/ifcfg-ens33
Change ONBOOT=no to ONBOOT=yes.
Then restart the network service: sudo service network restart
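For reference, a minimal static-IP ifcfg-ens33 might look like the following; the GATEWAY and DNS1 values are assumptions for the 192.168.137.x network used below and should match your own host/VM network settings:
# minimal sketch; GATEWAY/DNS1 are assumed values, adjust to your network
TYPE=Ethernet
BOOTPROTO=static
DEVICE=ens33
ONBOOT=yes
IPADDR=192.168.137.138
NETMASK=255.255.255.0
GATEWAY=192.168.137.1
DNS1=192.168.137.1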
[root@hdp-01 ~]# mkdir apps
[root@hdp-01 ~]# tar -zxvf jdk-8u152-linux-x64.tar.gz -C apps/
[root@hdp-01 ~]# vi /etc/profile
Add the following at the end of the file:
export JAVA_HOME=/root/apps/jdk1.8.0_152
export PATH=$PATH:$JAVA_HOME/bin
After editing, remember to run source /etc/profile so the configuration takes effect.
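To confirm the JDK is on the PATH:
[root@hdp-01 ~]# java -version
It should report version 1.8.0_152.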
Using hdp-01 as a template, clone it into three more virtual machines: hdp-02, hdp-03, and hdp-04 (four machines in total).
Windows hosts file configuration
Edit the hosts file in C:\Windows\System32\drivers\etc and add:
192.168.137.138 hdp-01
192.168.137.139 hdp-02
192.168.137.140 hdp-03
192.168.137.141 hdp-04
Configuring passwordless SSH login
Configure /etc/hosts on each server or virtual machine; at the command line enter:
vi /etc/hosts
Add the IP address and hostname of every node in the cluster:
192.168.137.138 hdp-01
192.168.137.139 hdp-02
192.168.137.140 hdp-03
192.168.137.141 hdp-04
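To verify that every hostname resolves, a quick one-line check from any node:
for h in hdp-01 hdp-02 hdp-03 hdp-04; do ping -c 1 $h; done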
On hdp-01, run ssh-keygen and press Enter at every prompt.
Copy the public key to every node (including hdp-01 itself):
ssh-copy-id -i .ssh/id_rsa.pub root@hdp-01
ssh-copy-id -i .ssh/id_rsa.pub root@hdp-02
ssh-copy-id -i .ssh/id_rsa.pub root@hdp-03
ssh-copy-id -i .ssh/id_rsa.pub root@hdp-04
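Equivalently, the four copies above can be done as a single shell loop (you will still be prompted for each node's password):
for h in hdp-01 hdp-02 hdp-03 hdp-04; do ssh-copy-id -i ~/.ssh/id_rsa.pub root@$h; done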
Once the keys are copied, test that passwordless login works, for example:
ssh hdp-02
Download hadoop-2.8.4.tar.gz to hdp-01
wget www-us.apache.org/dist/hadoop/common/hadoop-2.8.4/hadoop-2.8.4.tar.gz
[root@hdp-01 ~]# tar -zxvf hadoop-2.8.4.tar.gz -C apps/
[root@hdp-01 apps]# mv hadoop-2.8.4/ hadoop
Modify the configuration files to:
Specify Hadoop's default file system: hdfs
Specify which machine runs the HDFS namenode
Specify the local directory where the namenode stores its metadata
Specify the local directory where each datanode stores file blocks
Hadoop configuration files: /root/apps/hadoop/etc/hadoop/
[root@hdp-01 ~]# cd apps/hadoop/etc/hadoop
[root@hdp-01 hadoop]# vi hadoop-env.sh
In hadoop-env.sh, set JAVA_HOME explicitly:
export JAVA_HOME=/root/apps/jdk1.8.0_152
Modify core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hdp-01:9000</value>
  </property>
</configuration>
Modify hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/root/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/root/dfs/data</value>
  </property>
</configuration>
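Optionally, the replication factor can be set in the same file; this is not part of the steps above, just a common addition (HDFS defaults to 3 replicas per block, which is fine here; a smaller value suits a small test cluster):
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>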
Delete the bundled documentation so there is less to copy to the other nodes:
[root@hdp-01 ~]# cd apps/hadoop/share/
[root@hdp-01 share]# rm -rf doc/
Copy the entire Hadoop directory to the other machines:
scp -r /root/apps/hadoop hdp-02:/root/apps/
scp -r /root/apps/hadoop hdp-03:/root/apps/
scp -r /root/apps/hadoop hdp-04:/root/apps/
Start HDFS
Tip: to run hadoop commands, you need the HADOOP_HOME and PATH environment variables configured on Linux.
vi /etc/profile
export JAVA_HOME=/root/apps/jdk1.8.0_152
export HADOOP_HOME=/root/apps/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
source /etc/profile
scp /etc/profile hdp-02:/etc/
scp /etc/profile hdp-03:/etc/
scp /etc/profile hdp-04:/etc/
Then run source /etc/profile on each node (or simply log in again).
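To verify the copy, you can invoke hadoop over a non-interactive ssh session (such a session does not read /etc/profile automatically, hence the explicit source):
[root@hdp-01 ~]# ssh hdp-02 "source /etc/profile && hadoop version"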
First, initialize the namenode metadata directory.
Run the following hadoop command on hdp-01 to initialize the namenode's metadata storage directory:
[root@hdp-01 ~]# hadoop namenode -format
This does three things:
Creates a new metadata storage directory
Generates the initial fsimage metadata file
Generates cluster identity values such as the clusterID
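If the format succeeded, the name directory configured in hdfs-site.xml now exists and can be inspected:
[root@hdp-01 ~]# ls /root/dfs/name/current
You should see an fsimage file along with a VERSION file that records the clusterID.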
Then start the namenode process on hdp-01. Turn off the firewall first.
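Assuming CentOS 7 with firewalld (the ens33 interface name above suggests it); on CentOS 6 use service iptables stop and chkconfig iptables off instead:
[root@hdp-01 ~]# systemctl stop firewalld
[root@hdp-01 ~]# systemctl disable firewalld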
[root@hdp-01 ~]# hadoop-daemon.sh start namenode
After it starts, first check with jps that the namenode process exists.
Then, from a browser on Windows, visit the web UI that the namenode serves on port 50070: http://hdp-01:50070
Then start the datanode process on each node that should store blocks:
hadoop-daemon.sh start datanode
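With passwordless SSH in place you can check every node from hdp-01 (jps lives under $JAVA_HOME/bin, hence the explicit source):
for h in hdp-01 hdp-02 hdp-03 hdp-04; do echo $h; ssh $h "source /etc/profile && jps"; done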
Starting HDFS with the batch startup script
Edit etc/hadoop/slaves under the Hadoop installation directory (it lists the nodes on which a datanode process should be started):
[root@hdp-01 ~]# vi apps/hadoop/etc/hadoop/slaves
hdp-01
hdp-02
hdp-03
hdp-04
Then running the start-dfs.sh script on hdp-01 starts the entire cluster automatically (it relies on the passwordless SSH configured earlier).
To stop the cluster, use the stop-dfs.sh script.
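As a final smoke test (file and path names here are arbitrary examples), write a file into HDFS and check the cluster state:
[root@hdp-01 ~]# hdfs dfs -mkdir -p /test
[root@hdp-01 ~]# hdfs dfs -put /etc/hosts /test/
[root@hdp-01 ~]# hdfs dfs -ls /test
[root@hdp-01 ~]# hdfs dfsadmin -report
The dfsadmin report should list all four datanodes as live.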