1. Environment selection
1. Server selection
Alibaba Cloud server: entry-level (pay-as-you-go)
Operating system: Linux CentOS 6.8
CPU: 1 core
Memory: 1 GB
Hard disk: 40 GB
IP: 39.108.77.250
2. Configuration selection
JDK:1.8 (jdk-8u144-linux-x64.tar.gz)
Hadoop:2.8.2 (hadoop-2.8.2.tar.gz)
3. Download address
Official sites:
JDK:
http://www.oracle.com/technetwork/java/javase/downloads
Hadoop:
http://www.apache.org/dyn/closer.cgi/hadoop/common
Baidu cloud:
Link: http://pan.baidu.com/s/1pLqS4kF Password: yb79
2. Server configuration
Before configuring Hadoop, complete the following server setup.
1. Change the hostname
Change the hostname first for ease of administration.
Enter:
hostname
to check the machine's current name,
then enter:
vim /etc/sysconfig/network
to modify the hostname:
change the value of HOSTNAME to the name you want to set.
Note: the hostname change takes effect only after a reboot.
Then enter:
vim /etc/hosts
and add a mapping from the host IP to the corresponding hostname.
Note: this mapping must be in place if you use hostnames in the configuration files!
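For example, using the hostname test1 that appears later in the Hadoop configuration files and this server's IP, the two files would contain lines like the following:
# in /etc/sysconfig/network
HOSTNAME=test1
# in /etc/hosts
39.108.77.250 test1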
2. Turn off the firewall
Turn off the firewall to facilitate external access.
For CentOS versions below 7 (such as the CentOS 6.8 used here), enter the following to turn off the firewall:
service iptables stop
For CentOS 7 and above, enter:
systemctl stop firewalld.service
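Optionally, to keep the firewall from coming back after a reboot, you can also disable it at boot:
chkconfig iptables off
on CentOS 6 and below, or:
systemctl disable firewalld.service
on CentOS 7 and above.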
3. Time setting
Enter:
date
to check whether the server time is correct; if not, change it.
Command to change the time:
date -s 'YYYY-MM-DD hh:mm:ss'
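For example, to set the clock to 10:30:00 on November 6, 2017 (a hypothetical timestamp):
date -s '2017-11-06 10:30:00'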
3. Hadoop environment installation
1. Download JDK and Hadoop
Place the downloaded JDK and Hadoop archives in the /home directory
and create new java and hadoop folders there.
3.1.1 Unzip the file
Input:
tar -xvf jdk-8u144-linux-x64.tar.gz
tar -xvf hadoop-2.8.2.tar.gz
Extract the JDK and Hadoop archives, move them into the java and hadoop folders respectively,
and rename the extracted directories to jdk1.8 and hadoop2.8, as sketched below.
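A sketch of the full sequence; the extracted directory names jdk1.8.0_144 and hadoop-2.8.2 are what these archives typically unpack to, so verify them with ls first:
mkdir /home/java /home/hadoop
tar -xvf jdk-8u144-linux-x64.tar.gz
tar -xvf hadoop-2.8.2.tar.gz
mv jdk1.8.0_144 /home/java/jdk1.8
mv hadoop-2.8.2 /home/hadoop/hadoop2.8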
2. JDK environment configuration
First enter:
java -version
to check whether a JDK is already installed. If one is installed but the version is unsuitable, uninstall it first.
3.2.1 Profile file changes
Edit the /etc/profile file
by typing:
vim /etc/profile
The complete configuration to add:
export JAVA_HOME=/home/java/jdk1.8
export JRE_HOME=/home/java/jdk1.8/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
export PATH=.:${JAVA_HOME}/bin:$PATH
Note: JAVA_HOME must point to your JDK installation path.
After editing with vim, remember to enter:
source /etc/profile
to make the configuration take effect.
After configuring, enter:
java -version
to view the version information.
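If the configuration took effect, the output should look similar to the following (the exact build numbers may differ):
java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)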
3. Hadoop environment configuration
3.3.1 Profile file changes
Edit the /etc/profile file
by typing:
vim /etc/profile
The complete configuration to add:
export HADOOP_HOME=/home/hadoop/hadoop2.8
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export PATH=.:${JAVA_HOME}/bin:${HADOOP_HOME}/bin:$PATH
enter:
source /etc/profile
to make the configuration take effect.
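You can confirm at this point that the hadoop command is on the PATH by entering:
hadoop version
which should print the Hadoop 2.8.2 version information.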
Next, modify Hadoop's own configuration files.
3.3.2 Create a new folder
Before modifying the configuration files, create some folders under /root.
enter:
mkdir /root/hadoop
mkdir /root/hadoop/tmp
mkdir /root/hadoop/var
mkdir /root/hadoop/dfs
mkdir /root/hadoop/dfs/name
mkdir /root/hadoop/dfs/data
Note: the folders are created under /root so that they will not be deleted unexpectedly.
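Equivalently, a single mkdir -p command (a shell shorthand, not part of the original steps) creates the whole tree:
mkdir -p /root/hadoop/{tmp,var,dfs/name,dfs/data}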
Then switch to the /home/hadoop/hadoop2.8/etc/hadoop/ directory by entering:
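cd /home/hadoop/hadoop2.8/etc/hadoop/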
3.3.3 Modify core-site.xml
enter:
vim core-site.xml
Add the following configuration to the <configuration> node; the complete node looks like this:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/root/hadoop/tmp</value>
<description>Abase for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://test1:9000</value>
</property>
</configuration>
Note: test1 can be replaced with the host's IP address.
3.3.4 Modify hadoop-env.sh
enter:
vim hadoop-env.sh
Change the ${JAVA_HOME} reference to your own JDK path, i.e. the line:
export JAVA_HOME=${JAVA_HOME}
becomes:
export JAVA_HOME=/home/java/jdk1.8
3.3.5 Modify hdfs-site.xml
enter:
vim hdfs-site.xml
Add the following inside the <configuration> node:
<property>
<name>dfs.name.dir</name>
<value>/root/hadoop/dfs/name</value>
<description>Path on the local filesystem where the NameNode stores the namespace and transaction logs persistently.</description>
</property>
<property>
<name>dfs.data.dir</name>
<value>/root/hadoop/dfs/data</value>
<description>Comma-separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
<description>Disable permission checking.</description>
</property>
Note: with dfs.permissions set to false, files can be created on DFS without permission checks, which is convenient but risky. To guard against accidental deletion, set it to true or simply delete the property node, since the default is true.
3.3.6 Modify mapred-site.xml
If the mapred-site.xml file does not exist, copy the mapred-site.xml.template file and rename the copy to mapred-site.xml.
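For example:
cp mapred-site.xml.template mapred-site.xml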
enter:
vim mapred-site.xml
In this newly created mapred-site.xml file, add the following configuration to the <configuration> node:
<property>
<name>mapred.job.tracker</name>
<value>test1:9001</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/root/hadoop/var</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
At this point, the configuration of Hadoop's stand-alone mode is complete.
4. Hadoop startup
The first time you start Hadoop, you need to initialize it.
Switch to the /home/hadoop/hadoop2.8/bin directory and enter:
./hadoop namenode -format
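Note: in Hadoop 2.x this command still works but is reported as deprecated; the equivalent newer form is:
./hdfs namenode -format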
If initialization succeeds, a current directory with some metadata files appears under /root/hadoop/dfs/name (the path configured in hdfs-site.xml, in the folder created earlier).
Starting Hadoop mainly means starting HDFS and YARN.
Switch to the /home/hadoop/hadoop2.8/sbin directory.
Start HDFS
input:
start-dfs.sh
On the first start, SSH will ask whether to connect; enter yes and then the root password.
Start YARN
input:
start-yarn.sh
You can enter jps to check whether the daemons started successfully.
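If everything is running, the jps output should list processes similar to the following (the process IDs shown here are illustrative):
2722 NameNode
2842 DataNode
3025 SecondaryNameNode
3181 ResourceManager
3285 NodeManager
3560 Jps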
In a browser, open the YARN cluster overview page:
http://39.108.77.250:8088/cluster
and the HDFS NameNode page:
http://39.108.77.250:50070
At this point, the single-machine configuration of Hadoop is complete.