Environment Configuration
Ubuntu 16.04, JDK 1.8.0_242, Hadoop 2.7.1
Steps
1. Create a hadoop user and grant it administrator privileges
Log in to Linux and enter the following in a terminal:
sudo useradd -m hadoop -s /bin/bash
sudo passwd hadoop
sudo adduser hadoop sudo
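You can confirm the user exists and belongs to the sudo group with a quick check (an optional step, not part of the original instructions):
id hadoop    # should list the hadoop user and show sudo among its groups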
2. Switch to the hadoop user (all subsequent operations are performed as the hadoop user) and update apt
sudo su hadoop
sudo apt-get update
3. Install SSH and set up passwordless login
sudo apt-get install openssh-server
Create a new key:
ssh-keygen -t rsa -P ""
When prompted for the file in which to save the key, either type a path or simply press Enter to accept the default.
Append the generated public key to the list of authorized keys, then test the connection with ssh localhost:
cat /home/hadoop/.ssh/id_rsa.pub >> /home/hadoop/.ssh/authorized_keys
ssh localhost
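If ssh localhost still asks for a password, overly permissive file permissions are a common cause; they can be tightened like this (an extra troubleshooting step, not in the original):
chmod 700 /home/hadoop/.ssh
chmod 600 /home/hadoop/.ssh/authorized_keys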
4. Install the JDK [3]
sudo apt-get update
sudo apt-get install openjdk-8-jdk
java -version
This is the quick installation; for configuring the environment variables by hand, see reference [4].
5. Install Hadoop
Choose and download the desired version from:
https://archive.apache.org/dist/hadoop/common/
Here I chose hadoop-2.7.1.tar.gz.
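For example, the archive can be fetched directly from the command line (the URL below follows the Apache archive's directory layout):
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz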
Create a new directory to hold the extracted files, for example /opt/hadoop:
sudo mkdir /opt/hadoop
Open a terminal and change to the directory containing the downloaded archive:
sudo tar -zxf hadoop-2.7.1.tar.gz -C /opt/hadoop # extract to /opt/hadoop
cd /opt/hadoop
sudo mv ./hadoop-2.7.1/ ./hadoop # rename the folder
sudo chown -R hadoop ./hadoop # give the hadoop user ownership
6. Hadoop pseudo-distributed configuration
(1) Add environment variables:
vim ~/.bashrc
If vim is not installed (the command is not found), install it first:
sudo apt install vim
Then open the .bashrc file and add the following at the end:
export HADOOP_HOME=/opt/hadoop/hadoop
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$PATH:$HADOOP_HOME/bin
For basic vim usage, see reference [5].
JAVA_HOME must point to the JDK installation path; if you do not know the path, see [6], or use the check below.
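One quick way to locate the JDK home on Ubuntu when OpenJDK 8 was installed through apt (a convenience check, not part of the original steps):
readlink -f /usr/bin/java | sed 's:/jre/bin/java::'   # prints the JDK home, e.g. /usr/lib/jvm/java-8-openjdk-amd64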
Execute the following command to make the changes take effect without rebooting:
source ~/.bashrc
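To confirm the variables are in effect (an optional verification, not in the original):
echo $JAVA_HOME
echo $HADOOP_HOME
hadoop version   # should report Hadoop 2.7.1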
(2) Configure HDFS:
vim /opt/hadoop/hadoop/etc/hadoop/hadoop-env.sh
Set JAVA_HOME explicitly at the corresponding place in this file (this step may not be strictly necessary, but it avoids startup problems when the daemons cannot see the shell variable):
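For example, the JAVA_HOME line in hadoop-env.sh can be changed to the explicit path (adjust to your own JDK location):
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64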
Modify the configuration file core-site.xml:
cd /opt/hadoop/hadoop/etc/hadoop
vim core-site.xml
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/opt/hadoop/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
Modify hdfs-site.xml:
vim hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/opt/hadoop/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/opt/hadoop/hadoop/tmp/dfs/data</value>
    </property>
</configuration>
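Once both files are saved, the effective values can be double-checked without starting any daemons (an optional verification, not in the original):
hdfs getconf -confKey fs.defaultFS    # should print hdfs://localhost:9000
hdfs getconf -confKey dfs.replication # should print 1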
(3) MapReduce configuration
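Note that the Hadoop 2.7.x distribution ships this file only as a template; if mapred-site.xml does not exist yet, copy the template first (assuming you are still in /opt/hadoop/hadoop/etc/hadoop):
cp mapred-site.xml.template mapred-site.xml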
Modify mapred-site.xml:
vim mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.jobtracker.address</name>
        <value>localhost:9001</value>
    </property>
</configuration>
(4) Format HDFS
cd /opt/hadoop/hadoop
./bin/hdfs namenode -format
If formatting succeeds, the log output will include a "has been successfully formatted" message.
Start HDFS and YARN; if you are prompted to confirm a connection (yes/no), answer yes:
./sbin/start-dfs.sh
./sbin/start-yarn.sh
jps # check whether the daemons started successfully
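If everything started, jps should list NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager in addition to Jps itself. As a quick smoke test (an extra check, not in the original), create a directory in HDFS and list it:
hdfs dfs -mkdir -p /user/hadoop
hdfs dfs -ls /user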
To stop the daemons:
./sbin/stop-dfs.sh
./sbin/stop-yarn.sh
At this point the pseudo-distributed configuration is complete; further content will be added in follow-up posts.
References
[1] Installing a Hadoop 2.7.4 pseudo-distributed environment on Ubuntu 16.04 (Ubuntu16.04下安装Hadoop2.7.4伪分布式环境)
[2] Ubuntu 16.04 + Hadoop 2.7.3 environment setup (Ubuntu16.04+hadoop2.7.3环境搭建)