Hadoop [Lesson 3]: Install Hadoop on a virtual machine (pseudo-distributed)

1. Download Hadoop

Hadoop can be downloaded from the download page on the official Apache website.
If you plan to use HBase later, it is best not to pick the newest Hadoop release; check the HBase/Hadoop version compatibility matrix first!

Older releases can be fetched from the Apache archive mirror.

What I downloaded here is Hadoop 2.7.7.
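If you would rather download directly inside the virtual machine (making step 2 unnecessary), wget against the Apache archive works too; a sketch, assuming the 2.7.7 tarball is still hosted there:

# wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.7/hadoop-2.7.7.tar.gz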

2. Transfer the file to the virtual machine

Reference: [VMware] Xshell remote terminal to transfer files
Reference: XShell upload files to the virtual machine
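If you are not using Xshell, plain scp from the host machine also works; a minimal sketch (the user, VM address, and target directory are placeholders, adjust them for your setup):

# scp hadoop-2.7.7.tar.gz root@<vm-ip>:/root/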

3. Modify the configuration files

① First create a new folder for Hadoop

# mkdir /usr/local/src/hadoop

② Unzip Hadoop to this folder

# tar -zxvf hadoop-2.7.7.tar.gz -C /usr/local/src/hadoop
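Optionally, put Hadoop's bin and sbin directories on the PATH so the commands used later can be run from anywhere; a sketch, appended to /etc/profile and matching the unpack location above:

# echo 'export HADOOP_HOME=/usr/local/src/hadoop/hadoop-2.7.7' >> /etc/profile
# echo 'export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin' >> /etc/profile
# source /etc/profile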

③ Change into the directory that holds the configuration files

# cd /usr/local/src/hadoop/hadoop-2.7.7/etc/hadoop/ 

④ List the files in this directory

# ls

The listing should include the Hadoop configuration files used below: hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml.template, and yarn-site.xml.

(1) Configure the JDK path for Hadoop

# vi hadoop-env.sh

Find the JAVA_HOME line and change it to point at your JDK installation (the path below is mine):

export JAVA_HOME=/usr/local/java/jdk1.8.0_221
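A quick sanity check that the path you configured really contains a JDK (adjust if yours lives elsewhere):

# /usr/local/java/jdk1.8.0_221/bin/java -version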

(2) Configure the HDFS system

# vi core-site.xml

Add the following content:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/src/hadoop/hadoop-2.7.7/tmp</value>
    </property>
</configuration>
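hadoop.tmp.dir above points at a directory that does not exist yet; Hadoop generally creates it on first start, but creating it up front is harmless:

# mkdir -p /usr/local/src/hadoop/hadoop-2.7.7/tmp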

(3) Configure pseudo-distributed HDFS storage

① Create the storage folders first

# mkdir /usr/local/src/hadoop/hadoop-2.7.7/hdfs

# mkdir /usr/local/src/hadoop/hadoop-2.7.7/hdfs/data

# mkdir /usr/local/src/hadoop/hadoop-2.7.7/hdfs/name
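The three mkdir calls above can also be collapsed into one using -p and bash brace expansion:

# mkdir -p /usr/local/src/hadoop/hadoop-2.7.7/hdfs/{name,data}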

② Edit the configuration file

# vi hdfs-site.xml

Add the following content:

<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/src/hadoop/hadoop-2.7.7/hdfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/src/hadoop/hadoop-2.7.7/hdfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

(4) Configure Hadoop's MapReduce to run on YARN

① Rename the template file first

# mv mapred-site.xml.template mapred-site.xml

② Edit the renamed file

# vi mapred-site.xml

Add the following content:

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

(5) Configure YARN

# vi yarn-site.xml

Add the following content:
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.hostname</name>
        <value>localhost</value>
    </property>
</configuration>
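With all four files configured, you can verify the setup by formatting HDFS once and starting the daemons; a minimal sketch, run from the Hadoop root (start-dfs.sh will prompt for passwords unless passwordless SSH to localhost is set up):

# cd /usr/local/src/hadoop/hadoop-2.7.7
# bin/hdfs namenode -format
# sbin/start-dfs.sh
# sbin/start-yarn.sh
# jps

If everything worked, jps should list NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager.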

Origin: blog.csdn.net/qq_41315788/article/details/109264324