Installing Hadoop on a Linux Virtual Machine

Table of contents

1 Download Hadoop

2 Unzip Hadoop

3 Rename the hadoop folder

4 Assign ownership of the hadoop folder

5 Modify environment variables

6 Refresh environment variables

7 Create a data folder in the hadoop313 directory

8 Check files

9 Edit the ./core-site.xml file

10 Edit the ./hadoop-env.sh file

11 Edit the ./hdfs-site.xml file

12 Edit the ./mapred-site.xml file

13 Edit the ./yarn-site.xml file

14 Edit the ./workers file

15 Initialization

16 Configure password-free login

17 Start and shut down Hadoop

18 Test Hadoop


1 Download Hadoop

The Hadoop 3.1.3 archive is available from the following network disk:

Link: https://pan.baidu.com/s/1a2fyIUABQ0e-M8-T522BjA?pwd=2jqu Extraction code: 2jqu
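If the network disk link is unavailable, the same release should also be downloadable from the Apache archive (standard archive URL; verify availability before relying on it):

wget https://archive.apache.org/dist/hadoop/common/hadoop-3.1.3/hadoop-3.1.3.tar.gz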

2 Unzip Hadoop

Extract the Hadoop archive into the /opt/soft directory:

tar -zxf ./hadoop-3.1.3.tar.gz -C /opt/soft/

Check that it has been extracted into /opt/soft:

ls /opt/soft

3 Rename the hadoop folder

Rename hadoop-3.1.3/ to hadoop313 (run from /opt/soft):

mv hadoop-3.1.3/ hadoop313

4 Assign ownership of the hadoop folder

Give the root user and group recursive ownership of the folder:

chown -R root:root ./hadoop313/
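To confirm that the ownership change took effect, list the directory itself; the owner and group columns should both read root:

ls -ld /opt/soft/hadoop313/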

5 Modify environment variables

Append the following to the end of /etc/profile (the file that step 6 reloads):

vim /etc/profile

# HADOOP_HOME
export HADOOP_HOME=/opt/soft/hadoop313
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HADOOP_HOME/lib
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export HDFS_JOURNALNODE_USER=root
export HDFS_ZKFC_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_LIBEXEC_DIR=$HADOOP_HOME/libexec
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

6 Refresh environment variables

source /etc/profile
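To verify that the variables are in effect, print one of them; the output should be /opt/soft/hadoop313:

echo $HADOOP_HOME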

7 Create a data folder in the hadoop313 directory

This is the directory that hadoop.tmp.dir points to in step 9. Run from inside /opt/soft/hadoop313:

mkdir ./data

8 Check files

Check that the configuration files used in the following steps exist under the /opt/soft/hadoop313/etc/hadoop path.
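For example (the files listed in the comment are the ones edited in steps 9 through 14):

ls /opt/soft/hadoop313/etc/hadoop
# expect, among others: core-site.xml hadoop-env.sh hdfs-site.xml
# mapred-site.xml yarn-site.xml workers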

9 Edit the ./core-site.xml file

The ./ paths in steps 9 through 14 assume you are inside the configuration directory:

cd /opt/soft/hadoop313/etc/hadoop
vim ./core-site.xml

Add the following content between <configuration></configuration>.

Note: make sure the hostname (kb129 in this article) matches your own, and that you have mapped it to your IP address; a minimal example of that mapping follows the XML below. For a fuller walkthrough, see the first few steps of the author's earlier CSDN post, "Linux installation configuration Oracle+plsql installation configuration (detailed)".

    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://kb129:9000</value>
    </property>
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/opt/soft/hadoop313/data</value>
    </property>
    <property>
      <name>hadoop.http.staticuser.user</name>
      <value>root</value>
    </property>
    <property>
      <name>io.file.buffer.size</name>
      <value>131073</value>
    </property>
    <property>
      <name>hadoop.proxyuser.root.hosts</name>
      <value>*</value>
    </property>
    <property>
      <name>hadoop.proxyuser.root.groups</name>
      <value>*</value>
    </property>
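As promised above, here is a minimal sketch of the domain name mapping, assuming the IP address used in step 18; it is a single line in /etc/hosts:

192.168.153.129 kb129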

10 Edit the ./hadoop-env.sh file

Find the commented-out export JAVA_HOME line and set it, or simply add a new one.

Make sure the JAVA_HOME path matches your own JDK installation; this article uses /opt/soft/jdk180.

vim ./hadoop-env.sh
export JAVA_HOME=/opt/soft/jdk180
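If you are not sure where your JDK is installed, one way to locate it (assuming java is on the PATH) is to resolve the binary; JAVA_HOME is the resolved path with the trailing /bin/java removed:

readlink -f $(which java)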

11 Edit the ./hdfs-site.xml file

vim ./hdfs-site.xml

Add the following content between <configuration></configuration>:

    <property>
      <name>dfs.replication</name>
      <value>1</value>
    </property>
    <property>
      <name>dfs.namenode.name.dir</name>
      <value>/opt/soft/hadoop313/data/dfs/name</value>
    </property>
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>/opt/soft/hadoop313/data/dfs/data</value>
    </property>
    <property>
      <name>dfs.permissions.enabled</name>
      <value>false</value>
    </property>

12 Edit the ./mapred-site.xml file

vim ./mapred-site.xml

Add the following content between <configuration></configuration>:

Note: replace the hostname (kb129) with your own.

    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>
    <property>
      <name>mapreduce.jobhistory.address</name>
      <value>kb129:10020</value>
    </property>
    <property>
      <name>mapreduce.jobhistory.webapp.address</name>
      <value>kb129:19888</value>
    </property>
    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>4096</value>
    </property>
    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>4096</value>
    </property>
    <property>
      <name>mapreduce.application.classpath</name>
      <value>/opt/soft/hadoop313/etc/hadoop:/opt/soft/hadoop313/share/hadoop/common/lib/*:/opt/soft/hadoop313/share/hadoop/common/*:/opt/soft/hadoop313/share/hadoop/hdfs/*:/opt/soft/hadoop313/share/hadoop/hdfs/lib/*:/opt/soft/hadoop313/share/hadoop/mapreduce/*:/opt/soft/hadoop313/share/hadoop/mapreduce/lib/*:/opt/soft/hadoop313/share/hadoop/yarn/*:/opt/soft/hadoop313/share/hadoop/yarn/lib/*</value>
    </property>
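The long classpath value above should match the output of Hadoop's own classpath command, so if your paths differ you can generate the correct value on the machine itself:

hadoop classpath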

13 Edit the ./yarn-site.xml file

vim ./yarn-site.xml

Add the following content between <configuration></configuration>:

Again, make sure the hostname matches your own.

    <property>
      <name>yarn.resourcemanager.connect.retry-interval.ms</name>
      <value>20000</value>
    </property>
    <property>
      <name>yarn.resourcemanager.scheduler.class</name>
      <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
    </property>
    <property>
      <name>yarn.nodemanager.localizer.address</name>
      <value>kb129:8040</value>
    </property>
    <property>
      <name>yarn.nodemanager.address</name>
      <value>kb129:8050</value>
    </property>
    <property>
      <name>yarn.nodemanager.webapp.address</name>
      <value>kb129:8042</value>
    </property>
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
    </property>
    <property>
      <name>yarn.nodemanager.local-dirs</name>
      <value>/opt/soft/hadoop313/yarndata/yarn</value>
    </property>
    <property>
      <name>yarn.nodemanager.log-dirs</name>
      <value>/opt/soft/hadoop313/yarndata/log</value>
    </property>
    <property>
      <name>yarn.nodemanager.vmem-check-enabled</name>
      <value>false</value>
    </property>
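If YARN fails to start because the two yarndata directories above are missing, they can be created up front (paths taken from the properties above):

mkdir -p /opt/soft/hadoop313/yarndata/yarn /opt/soft/hadoop313/yarndata/log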

14 Edit the ./workers file

vim ./workers

Replace the file's contents with your hostname, for example:

kb129
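Equivalently, the file can be overwritten in one command (run from /opt/soft/hadoop313/etc/hadoop):

echo kb129 > ./workers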

15 Initialization

Format the NameNode. In Hadoop 3 the hdfs form is preferred; hadoop namenode -format still works but prints a deprecation warning.

hdfs namenode -format

A log line like "Storage directory /opt/soft/hadoop313/data/dfs/name has been successfully formatted." means the initialization succeeded.

16 Configure password-free login

Return to the home directory and generate a key pair:

ssh-keygen -t rsa -P ""

Press Enter at each prompt. ssh-keygen then prints the key's fingerprint and a randomart image.

Check that a .ssh directory now exists:

ll -a

Append the public key to the list of authorized keys to enable password-free login:

cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
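If ssh still asks for a password afterwards, overly permissive file permissions are a common cause, since sshd ignores keys it considers unsafe; tightening them is a safe fix:

chmod 700 /root/.ssh
chmod 600 /root/.ssh/authorized_keys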

Test the password-free login (ssh to the machine itself):

ssh -p 22 root@kb129

If no password is required, the configuration succeeded.

The first connection asks you to type yes or no; subsequent connections skip the prompt and log in directly.

After a successful login, type exit and press Enter to return to the local machine.

If you have two different virtual machines and want them to connect to each other over ssh, run the following command on each of them (the hostname given is that of the other virtual machine):

ssh-copy-id -i ~/.ssh/id_rsa.pub -p 22 root@kb128

17 Start and shut down Hadoop

Start Hadoop:

start-all.sh

Shut down Hadoop:

stop-all.sh
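start-all.sh brings up both HDFS and YARN; if you prefer, the two layers can be started and stopped separately:

start-dfs.sh
start-yarn.sh
stop-yarn.sh
stop-dfs.sh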

18 Test Hadoop

Run jps; the following six processes should appear: NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager, and Jps itself.

jps

Enter the URL http://192.168.153.129:9870/ in a browser and the NameNode web page will appear (be careful to replace the IP address with your own).

Or check the Hadoop version:

hadoop version
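As a further optional sanity check (the /test path is just an arbitrary example), try a couple of HDFS operations:

hdfs dfs -mkdir /test
hdfs dfs -ls /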
