Hadoop study notes: pseudo-cluster installation

Installing Hadoop is actually quite simple; the key is avoiding a few common pitfalls. Hadoop is just a Java program, so running it is easy. With the preparation work below done in advance, the installation succeeds in one pass.

Preparation 1

Install a Linux virtual machine. Don't skip the network settings during installation, otherwise you will not be able to reach the VM from the LAN.
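For example, on a CentOS guest (an assumption here; the interface name and file path differ by distribution) you can make sure the NIC comes up at boot and then check which address the VM got:

[root@ ~]# sed -i 's/^ONBOOT=no/ONBOOT=yes/' /etc/sysconfig/network-scripts/ifcfg-eth0

[root@ ~]# service network restart

[root@ ~]# ip addr    # note the LAN-reachable address; you will SSH to it later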

Preparation 2

On Linux, first create a user to run Hadoop and grant it privileges:

[root@ ~]# useradd -m hadoop -s /bin/bash

[root@ ~]# passwd hadoop 

Changing password for user hadoop.

New password:

Retype new password:

passwd: all authentication tokens updated successfully.

Grant sudo privileges to the hadoop user.

Edit the /etc/sudoers file (visudo is the safe way), find the following line, and remove the leading comment marker (#):

## Allows people in group wheel to run all commands
%wheel    ALL=(ALL)    ALL

Since the rule above applies to the wheel group, add the hadoop user to that group with the command:

[root@ ~]# usermod -aG wheel hadoop

After this change you can log in with the hadoop account and use sudo to run commands with root privileges.
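A quick check that the sudo setup works: log in as hadoop and run a harmless command through sudo (enter the hadoop password when prompted).

[root@ ~]# su - hadoop

[hadoop@ ~]$ sudo whoami

root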

Preparation 3

[root@ ~]# su hadoop

[hadoop@ root]$ 

[hadoop@ root]$  ssh-keygen -t rsa -P '' 

# The key is stored in the /home/hadoop/.ssh/ directory by default

[hadoop@ root]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

[hadoop@ root]$  chmod 0600 !$

chmod 0600 ~/.ssh/authorized_keys
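If ssh localhost still asks for a password after this, directory permissions are the usual culprit: sshd refuses keys when ~/.ssh is writable by others, so tighten it as well:

[hadoop@ root]$ chmod 700 ~/.ssh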

Try it:

[hadoop@ root]$ ssh localhost

Last login: Sat Mar 25 21:04:52 2017

[hadoop@ ~]$ 

Passwordless login works.

 

Preparation 4

Install the JDK

[hadoop@ ~]$ cat  .bash_profile 

export JAVA_HOME=/usr/java/jdk1.7.0_79

export PATH=$JAVA_HOME/bin:$PATH 

export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar 
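Reload the profile and verify that the JDK is picked up (the paths above assume an Oracle JDK 1.7.0_79 unpacked under /usr/java):

[hadoop@ ~]$ source ~/.bash_profile

[hadoop@ ~]$ echo $JAVA_HOME

/usr/java/jdk1.7.0_79

[hadoop@ ~]$ java -version

java version "1.7.0_79"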

 

With the preparations done, download and unpack Hadoop

 

I downloaded version 2.7.3:

wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz

Decompress it:

tar -zxvf hadoop-2.7.3.tar.gz
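A quick sanity check (assuming the JDK from Preparation 4 is already on the PATH) that the unpacked distribution runs:

[hadoop@ ~]$ ./hadoop-2.7.3/bin/hadoop version

Hadoop 2.7.3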

 

Hadoop configuration 

First go to /home/hadoop/hadoop-2.7.3/etc/hadoop, where the configuration files live.

 

core-site.xml: core configuration, including HDFS and MapReduce I/O settings and the URL (protocol, host name, port) of the namenode. After datanodes register with the namenode, clients interact with HDFS through this URL.

 

 vi hadoop-2.7.3/etc/hadoop/core-site.xml 

<configuration>

        <property>

             <name>fs.defaultFS</name>

             <value>hdfs://localhost:9000</value>

        </property>

</configuration>
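One pitfall worth noting here (optional; the path below is just an example): by default HDFS keeps its data under hadoop.tmp.dir, which points to a directory under /tmp, and many systems clear /tmp on reboot, after which the namenode must be reformatted. Adding a property like this inside the <configuration> block of core-site.xml pins the data somewhere persistent:

        <property>
             <name>hadoop.tmp.dir</name>
             <!-- example path; any persistent directory writable by the hadoop user works -->
             <value>/home/hadoop/hadoop-2.7.3/tmp</value>
        </property>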

 

hdfs-site.xml: HDFS daemon configuration, covering the namenode, secondary namenode, and datanodes.

vi hdfs-site.xml

<configuration>

        <property>

             <name>dfs.replication</name>

             <value>1</value>

        </property>

</configuration>

mapred-site.xml: MapReduce daemon configuration, including the jobtracker and tasktrackers.
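In the 2.7.3 tarball this file ships only as mapred-site.xml.template, so copy it first:

cp mapred-site.xml.template mapred-site.xml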

 

vi mapred-site.xml

<configuration>

        <property>

             <name>mapreduce.framework.name</name>

             <value>yarn</value>

        </property>

</configuration>

yarn-site.xml: global resource management (YARN) configuration (I don't fully understand this part yet; see the link below):

http://www.cnblogs.com/gw811/p/4077318.html

vi yarn-site.xml 

<configuration>

   <property>

      <name>yarn.nodemanager.aux-services</name> 

      <value>mapreduce_shuffle</value>

   </property> 

</configuration>

Configure the variables related to the Hadoop runtime environment:

 

hadoop@hive:~$ vi  hadoop-env.sh

export JAVA_HOME=/usr/java/jdk1.7.0_79 

 

One more important step remains before starting:

Format the namenode. If the hostname has been changed, /etc/hosts also needs a local resolution entry, otherwise initialization fails with an unknown-host error for the namenode.
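For example, if the hostname was changed to hadoop01 (a made-up name for illustration), /etc/hosts should contain a line such as:

127.0.0.1   localhost hadoop01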

./hadoop-2.7.3/bin/hdfs namenode -format

 

Now start Hadoop. Go to /hadoop-2.7.3/sbin/ and run:

./start-all.sh
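Note that start-all.sh still works in 2.7.x but prints a deprecation warning; the equivalent is to start HDFS and YARN separately:

./start-dfs.sh

./start-yarn.sh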

You can check the logs under the log directory:

 

/home/hadoop/hadoop-2.7.3/logs

 

[root@ sbin]# jps

14505 SecondaryNameNode

14305 NameNode

12108 -- process information unavailable

14644 ResourceManager

14390 DataNode

14736 NodeManager

14769 Jps

 

[root@ sbin]# 

 

The Hadoop pseudo-cluster installation is complete.
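As an optional smoke test, create a directory in HDFS and list it:

[hadoop@ sbin]$ ../bin/hdfs dfs -mkdir /test

[hadoop@ sbin]$ ../bin/hdfs dfs -ls /

You can also open the web UIs (default ports in 2.7.3): the namenode at http://localhost:50070 and the ResourceManager at http://localhost:8088.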

 

 

 

 

 
