Hadoop cluster deployment

1. Preparing the environment

CentOS 7.4 

hadoop hadoop-3.2.1 (http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz)

jdk 1.8.x

2. Configure environment variables

Command: vi /etc/profile

#hadoop

export HADOOP_HOME=/opt/module/hadoop-3.2.1
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop

Save and exit: :wq

Command: source /etc/profile (run this to reload the profile)
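To confirm the profile took effect, a quick sanity check (a sketch assuming the install path above; once the bin directory is on PATH, `hadoop version` should also resolve):

```shell
# Re-create the profile settings and verify they expand as expected.
export HADOOP_HOME=/opt/module/hadoop-3.2.1
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
echo "$HADOOP_CONF_DIR"   # should print /opt/module/hadoop-3.2.1/etc/hadoop
```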

3. Create the data directories

(run on each node)

mkdir /root/hadoop
mkdir /root/hadoop/tmp
mkdir /root/hadoop/var
mkdir /root/hadoop/dfs
mkdir /root/hadoop/dfs/name
mkdir /root/hadoop/dfs/data
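The six calls above can be collapsed into a single `mkdir -p`, which creates missing parents and is harmless when a directory already exists. A sketch, demonstrated under a scratch prefix so it can run anywhere (on the cluster the prefix is /root/hadoop):

```shell
# BASE stands in for /root/hadoop; mktemp keeps the demo out of /root.
BASE=$(mktemp -d)/hadoop
mkdir -p "$BASE"/tmp "$BASE"/var "$BASE"/dfs/name "$BASE"/dfs/data
find "$BASE" -type d | sort
```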

4. Modify the configuration files under etc/hadoop

(1) Modify core-site.xml

Add the following inside the <configuration> node:

<property>
        <name>hadoop.tmp.dir</name>
        <value>/root/hadoop/tmp</value>
        <description>Abase for other temporary directories.</description>
   </property>
   <property>
        <name>fs.default.name</name>
        <value>hdfs://node180:9000</value>
   </property>
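Note: `fs.default.name` still works in Hadoop 3.x but is a deprecated alias; the current key is `fs.defaultFS`. An equivalent, more future-proof form of the second property:

```xml
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://node180:9000</value>
</property>
```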

(2) Modify hdfs-site.xml

Add the following inside the <configuration> node:

<property>
<!-- NameNode (master) address -->
<name>dfs.namenode.http-address</name>
<value>node180:50070</value>
</property>
<property>
   <name>dfs.name.dir</name>
   <value>/root/hadoop/dfs/name</value>
   <description>Path on the local filesystem where theNameNode stores the namespace and transactions logs persistently.
</description>
</property>

<property>
   <name>dfs.data.dir</name>
   <value>/root/hadoop/dfs/data</value>
   <description>Comma separated list of paths on the localfilesystem of a DataNode where it should store its blocks.
</description>
</property>

<property>
   <name>dfs.replication</name>
   <value>2</value>
</property>

<property>
   <name>dfs.permissions</name>
   <value>false</value>
  <description>need not permissions</description>
</property>

 

Note: with dfs.permissions set to false, HDFS does not check permissions on files, which makes experimenting easier, but it also makes accidental deletion easier. To guard against that, set it to true, or simply delete this property node, since the default is true.
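Incidentally, `dfs.name.dir`, `dfs.data.dir`, and `dfs.permissions` are also deprecated aliases in Hadoop 3.x; the current key names are below and the values carry over unchanged:

```xml
<property>
    <name>dfs.namenode.name.dir</name>
    <value>/root/hadoop/dfs/name</value>
</property>
<property>
    <name>dfs.datanode.data.dir</name>
    <value>/root/hadoop/dfs/data</value>
</property>
<property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
</property>
```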

(3) Modify mapred-site.xml

Add the following inside the <configuration> node:

<!-- Run MapReduce on YARN (by default it runs locally) -->
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>

<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=/opt/module/hadoop-3.2.1</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=/opt/module/hadoop-3.2.1</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=/opt/module/hadoop-3.2.1</value>
</property>

(4) Modify yarn-site.xml

Add the following inside the <configuration> node:

<!-- Site specific YARN configuration properties -->

<property>
        <description>The address of the YARN master (ResourceManager).</description>
        <name>yarn.resourcemanager.hostname</name>
        <value>node180</value>
   </property>

<!-- Auxiliary service run by the NodeManager. It must be set to
     mapreduce_shuffle before MapReduce jobs can run; the default is empty. -->
   <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
   </property>

<!--
   <property>
        <description>Available memory per node, in MB; the default is 8192 MB.</description>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>1024</value>
   </property>
-->



   <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
</property>

Note: setting yarn.nodemanager.vmem-check-enabled to false disables the virtual-memory check. This is useful when installing on virtual machines, where the check otherwise tends to fail jobs in the later steps; on a physical machine with enough memory the property can be removed.

 

(5) Edit the workers file

Contents:

node180

node181

node182

(6) Modify hadoop-env.sh, mapred-env.sh, yarn-env.sh

Add the JDK path to each:

# jdk
export JAVA_HOME="/opt/module/jdk1.8.0_161"

 

5. Modify the startup scripts under sbin

(1) Modify start-dfs.sh and stop-dfs.sh

Add at the top of each file:

HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root

 

(2) Modify start-yarn.sh and stop-yarn.sh

Add at the top of each file:

YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root

 

6. Synchronize files to the other nodes

(1) Copy the hadoop folder

 scp -r hadoop-3.2.1/    [email protected]:/opt/module

 scp -r hadoop-3.2.1/    [email protected]:/opt/module

(2) Copy the data folder

 scp -r /root/hadoop/   [email protected]:/root

 scp -r /root/hadoop/    [email protected]:/root
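The four copies above can be expressed as one loop over the worker nodes. A sketch; `DRY_RUN=1` (the default here) only prints the commands, so nothing is copied until you set `DRY_RUN=0` on the real cluster:

```shell
DRY_RUN=${DRY_RUN:-1}
for host in 192.168.0.181 192.168.0.182; do
  # The two copies performed for each worker node.
  for cmd in \
      "scp -r hadoop-3.2.1/ root@$host:/opt/module" \
      "scp -r /root/hadoop/ root@$host:/root"; do
    if [ "$DRY_RUN" = 1 ]; then echo "$cmd"; else $cmd; fi
  done
done
```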

7. Start hadoop

(1) Initialize (format) the NameNode

Change to the directory: cd /opt/module/hadoop-3.2.1/bin

Execute the command: ./hadoop namenode -format

(2) Start the cluster on the NameNode

Change to the directory: cd /opt/module/hadoop-3.2.1/sbin

Execute the command: ./start-all.sh

Afterwards, running jps on each node should show the expected daemons (NameNode and ResourceManager on node180, DataNode and NodeManager on the workers).

8. Test hadoop

https://blog.csdn.net/weixin_38763887/article/details/79157652

https://blog.csdn.net/s1078229131/article/details/93846369

Open: http://192.168.0.180:50070/

 

 

Open: http://192.168.0.180:8088/

 

 

 

9. Run a test job

Create a folder: hdfs dfs -mkdir -p /user/root

Upload a text file, wc.txt, to the server (here at /root/wc.txt), then put it into HDFS:

Execute the command: hdfs dfs -put /root/wc.txt 

Run the wordcount job: hadoop jar /opt/module/hadoop-3.2.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar wordcount wc.txt wcount

View the results: hdfs dfs -cat wcount/*
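To sanity-check the result format without touching the cluster, the same kind of count can be reproduced locally with coreutils. A sketch; the sample contents of wc.txt here are made up for illustration:

```shell
# Build a tiny sample input, then count words the way wordcount does:
# one "<word><TAB><count>" line per distinct word.
printf 'hello world\nhello hadoop\n' > /tmp/wc.txt
tr -s ' ' '\n' < /tmp/wc.txt | sort | uniq -c | awk '{print $2 "\t" $1}'
```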

 

The output lists each distinct word together with its count.

 

 

 


Origin www.cnblogs.com/qk523/p/12450215.html