Hadoop fully distributed deployment

I. Prepare the experimental environment
  You need to prepare four Linux servers, ideally with identical configuration. Because my machines are cloned from the pseudo-distributed virtual machine deployed earlier, they all share the same environment, and each VM already has pseudo-distributed Hadoop set up by default.
1> NameNode server (172.20.20.228)

2> DataNode servers (172.20.20.226-220)

II. Modify the Hadoop configuration files

  Before modifying the configuration files I made a full copy of the directory; its absolute path is "/tosp/opt/hadoop". After editing the files in this directory, we point the hadoop directory symlink at it. When pseudo-distributed or local mode is needed, you only have to change where the symlink points, which makes it easy for the configuration files of all three modes to coexist.
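As a sketch of the idea (the hadoop_full and hadoop_pseudo directory names below are hypothetical; adapt them to your own layout):

[root@cdh14 ~]$ ln -sfn /tosp/opt/hadoop_full /tosp/opt/hadoop     # point at the fully distributed config
[root@cdh14 ~]$ ln -sfn /tosp/opt/hadoop_pseudo /tosp/opt/hadoop   # switch back to pseudo-distributed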

1> core-site.xml configuration file

[root@cdh14 ~]$ more /tosp/opt/hadoop/etc/hadoop/core-site.xml 
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
                <property>
                        <name>fs.defaultFS</name>
                        <value>hdfs://cdh14:9000</value>
                </property>
                <property>
                        <name>hadoop.tmp.dir</name>
                        <value>/tosp/opt/hadoop</value>
                </property>
</configuration>

<!--

The role of the core-site.xml configuration file:
    # Defines system-level parameters such as the HDFS URL, the Hadoop
temporary directory, and rack awareness for the cluster. Parameters
defined here override the defaults in core-default.xml.

The role of the fs.defaultFS parameter:
    # Declares the NameNode address, which amounts to declaring the
default HDFS file system.

The role of the hadoop.tmp.dir parameter:
    # Declares the Hadoop working directory.

-->
[root@cdh14 ~]$ 
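A quick way to confirm the new value is being picked up (a sanity check, assuming the Hadoop bin directory is on the PATH, as the prompts above suggest):

[root@cdh14 ~]$ hdfs getconf -confKey fs.defaultFS
hdfs://cdh14:9000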

2> hdfs-site.xml configuration file

[root@cdh14 ~]$ more /tosp/opt/hadoop/etc/hadoop/hdfs-site.xml 
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
        <property>
                <name>dfs.replication</name>
                <value>2</value>
        </property>
</configuration>

<!--
The role of the hdfs-site.xml configuration file:
        # HDFS-related settings such as the number of replicas per file,
the block size, and whether to enforce permissions. Parameters defined
here override the defaults in hdfs-default.xml.

The role of the dfs.replication parameter:
        # For availability and redundancy, HDFS stores multiple copies of
each block on different nodes; the default is 3. A pseudo-distributed
environment has only one node and can therefore keep only one copy, which
is why the count is set explicitly via the dfs.replication property. This
is replication at the software level.

-->
[root@cdh14 ~]$ 
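Keep in mind that dfs.replication only applies to files written after the change; files that already exist keep their old replica count unless you adjust it by hand. A sketch (the path below is a placeholder):

[root@cdh14 ~]$ hdfs dfs -setrep -w 2 /path/to/existing/file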

3> mapred-site.xml configuration file

[root@cdh14 ~]$ more /tosp/opt/hadoop/etc/hadoop/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
</configuration>

<!--
The role of the mapred-site.xml configuration file:
        # MapReduce-related settings such as the default number of reduce
tasks and the memory limits tasks may use. Parameters defined here
override the defaults in mapred-default.xml.

The role of the mapreduce.framework.name parameter:
        # Specifies the MapReduce execution framework. There are three
options: the first is local (local mode), the second is classic (the
first-generation Hadoop framework), and the third is yarn (the
second-generation framework). I configure yarn here, the current
framework.

-->
[root@cdh14 ~]$ 
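Once the cluster is running (section IV), a quick way to confirm that jobs are really executed on yarn is to submit one of the bundled examples (the exact jar name varies with the Hadoop version):

[root@cdh14 ~]$ hadoop jar /tosp/opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi 2 10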

4> yarn-site.xml configuration file

[root@cdh14 ~]$ more /tosp/opt/hadoop/etc/hadoop/yarn-site.xml 
<?xml version="1.0"?>
<configuration>
        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>cdh14</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
</configuration>

<!--

The role of the yarn-site.xml configuration file:
        # Configures resource-manager and scheduler-level parameters.
The role of the yarn.resourcemanager.hostname parameter:
        # Specifies the hostname of the ResourceManager.
The role of the yarn.nodemanager.aux-services parameter:
        # Tells each NodeManager to run the mapreduce_shuffle auxiliary
service, which MapReduce jobs need for their shuffle phase.

-->
[root@cdh14 ~]$ 
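After start-up (section IV), each NodeManager should register with the ResourceManager on cdh14; yarn node -list should report one active node per host in the slaves file, and the ResourceManager web UI is served on port 8088 by default:

[root@cdh14 ~]$ yarn node -list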

5> slaves configuration file

[root@cdh14 ~]$ more /tosp/opt/hadoop/etc/hadoop/slaves 
# The role of this configuration file: it records which DataNode hosts the NameNode should connect to. When starting or stopping services, the remote commands are sent to the hosts listed here.
cdh14 
cdh12
cdh11
cdh10
cdh9
cdh8
cdh7
[root@cdh14 ~]$
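Note that each node reads its own copy of the configuration, so after editing these files on cdh14 they have to be pushed to every host listed above (once the passwordless login from the next section is in place). A minimal sketch, assuming the same /tosp/opt/hadoop layout on every node:

[root@cdh14 ~]$ for h in cdh12 cdh11 cdh10 cdh9 cdh8 cdh7; do rsync -av /tosp/opt/hadoop/etc/hadoop/ root@$h:/tosp/opt/hadoop/etc/hadoop/; done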

 

III. Configure passwordless login from the NameNode to each DataNode

1> Generate a public/private key pair locally (before generating, delete the keys left over from the earlier pseudo-distributed deployment)

[root@cdh14 ~]$ rm -rf ~/.ssh/*
[root@cdh14 ~]$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
Generating public/private rsa key pair.
Your identification has been saved in /home/root/.ssh/id_rsa.
Your public key has been saved in /home/root/.ssh/id_rsa.pub.
The key fingerprint is:
a3:a4:ae:d8:f7:7f:a2:b6:d6:15:74:29:de:fb:14:08 root@cdh14
The key's randomart image is:
+--[ RSA 2048]----+
|   (randomart    |
|    omitted)     |
+-----------------+
[root@cdh14 ~]$ 

2> Use the ssh-copy-id command to distribute the public key to the NameNode server itself (172.20.20.228)

[root@cdh14 ~]$ ssh-copy-id root@cdh14
The authenticity of host 'cdh14 (172.16.30.101)' can't be established.
ECDSA key fingerprint is fa:25:bc:03:7e:99:eb:12:1e:bc:a8:c9:ce:39:ba:7b.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@cdh14's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'root@cdh14'"
and check to make sure that only the key(s) you wanted were added.

[root@cdh14 ~]$ ssh cdh14
Last login: Fri May 25 18:35:40 2018 from 172.16.30.1
[root@cdh14 ~]$ who
root pts/0        2018-05-25 18:35 (172.16.30.1)
root pts/1        2018-05-25 19:17 (cdh14)
[root@cdh14 ~]$ exit 
logout
Connection to cdh14 closed.
[root@cdh14 ~]$ who
root pts/0        2018-05-25 18:35 (172.16.30.1)
[root@cdh14 ~]$ 

3> Use the ssh-copy-id command to distribute the public key to each DataNode server (172.20.20.226-220), running it once per host from cdh12 through cdh7. (The sample transcript below was captured against a host named s102.)

[root@cdh14 ~]$ ssh-copy-id root@s102
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@s102's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'root@s102'"
and check to make sure that only the key(s) you wanted were added.

[root@cdh14 ~]$ ssh s102
Last login: Fri May 25 18:35:42 2018 from 172.16.30.1
[root@s102 ~]$ who
root pts/0        2018-05-25 18:35 (172.16.30.1)
root pts/1        2018-05-25 19:19 (cdh14)
[root@s102 ~]$ exit 
logout
Connection to s102 closed.
[root@cdh14 ~]$ who
root pts/0        2018-05-25 18:35 (172.16.30.1)
[root@cdh14 ~]$ 
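Rather than running ssh-copy-id by hand for every host, a small loop does the same job (a sketch; the host list mirrors the slaves file above):

[root@cdh14 ~]$ for h in cdh12 cdh11 cdh10 cdh9 cdh8 cdh7; do ssh-copy-id root@$h; done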

 

  Note: the passwordless-login setup above was done uniformly as the root user. Make sure root can log in to every node without a password, because the shell scripts run later depend on it.

IV. Start the services and verify success

1> Format the file system

[root@cdh14 ~]$ hdfs namenode -format

2> Start Hadoop (in Hadoop 2.x, start-all.sh is deprecated and simply calls start-dfs.sh followed by start-yarn.sh, which can also be run separately)

[root@cdh14 ~]$ start-all.sh

3> Use jps (and a custom script) to verify that the NameNode and DataNodes started normally

[root@cdh14 ~]$ jps 
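
On the NameNode, jps should show NameNode, SecondaryNameNode, and ResourceManager (plus DataNode and NodeManager, since cdh14 is also listed in the slaves file); on the other nodes it should show DataNode and NodeManager. A small helper script like the following (a sketch; it assumes jps is on the non-interactive PATH of every node) checks the whole cluster in one go:

#!/bin/bash
# xjps.sh - a hypothetical helper: run jps on every cluster node
for h in cdh14 cdh12 cdh11 cdh10 cdh9 cdh8 cdh7; do
    echo "------ $h ------"
    ssh root@$h jps
done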

Origin: www.cnblogs.com/yangtao481/p/12120070.html