Hadoop cluster installation

Notes

Note 1: This is the second post in this big data series. For the first one, see https://blog.csdn.net/focuson_/article/details/80153371 . Machine preparation and the ZooKeeper installation were covered in that post.

Note 2: This post covers the Hadoop installation. The cluster layout is designed as follows:

machine  | installed software                                | processes
---------|---------------------------------------------------|----------
focuson1 | zookeeper; hadoop NameNode; hadoop DataNode       | JournalNode; DataNode; QuorumPeerMain; NameNode; DFSZKFailoverController; NodeManager
focuson2 | zookeeper; hadoop NameNode; hadoop DataNode; yarn | JournalNode; DataNode; QuorumPeerMain; NameNode; DFSZKFailoverController; NodeManager; ResourceManager
focuson3 | zookeeper; hadoop DataNode; yarn                  | JournalNode; DataNode; QuorumPeerMain; NodeManager; ResourceManager


installation steps:

1. Upload the compressed package to the focuson1 home directory

cd /usr/local/src/
mkdir hadoop
cd hadoop                        # the configs below assume /usr/local/src/hadoop/hadoop-2.6.0
mv ~/hadoop-2.6.0.tar.gz .
tar -xvf hadoop-2.6.0.tar.gz
rm -f hadoop-2.6.0.tar.gz
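
Optionally, put the installation on the PATH so the hdfs/yarn commands used later can be run from any directory. A minimal sketch, assuming the install path above (append to /etc/profile or ~/.bashrc and re-source it):

export HADOOP_HOME=/usr/local/src/hadoop/hadoop-2.6.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin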

2. Modify the configuration files (all under etc/hadoop in the install directory)

1》hadoop-env.sh

export JAVA_HOME=/usr/local/src/java/jdk1.7.0_51    # must be set explicitly
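
After setting JAVA_HOME you can sanity-check the unpacked distribution; a quick check, run from the install directory:

cd /usr/local/src/hadoop/hadoop-2.6.0
bin/hadoop version    # should report Hadoop 2.6.0 if JAVA_HOME is resolved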

2" yarn and Hadoop integration

2.1 mapred-site.xml
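
In the 2.6.0 tarball this file usually ships only as a template, so create it first if it is missing:

cd /usr/local/src/hadoop/hadoop-2.6.0/etc/hadoop
cp mapred-site.xml.template mapred-site.xml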

<configuration>
    <!-- Specify the MR framework as yarn mode -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

2.2 yarn-site.xml

<configuration>

         <!-- Site specific YARNconfiguration properties -->

         <!-- Enable RM high reliability-->

    <property>

      <name>yarn.resourcemanager.ha.enabled</name>

       <value>true</value>

    </property>

    <!-- Specify the clusterid of RM -->

    <property>

      <name>yarn.resourcemanager.cluster-id</name>

       <value>yrc</value>

    </property>

    <!-- Specify the name of the RM-->

    <property>

      <name>yarn.resourcemanager.ha.rm-ids</name>

       <value>rm1,rm2</value>

    </property>

    <!-- Specify the address of the RM respectively -->

    <property>

      <name>yarn.resourcemanager.hostname.rm1</name>

       <value>focuson2</value>

    </property>

    <property>

      <name>yarn.resourcemanager.hostname.rm2</name>

       <value>focuson3</value>

    </property>

    <!-- Specify the zk cluster address -->

    <property>

      <name>yarn.resourcemanager.zk-address</name>

      <value>focuson1:2181,focuson2:2181,focuson3:2181</value>

    </property>

    <property>

      <name>yarn.nodemanager.aux-services</name>

      <value>mapreduce_shuffle</value>

    </property>

</configuration>
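
Once YARN is up (step 5 below), the active/standby state of the two ResourceManagers can be checked with the standard rmadmin command, using the rm-ids configured above:

yarn rmadmin -getServiceState rm1    # prints active or standby
yarn rmadmin -getServiceState rm2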

3" hdfs-site.xml (port: rpc: 9000; http: 50070)

<configuration>
	<!-- The nameservice is ns1; it must be consistent with core-site.xml -->
	<property>
	        <name>dfs.nameservices</name>
	        <value>ns1</value>
	</property>
	<!-- There are two NameNodes under ns1, nn1, nn2 -->
	<property>
	        <name>dfs.ha.namenodes.ns1</name>
	        <value>nn1,nn2</value>
	</property>
	<!-- nn1's RPC address -->
	<property>
	        <name>dfs.namenode.rpc-address.ns1.nn1</name>
	        <value>focuson1:9000</value>
	</property>
	<!-- http communication address of nn1-->
	<property>
	        <name>dfs.namenode.http-address.ns1.nn1</name>
	        <value>focuson1:50070</value>
	</property>
	<!-- nn2's RPC address -->
	<property>
	        <name>dfs.namenode.rpc-address.ns1.nn2</name>
	        <value>focuson2:9000</value>
	</property>
	<!-- http communication address of nn2-->
	<property>
	        <name>dfs.namenode.http-address.ns1.nn2</name>
	        <value>focuson2:50070</value>
	</property>
	<!-- Specify the storage location of the NameNode's metadata on the JournalNode-->
	<property>
	        <name>dfs.namenode.shared.edits.dir</name>
	        <value>qjournal://focuson1:8485;focuson2:8485;focuson3:8485/ns1</value>
	</property>
	<!-- Specify the location where JournalNode stores data on the local disk-->
	<property>
	        <name>dfs.journalnode.edits.dir</name>
	        <value>/usr/local/src/hadoop/hadoop-2.6.0/journal</value>
	</property>
	<!-- Enable automatic failover of the NameNode -->
	<property>
	        <name>dfs.ha.automatic-failover.enabled</name>
	        <value>true</value>
	</property>
	<!-- Configure the failover implementation (the client failover proxy provider) -->
	<property>
	        <name>dfs.client.failover.proxy.provider.ns1</name>
	        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
	</property>
	<!-- Configure the fencing methods; multiple methods are separated by newlines, i.e. each method occupies its own line -->
	<property>
	        <name>dfs.ha.fencing.methods</name>
	        <value>
	                sshfence
	                shell(/bin/true)
	        </value>
	</property>
	<!-- Passwordless SSH is required when using the sshfence method -->
	<property>
	        <name>dfs.ha.fencing.ssh.private-key-files</name>
	        <value>/root/.ssh/id_rsa</value>
	</property>
	<!-- Configure the sshfence connect timeout -->
	<property>
	        <name>dfs.ha.fencing.ssh.connect-timeout</name>
	        <value>30000</value>
	</property>
</configuration>
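
The sshfence method logs into the peer NameNode host over SSH to fence it, so passwordless SSH must already work between focuson1 and focuson2. A quick check, run on focuson1:

ssh focuson2 hostname    # should print focuson2 without prompting for a password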

4》core-site.xml

<configuration>
	<!-- Specify the nameservice of hdfs as ns1 -->
	<property>
	        <name>fs.defaultFS</name>
	        <value>hdfs://ns1</value>
	</property>
	<!-- Specify hadoop temporary directory -->
	<property>
	        <name>hadoop.tmp.dir</name>
	        <value>/usr/local/src/hadoop/hadoop-2.6.0/tmp</value>
	</property>
	<!-- Specify the zookeeper address-->
	<property>
	        <name>ha.zookeeper.quorum</name>
	        <value>focuson1:2181,focuson2:2181,focuson3:2181</value>
	</property>
</configuration>
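
Since the ZKFC daemons depend on this quorum, it is worth confirming that every ZooKeeper node is answering. A quick check with ZooKeeper's ruok four-letter command (assuming nc is available):

echo ruok | nc focuson1 2181    # a healthy node replies imok
echo ruok | nc focuson2 2181
echo ruok | nc focuson3 2181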

5》slaves

focuson1
focuson2
focuson3

3. Copy the installation to the other two machines

scp -r /usr/local/src/hadoop focuson2:/usr/local/src/
scp -r /usr/local/src/hadoop focuson3:/usr/local/src/

4. Format the NameNode

Execute on focuson1: hdfs namenode -format

The tmp folder will be generated in /usr/local/src/hadoop/hadoop-2.6.0 (the path configured as hadoop.tmp.dir); copy this folder to the same path on focuson2 so the standby NameNode starts from the same metadata.

* If this is not done, the NameNode on focuson2 will report an error when it starts.
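
A sketch of these follow-up commands: the scp target follows the directory layout above, and hdfs zkfc -formatZK is the standard one-time step in HDFS HA guides that initializes the failover state in ZooKeeper (run it on focuson1 before the first start if your setup needs it):

scp -r /usr/local/src/hadoop/hadoop-2.6.0/tmp focuson2:/usr/local/src/hadoop/hadoop-2.6.0/
hdfs zkfc -formatZK    # one-time initialization of the HA znode in ZooKeeper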

5. Start one: dfs. Execute on focuson1 only; start-dfs.sh automatically starts the namenode/datanode/journalnode/zkfc daemons across the cluster.

Enter /usr/local/src/hadoop/hadoop-2.6.0 and execute sbin/start-dfs.sh

The output log is as follows:

18/04/28 19:02:36 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [focuson1 focuson2]
focuson1: starting namenode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-namenode-focuson1.out
focuson2: starting namenode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-namenode-focuson2.out
focuson1: starting datanode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-datanode-focuson1.out
focuson2: starting datanode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-datanode-focuson2.out
focuson3: starting datanode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-datanode-focuson3.out
Starting journal nodes [focuson1 focuson2 focuson3]
focuson3: starting journalnode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-journalnode-focuson3.out
focuson1: starting journalnode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-journalnode-focuson1.out
focuson2: starting journalnode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-journalnode-focuson2.out
18/04/28 19:03:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting ZK Failover Controllers on NN hosts [focuson1 focuson2]
focuson2: starting zkfc, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-zkfc-focuson2.out
focuson1: starting zkfc, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-zkfc-focuson1.out

Start two: yarn, on focuson2:

Enter /usr/local/src/hadoop/hadoop-2.6.0 and execute sbin/start-yarn.sh

[root@focuson2 hadoop-2.6.0]# ./sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/yarn-root-resourcemanager-focuson1.out
focuson2: starting nodemanager, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/yarn-root-nodemanager-focuson2.out
focuson3: starting nodemanager, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/yarn-root-nodemanager-focuson3.out
focuson1: starting nodemanager, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/yarn-root-nodemanager-focuson1.out

Execute on focuson3 as well (this only needs to bring up the second ResourceManager, for high availability):

[root@focuson3 hadoop-2.6.0]# ./sbin/start-yarn.sh
starting yarn daemons
resourcemanager running as process 4258. Stop it first.
focuson3: nodemanager running as process 4689. Stop it first.
focuson2: nodemanager running as process 5783. Stop it first.
focuson1: nodemanager running as process 7596. Stop it first.

6. Verify. On focuson1, jps:

[root@focuson1 hadoop-2.6.0]# jps
6977 DataNode
7089 JournalNode
7177 DFSZKFailoverController
7596 NodeManager
7790 Jps
4255 QuorumPeerMain
6911 NameNode

On focuson2,

[root@focuson2 hadoop-2.6.0]# jps
6144 Jps
5505 DFSZKFailoverController
2963 QuorumPeerMain
5140 DataNode
5783 NodeManager
5047 NameNode
6056 ResourceManager
5321 JournalNode

On focuson3:

[root@focuson3 hadoop-2.6.0]# jps
5136 Jps
4689 NodeManager
4258 ResourceManager
4419 DataNode
3044 QuorumPeerMain
4504 JournalNode

Log in to the web interfaces to view (the NameNode HTTP addresses configured above: http://focuson1:50070 and http://focuson2:50070):
It can be seen that the namenode of focuson2 is standby, and that of focuson1 is active.
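
The HA state can also be queried from the command line with the standard haadmin command, using the nn1/nn2 ids from hdfs-site.xml:

hdfs haadmin -getServiceState nn1    # expect active
hdfs haadmin -getServiceState nn2    # expect standby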

Kill the namenode process on focuson1, and you will find that the namenode on focuson2 becomes active:

[root@focuson1 hadoop-2.6.0]# jps
6977 DataNode
7089 JournalNode
7177 DFSZKFailoverController
7596 NodeManager
7790 Jps
4255 QuorumPeerMain
6911 NameNode
[root@focuson1 hadoop-2.6.0]# kill -9 6911

7. Operate the cluster

Execute some hdfs commands on focuson1:

touch first.txt
hdfs dfs -put first.txt /
hdfs dfs -ls /
......

success!
