Building a dual-NameNode + YARN Hadoop cluster manually (continued)

4. Understanding the basic NameNode and YARN configuration files

Configuring the NameNode and YARN involves several configuration files, each with many parameters, so choosing reasonable values is the hard part of deploying a Hadoop cluster. Fortunately, Hadoop follows one simple principle: a value written in a configuration file overrides the built-in default; otherwise the default takes effect. In other words, most Hadoop parameters ship with default values, and only the parameters you explicitly set in the configuration files replace those defaults.

Because of this principle, there is no need to configure every parameter; only a small set of important, basic parameters must be set. The descriptions below therefore cover only these required parameters; keep the defaults for everything that is not mentioned.
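For example, you can check which value is actually in effect for a parameter, the override or the built-in default, with the getconf tool; the two property names below are just illustrations:

# Returns the value from the *-site.xml files if it is set there, otherwise the built-in default
hdfs getconf -confKey dfs.replication
hdfs getconf -confKey fs.trash.interval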

Hadoop requires five configuration files in total: core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml, and hosts. Which settings matter, and on which hosts they take effect, depends on the role of each node, but it is recommended to keep all configuration files identical across the cluster nodes, which simplifies later operation and maintenance. The usual practice is to finish all configuration on the NameNode node and then copy the files directly to every other node in the cluster, as sketched below.
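A minimal sketch of that copy step, assuming the configuration directory is /etc/hadoop/conf (the path used by dfs.hosts later in this section) and that passwordless SSH is already set up for the hadoop user:

for host in yarnserver slave001 slave002 slave003; do
    scp /etc/hadoop/conf/*-site.xml /etc/hadoop/conf/hosts ${host}:/etc/hadoop/conf/
done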

(1) core-site.xml file

core-site.xml is Hadoop's core configuration file; in this cluster it mainly sets NameNode-related properties. The file has many parameters, but not all of them need to be set. The necessary and commonly used parameters and their values are listed below:

<configuration>

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://bigdata</value>
</property>

<property>
    <name>hadoop.tmp.dir</name>
    <value>/var/tmp/hadoop-${user.name}</value>
</property>

<property>
    <name>ha.zookeeper.quorum</name>
    <value>slave001,slave002,slave003</value>
</property>

<property>
    <name>fs.trash.interval</name>
    <value>60</value>
</property>

</configuration>

Among them, the meaning of each parameter is as follows:

fs.defaultFS: the default file system URI used by HDFS clients. With NameNode HA enabled it points to the logical nameservice ID (hdfs://bigdata) rather than a single NameNode host.
hadoop.tmp.dir: the base directory for Hadoop's temporary files; several other storage paths default to subdirectories of it.
ha.zookeeper.quorum: the list of ZooKeeper servers used by the ZKFC for automatic failover.
fs.trash.interval: how long (in minutes) deleted files are kept in the HDFS trash before being purged; 60 means one hour.

(2) hdfs-site.xml file

This file is the core configuration file of HDFS. It mainly configures HDFS-related properties of the NameNode and DataNodes, and it takes effect on both NameNode and DataNode nodes. hdfs-site.xml has many parameters, but not all of them need to be set; the necessary and commonly used parameters and their values are listed below:

<configuration>

<property>
    <name>dfs.nameservices</name>
    <value>bigdata</value>
</property>

<property>
    <name>dfs.ha.namenodes.bigdata</name>
    <value>nn1,nn2</value>
</property>

<property>
    <name>dfs.namenode.rpc-address.bigdata.nn1</name>
    <value>namenodemaster:9000</value>
</property>

<property>
    <name>dfs.namenode.rpc-address.bigdata.nn2</name>
    <value>yarnserver:9000</value>
</property>

<property>
    <name>dfs.namenode.http-address.bigdata.nn1</name>
    <value>namenodemaster:50070</value>
</property>

<property>
    <name>dfs.namenode.http-address.bigdata.nn2</name>
    <value>yarnserver:50070</value>
</property>

<property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://slave001:8485;slave002:8485;slave003:8485/bigdata</value>
</property>

<property>
    <name>dfs.ha.automatic-failover.enabled.bigdata</name>
    <value>true</value>
</property>

<property>
    <name>dfs.client.failover.proxy.provider.bigdata</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

<property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/data1/hadoop/dfs/jn</value>
</property>

<property>
    <name>dfs.replication</name>
    <value>2</value>
</property>

<property>
    <name>dfs.ha.fencing.methods</name>
    <value>shell(/bin/true)</value>
</property>

<property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///data1/hadoop/dfs/name,file:///data2/hadoop/dfs/name</value>
    <final>true</final>
</property>

<property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///data1/hadoop/dfs/data,file:///data2/hadoop/dfs/data</value>
    <final>true</final>
</property>

<property>
    <name>dfs.block.size</name>
    <value>134217728</value>
</property>

<property>
    <name>dfs.permissions</name>
    <value>true</value>
</property>

<property>
    <name>dfs.permissions.supergroup</name>
    <value>supergroup</value>
</property>

<property>
    <name>dfs.hosts</name>
    <value>/etc/hadoop/conf/hosts</value>
</property>

<property>
    <name>dfs.hosts.exclude</name>
    <value>/etc/hadoop/conf/hosts-exclude</value>
</property>

</configuration>

Among them, the meaning of each parameter is as follows:

dfs.nameservices: the logical name of the HA nameservice (bigdata), the same name referenced by fs.defaultFS.
dfs.ha.namenodes.bigdata: the IDs of the NameNodes in this nameservice (nn1 and nn2).
dfs.namenode.rpc-address.bigdata.nn1 / nn2: the RPC address (host:port) of each NameNode.
dfs.namenode.http-address.bigdata.nn1 / nn2: the web UI address of each NameNode.
dfs.namenode.shared.edits.dir: the JournalNode quorum URI where the active NameNode writes the shared edit log.
dfs.ha.automatic-failover.enabled.bigdata: enables automatic failover through the ZKFC for this nameservice.
dfs.client.failover.proxy.provider.bigdata: the Java class HDFS clients use to locate the currently active NameNode.
dfs.journalnode.edits.dir: the local directory where each JournalNode stores edit logs.
dfs.replication: the number of replicas kept for each HDFS block (2 in this setup).
dfs.ha.fencing.methods: the fencing method used during failover; shell(/bin/true) always reports success, effectively skipping real fencing.
dfs.namenode.name.dir: the local directories where the NameNode stores its metadata (fsimage and edits); two paths keep mirrored copies.
dfs.datanode.data.dir: the local directories where DataNodes store block data.
dfs.block.size: the HDFS block size in bytes; 134217728 is 128 MB.
dfs.permissions: whether HDFS permission checking is enabled.
dfs.permissions.supergroup: the name of the HDFS super-user group.
dfs.hosts: a whitelist file listing the DataNodes allowed to join the cluster.
dfs.hosts.exclude: a file listing DataNodes to be excluded or decommissioned.

(3) The mapred-site.xml file

This file configures the MapReduce framework. In Hadoop 3.x, where MapReduce jobs run on YARN, only a few parameters need to be set. The necessary and commonly used parameters and their values are listed below:

<configuration>

<property>
 <name>mapreduce.framework.name</name>
 <value>yarn</value>
</property>

<property>
 <name>mapreduce.jobhistory.address</name>
 <value>yarnserver:10020</value>
</property>

<property>
 <name>mapreduce.jobhistory.webapp.address</name>
 <value>yarnserver:19888</value>
</property>

</configuration>

Among them, the meaning of each parameter is as follows:

mapreduce.framework.name: the runtime framework for MapReduce jobs; yarn means jobs are submitted to the YARN ResourceManager.
mapreduce.jobhistory.address: the RPC address of the MapReduce JobHistory server.
mapreduce.jobhistory.webapp.address: the web UI address of the JobHistory server.

(4) yarn-site.xml file

This file is the core configuration file of the YARN resource management framework; all YARN settings are made here. The necessary and commonly used parameters and their values are listed below:

<configuration> 

<property> 
    <name>yarn.resourcemanager.hostname</name> 
    <value>yarnserver</value> 
</property>   

<property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>yarnserver:8030</value>
</property>

<property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>yarnserver:8031</value>
</property>

<property>
    <name>yarn.resourcemanager.address</name>
    <value>yarnserver:8032</value>
</property>

<property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>yarnserver:8033</value>
</property>

<property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>yarnserver:8088</value>
</property>

<property> 
    <name>yarn.nodemanager.aux-services</name> 
    <value>mapreduce_shuffle,spark_shuffle</value> 
</property> 

<property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

<property>
    <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
    <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>

<property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>file:///data1/hadoop/yarn/local,file:///data2/hadoop/yarn/local</value>
</property>

<property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>file:///data1/hadoop/yarn/logs,file:///data2/hadoop/yarn/logs</value>
</property>


<property>
    <description>Classpath for typical applications.</description>
     <name>yarn.application.classpath</name>
     <value>
        $HADOOP_CONF_DIR,
        $HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,
        $HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,
        $HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,
        $HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*,
        $HADOOP_HOME/share/hadoop/common/*, $HADOOP_COMMON_HOME/share/hadoop/common/lib/*,
        $HADOOP_HOME/share/hadoop/hdfs/*, $HADOOP_HOME/share/hadoop/hdfs/lib/*,
        $HADOOP_HOME/share/hadoop/mapreduce/*, $HADOOP_HOME/share/hadoop/mapreduce/lib/*,
        $HADOOP_HOME/share/hadoop/yarn/*, $HADOOP_YARN_HOME/share/hadoop/yarn/lib/*,
        $HIVE_HOME/lib/*, $HIVE_HOME/lib_aux/*
     </value>
</property>

<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>20480</value>
</property>

<property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>8</value>
</property>

</configuration>

Among them, the meaning of each parameter is as follows:

yarn.resourcemanager.hostname: the host that runs the ResourceManager (yarnserver); the ResourceManager addresses below are ports on this host.
yarn.resourcemanager.scheduler.address / resource-tracker.address / address / admin.address / webapp.address: the ResourceManager ports for the scheduler, NodeManager heartbeats, client job submission, admin commands, and the web UI, respectively.
yarn.nodemanager.aux-services: auxiliary services run inside each NodeManager; mapreduce_shuffle is required for MapReduce, and spark_shuffle enables the external shuffle service for Spark on YARN.
yarn.nodemanager.aux-services.mapreduce_shuffle.class / spark_shuffle.class: the implementation classes of those two auxiliary services.
yarn.nodemanager.local-dirs: local directories where NodeManagers store intermediate data for running containers.
yarn.nodemanager.log-dirs: local directories where NodeManagers write container logs.
yarn.application.classpath: the classpath made available to YARN applications.
yarn.nodemanager.resource.memory-mb: the physical memory (in MB) each NodeManager can allocate to containers; 20480 means 20 GB.
yarn.nodemanager.resource.cpu-vcores: the number of virtual CPU cores each NodeManager can allocate to containers.

(5) The hosts file

Create a hosts file under /etc/hadoop/conf with the following content:

slave001
slave002
slave003

This hosts file lists the hostnames of the DataNode nodes in the Hadoop cluster (it is the whitelist referenced by dfs.hosts). Hadoop addresses and communicates with nodes by hostname at runtime.
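A short sketch of creating the file, plus the command that makes a running NameNode re-read dfs.hosts and dfs.hosts.exclude (only needed once HDFS is already up):

cat > /etc/hadoop/conf/hosts <<EOF
slave001
slave002
slave003
EOF
# Tell the NameNode to re-read the include/exclude lists
hdfs dfsadmin -refreshNodes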

Start and maintain a highly available NameNode + YARN distributed cluster

Once one node has been fully configured, its configuration files can simply be copied to the other nodes. After all nodes are configured, the Hadoop services can be started. Be very careful when starting the services: follow the steps described here strictly and verify each step before moving on. Note that all of the following operations are performed as the ordinary hadoop user.

1. Start and format the ZooKeeper cluster

After all ZooKeeper nodes are configured, the ZooKeeper service can be started. Run the following commands on each of the three nodes slave001, slave002, and slave003 (slave002 is shown as an example):

[hadoop@slave002 bin]$ pwd
/opt/bigdata/zookeeper/bin
[hadoop@slave002 bin]$ ./zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/bigdata/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@slave002 bin]$ jps
72394 Jps
72348 QuorumPeerMain
[hadoop@slave002 bin]$ 

After startup, the jps command (bundled with the JDK) shows a QuorumPeerMain entry; this is the process started by ZooKeeper, and the number in front of it is the process PID. A zookeeper.out file is generated in the directory where the start command was executed; this is ZooKeeper's runtime log and can be used to check its status. Next, the HA node information must be created in the ZooKeeper cluster. Execute the following command on the namenodemaster node:
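This is normally the standard ZKFC format command, which creates the HA znode for the nameservice (by default under /hadoop-ha) in ZooKeeper:

[hadoop@namenodemaster ~]$ hdfs zkfc -formatZK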

This completes the formatting of the HA state in ZooKeeper.

2. Start the JournalNode cluster

According to the earlier plan, the JournalNode cluster runs on the three nodes slave001, slave002, and slave003, so the JournalNode service must be started on each of them. Taking slave001 as an example (note that the directory set by dfs.journalnode.edits.dir must be writable by the hadoop user, hence the mkdir and chown below):

[hadoop@slave001 ~]$ hdfs --daemon start journalnode
[root@slave001 ~]# mkdir -p /data1/hadoop/dfs/jn
[root@slave001 ~]# chown -R hadoop:hadoop /data1
[root@slave001 ~]# su - hadoop
Last login: Wed Feb  3 10:05:42 EST 2021 on pts/0
[hadoop@slave001 ~]$ hdfs --daemon start journalnode
[hadoop@slave001 ~]$ jps
74433 Jps
73177 QuorumPeerMain
74394 JournalNode
[hadoop@slave001 ~]$ 

3. Format and start the primary node NameNode service

The NameNode must be formatted before its service is started for the first time; formatting generates the initial NameNode metadata. Run the following command on namenodemaster:
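Formatting is normally done with the standard format command (run it only once, when the cluster is first initialized):

[hadoop@namenodemaster ~]$ hdfs namenode -format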

Formatting creates a directory under each path specified by the dfs.namenode.name.dir parameter in hdfs-site.xml, which is used to store the NameNode's fsimage, edits, and related files. Once formatting is complete, the NameNode service can be started on namenodemaster with a single command:

[hadoop@namenodemaster ~]$ hdfs --daemon start namenode
[hadoop@namenodemaster ~]$ jps
45066 Jps
45022 NameNode
[hadoop@namenodemaster ~]$ 

As you can see, a new Java process named NameNode appears. If the startup fails, check the logs to troubleshoot; by default the log files are in the logs directory under the Hadoop installation directory. The NameNode log file has a name similar to hadoop-hadoop-namenode-namenodemaster.log, and inspecting it usually reveals where the problem is.
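For example, assuming the HADOOP_HOME variable points at the installation directory (the actual path depends on your setup), the log can be inspected like this:

[hadoop@namenodemaster ~]$ tail -n 50 $HADOOP_HOME/logs/hadoop-hadoop-namenode-namenodemaster.log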

4. NameNode primary and standby nodes synchronize metadata

Now that the primary NameNode service is running, the standby NameNode must be started as well, but first its metadata has to be synchronized from the primary. The synchronization is straightforward: execute the following command on the standby NameNode:
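Synchronization is normally done with the bootstrapStandby option, which copies the current fsimage from the active NameNode into this node's dfs.namenode.name.dir directories:

[hadoop@yarnserver ~]$ hdfs namenode -bootstrapStandby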

If the command completes without errors, the metadata has been synchronized to the standby node.

5. Start the NameNode service of the standby node

After the standby node has synchronized the metadata, its NameNode service must also be started. The startup process is as follows:

[hadoop@yarnserver ~]$ hdfs --daemon start namenode
[hadoop@yarnserver ~]$ jps
75411 NameNode
75480 Jps

A new Java process named NameNode appears, indicating that the startup succeeded. If it fails, check the startup log to troubleshoot.

6. Start ZooKeeper FailoverController (zkfc) service

After both NameNodes are started, they are both in Standby state by default. For one of them to become Active, the zkfc service must be started on that node first.

First start the zkfc service on namenodemaster, as follows:

[hadoop@namenodemaster ~]$ hdfs --daemon start zkfc
[hadoop@namenodemaster ~]$ jps
45299 Jps
45239 DFSZKFailoverController
45022 NameNode
[hadoop@namenodemaster ~]$ 

The NameNode on the namenodemaster node then becomes Active, i.e., it becomes the HA master. Next, start the zkfc service on yarnserver as well; the NameNode on yarnserver remains in Standby state. At this point both NameNode services are running normally.
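To confirm which NameNode is Active, you can also query the HA state by NameNode ID (nn1 and nn2, as defined in hdfs-site.xml); the expected output is shown for illustration:

[hadoop@namenodemaster ~]$ hdfs haadmin -getServiceState nn1
active
[hadoop@namenodemaster ~]$ hdfs haadmin -getServiceState nn2
standby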

7. Start the storage node DataNode service

DataNode nodes provide the storage for the HDFS distributed file system. According to the earlier plan, the DataNode service must be started on slave001, slave002, and slave003 in turn. Taking slave001 as an example:

[hadoop@slave001 ~]$ hdfs --daemon start datanode
[hadoop@slave001 ~]$ jps
75378 QuorumPeerMain
78295 DataNode
78326 Jps
74394 JournalNode

8. Start ResourceManager, NodeManager and historyserver services

With the HDFS distributed storage service running, data can now be stored. The next step is to start the distributed computing services, mainly ResourceManager and NodeManager. First start the ResourceManager service; according to the earlier configuration, it runs on the yarnserver host:

[hadoop@yarnserver ~]$ yarn --daemon start resourcemanager
[hadoop@yarnserver ~]$ jps
75411 NameNode
75622 ResourceManager
75674 Jps

Then start the NodeManager service on slave001, slave002, and slave003 in turn. Take slave001 as an example. The operations are as follows:

[hadoop@slave001 ~]$ yarn --daemon start nodemanager
[hadoop@slave001 ~]$ jps
78640 NodeManager
75378 QuorumPeerMain
78661 Jps
78295 DataNode
74394 JournalNode
[hadoop@slave001 ~]$ 

With NodeManager and ResourceManager started, distributed computing is now possible. Finally, start the historyserver service, which is used for viewing job logs: every job run by the distributed computing framework produces log output, so enabling the historyserver is necessary. Start it on the yarnserver node with the following command:

[hadoop@yarnserver ~]$ mapred --daemon start historyserver
[hadoop@yarnserver ~]$ jps
75411 NameNode
75924 JobHistoryServer
75622 ResourceManager
75981 Jps

At this point, the Hadoop cluster service is fully started, and the distributed Hadoop cluster deployment is completed.

9. Test the dual NameNode high availability function

Under normal circumstances, in the highly available NameNode pair, the namenodemaster host is in Active state. Visiting http://namenodemaster:50070 shows a page like the following screenshot:

As the figure shows, namenodemaster is currently in Active state, and you can also see the Namespace, Namenode ID, Version, Cluster ID and other information defined in the configuration files introduced above. The page also shows the HDFS Summary information covered in the previous lesson, as well as the NameNode Journal Status, NameNode Storage and DFS Storage Types sections, as shown in the following figure:

The NameNode Journal Status section displays the nodes of the JournalNode cluster and the position currently being written in the in-progress edits file. The NameNode Storage section shows the storage paths of the NameNode metadata; you can see two mirrored copies of the metadata, both in Active state, which means both copies are healthy. If a copy's state is not Active, that copy has a problem and the metadata under the corresponding path needs to be checked.

Now look at the yarnserver host, which at this point should be in Standby state. Visiting http://yarnserver:50070 shows the following screenshot:

The HDFS cluster status shown on yarnserver is basically the same as on namenodemaster. The difference is the NameNode Journal Status section: because this node is in Standby state, it only reads from the JournalNode cluster.

Next, test whether the two NameNodes can switch over automatically. The test is simple: stop the service of the currently Active NameNode, or cut off its network, and then observe whether the other NameNode automatically changes from Standby to Active.

If the highly available NameNode configuration is correct, then when the Active NameNode fails, the Standby NameNode automatically takes over within a few seconds and switches its state to Active. You can verify this manually.
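A minimal sketch of that manual test, assuming namenodemaster (nn1) is currently Active:

# Stop the Active NameNode to simulate a failure
[hadoop@namenodemaster ~]$ hdfs --daemon stop namenode
# From any node, confirm that the other NameNode has taken over
[hadoop@yarnserver ~]$ hdfs haadmin -getServiceState nn2
active
# Restart the stopped NameNode; it rejoins the cluster in Standby state
[hadoop@namenodemaster ~]$ hdfs --daemon start namenode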

10. Verify that YARN is running properly

After the ResourceManager and NodeManager services are started, you can check whether the YARN resource manager is running normally by visiting http://192.168.1.41:8088/cluster/nodes, where the three NodeManager nodes should appear in the RUNNING state.
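Beyond the web UI, a quick end-to-end check is to list the registered NodeManagers and submit the bundled example job; the jar path below assumes a standard Hadoop 3.x layout under HADOOP_HOME, and the exact jar file name varies with the version:

[hadoop@yarnserver ~]$ yarn node -list
[hadoop@yarnserver ~]$ yarn jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi 5 10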


Origin blog.csdn.net/yanghuadong_1992/article/details/113620806