Talking About Topology Awareness in Ozone

Foreword


As we all know, large-scale distributed storage systems usually maintain data availability by keeping multiple replicas. But where those replicas are placed is a question worth digging into. At the very least, we cannot, in the extreme case, place all replicas on a single machine. Replica placement actually involves trade-offs: spread the replicas too far apart and network traffic overhead takes a hit; place them too close together and, although read locality improves, redundancy suffers, for example when a whole rack of machines suddenly fails. We call this class of problem Topology Awareness, and in this article we will talk about the Topology Awareness aspects of Ozone.

Topology Awareness in HDFS


I first learned about Topology Awareness while studying HDFS block placement. For a block with three replicas, HDFS follows the principle of placing two replicas on the same rack and the third replica on a different rack, in order to achieve high data availability. This placement policy has no major problems: it can tolerate the failure of a single machine, or of a whole rack, without the data becoming unavailable.

Given this, what kind of topology rules the administrator configures is very important. We can use the simplest rule, such as mapping each physical rack to a logical rack position in the topology. If we do not configure any topology rules, the network location of every machine becomes /default-rack, which is not a good setting.

The HDFS topology hierarchy and its limitations


It is not difficult for an administrator to assign each node a topology location based on its actual physical position in the cluster. The problem is that sometimes we need to build a more complex topology structure, not just the three-level data center -> rack -> node structure.

For example, we might want to add a group level above the rack, turning the structure into four levels, or perhaps make further divisions within the group level by some property. HDFS topology does not currently support arbitrary extra levels; only the node-group level has a corresponding implementation class (NetworkTopologyWithNodeGroup). In terms of custom layers, therefore, HDFS Topology Awareness has not yet reached a sufficient degree of flexibility.

/dc
|
/rackgroup
|
/rack
|
/node

Another point: a multi-level topology not only expresses the different physical locations of each node, it also implies different distances between nodes at different levels. This distance can be understood as the network overhead of data transmission: transfers between nodes on different racks are slower than transfers between nodes on the same rack, so the former distance is a little longer. We can therefore assign a different distance cost directly to each layer, for example /dc to /rack is 2, and /rack to /node is 1.

In conventional HDFS topology rules there is no concept of a custom distance cost; the distances between different layers are all equivalent, each understood as one unit of length.

Why emphasize the concept of network distance cost here? Because it bears on the problem of reading data from the nearest replica. For a client read operation, selecting the nearest node that holds the data is clearly the better strategy.
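To make the distance idea concrete, here is a minimal sketch (our own illustration, not Ozone's actual API) that sums per-layer edge costs up to the nearest common ancestor and uses the result to pick the closest replica. The cost values follow the example above, except the root-to-dc cost of 1, which is an assumption, and all helper names are hypothetical:

```python
# Toy model of per-layer distance cost (hypothetical helpers, not Ozone code).
# Edge costs from the root down: root -> dc = 1 (assumed), /dc -> /rack = 2,
# /rack -> /node = 1, matching the example in the text.
EDGE_COSTS = (1, 2, 1)

def distance(a, b, edge_costs=EDGE_COSTS):
    """Network distance between two nodes given paths like /dc1/rack1/n1."""
    pa, pb = a.strip("/").split("/"), b.strip("/").split("/")
    common = 0  # depth of the nearest common ancestor
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    # each side pays the cost of every edge it crosses climbing up
    return sum(edge_costs[common:len(pa)]) + sum(edge_costs[common:len(pb)])

def nearest_replica(client, replicas):
    """Pick the replica with the smallest network distance to the client."""
    return min(replicas, key=lambda r: distance(client, r))
```

With these costs, two nodes on the same rack are at distance 2, nodes on different racks of the same dc at distance 6, and nodes in different dcs at distance 8, so a reader prefers a replica on its own rack.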

Ozone, which is also a data storage system, has further refined and improved on the two shortcomings of HDFS Topology Awareness mentioned above, giving it a more flexible form of Topology Awareness.

Configuring and using Topology Awareness in Ozone


The core idea of Topology Awareness in Ozone is similar to HDFS: it is likewise based on a placement policy of multiple replicas on the same rack and on different racks. Here we focus on custom layers and their distance cost settings.

To support custom layers, instead of HDFS's somewhat cumbersome approach of defining a new subclass, Ozone defines the concept of a node schema to support user-defined topologies.

A node schema contains the node type, of which there are the following three categories:

  • ROOT("Root", NetConstants.INNER_NODE_COST_DEFAULT),
  • INNER_NODE("InnerNode", NetConstants.INNER_NODE_COST_DEFAULT),
  • LEAF_NODE("Leaf", NetConstants.NODE_COST_DEFAULT);

Each type is followed by its cost value as a property.

The loaded node schemas are then maintained as a list by the NodeSchemaManager, in depth order from top to bottom.
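The relationship between the schema node types, their costs, and the depth-ordered list can be sketched as follows (a simplified stand-in for Ozone's NodeSchema and NodeSchemaManager, not the actual classes):

```python
from dataclasses import dataclass
from enum import Enum

class LayerType(Enum):
    # mirrors the three categories listed above
    ROOT = "Root"
    INNER_NODE = "InnerNode"
    LEAF_NODE = "Leaf"

@dataclass
class NodeSchema:
    # simplified model: a layer type, its distance cost, an optional prefix
    layer_type: LayerType
    cost: int
    prefix: str = ""

# schemas are kept in depth order, top to bottom, as NodeSchemaManager does
schemas = [
    NodeSchema(LayerType.ROOT, 1),
    NodeSchema(LayerType.INNER_NODE, 1, prefix="rack"),
    NodeSchema(LayerType.LEAF_NODE, 0),
]

def max_depth(schema_list):
    """The maximum topology depth is simply the number of layers."""
    return len(schema_list)
```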

The topology is determined by an external schema file passed in. The following is a topology file with a custom depth of 4, equivalent to HDFS's NetworkTopologyWithNodeGroup class.

<?xml version="1.0"?>
<configuration>
    <layoutversion>1</layoutversion>
    <layers>
        <layer id="datacenter">
            <prefix></prefix>
            <cost>1</cost>
            <type>Root</type>
        </layer>
        <layer id="rack">
            <prefix>rack</prefix>
            <cost>1</cost>
            <type>InnerNode</type>
            <default>/default-rack</default>
        </layer>
        <layer id="node">
            <prefix></prefix>
            <cost>0</cost>
            <type>Leaf</type>
        </layer>
    </layers>
    <topology>
        <path>/datacenter/rack/node</path>
        <!-- When this field is true, each InnerNode layer should has its prefix defined with not empty value,
         otherwise the content is not valid. Default value is false.
         -->
        <enforceprefix>false</enforceprefix>
    </topology>
</configuration>
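To see what a schema file like the one above boils down to, here is a minimal parser sketch (our own illustration using Python's standard library, not Ozone's schema loader); the file content is embedded as a string to keep the example self-contained:

```python
import xml.etree.ElementTree as ET

# Condensed copy of the schema file above, embedded for the example.
SCHEMA_XML = """<?xml version="1.0"?>
<configuration>
    <layoutversion>1</layoutversion>
    <layers>
        <layer id="datacenter"><prefix></prefix><cost>1</cost><type>Root</type></layer>
        <layer id="rack"><prefix>rack</prefix><cost>1</cost><type>InnerNode</type></layer>
        <layer id="node"><prefix></prefix><cost>0</cost><type>Leaf</type></layer>
    </layers>
    <topology>
        <path>/datacenter/rack/node</path>
        <enforceprefix>false</enforceprefix>
    </topology>
</configuration>"""

def load_layers(xml_text):
    """Read the layers out of a topology schema file, in depth order."""
    root = ET.fromstring(xml_text)
    layers = {}
    for layer in root.find("layers"):
        layers[layer.get("id")] = {
            "cost": int(layer.findtext("cost")),
            "type": layer.findtext("type"),
        }
    # the <path> element fixes the depth order of the layers
    order = root.findtext("topology/path").strip("/").split("/")
    return [dict(layers[name], id=name) for name in order]
```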

We then pair this with a topology mapping that specifies the topology location of each node; under the schema model above, the layer depth is 4. What cost to configure for each layer depends on the actual physical overhead.

 hostrack file:
 xx(lyq-s2xx)    /dc1/rack1
 xx(lyq-s4xx)    /dc1/rack2
 xx(lyq-s3xx)    /dc2/rack1
 xx(lyq-s1xx)    /dc2/rack2

Here dc corresponds to the group level above the rack. Then configure the Ozone topology configuration items:

<property>
   <name>ozone.scm.network.topology.schema.file</name>
   <value>/home/hdfs/apache/ozone/etc/hadoop/ozone-topology-default.xml</value>
</property>

<property>
  <name>net.topology.node.switch.mapping.impl</name>
  <value>org.apache.hadoop.net.ScriptBasedMapping</value>
</property>

<property>
  <name>net.topology.script.file.name</name>
  <value>/home/hdfs/apache/ozone/etc/hadoop/topology.py</value>
</property>

The Python script that parses the mapping:

import sys

# net.topology.script.file.name passes node addresses as arguments;
# print one topology location per argument, falling back to /default-rack.
sys.argv.pop(0)
rack_dic = {}
try:
    with open("/home/hdfs/apache/ozone/etc/hadoop/hostrack") as f:
        for line in f:
            (key, val) = line.split()
            rack_dic[key] = val
    for ip in sys.argv:
        print(rack_dic[ip])
except Exception:
    print("/default-rack")

Once configured, we restart the SCM and then execute the printTopology command; we can see that each node has loaded its latest topology location:

[hdfs@lyq hadoop]$ ~/apache/ozone/bin/ozone scmcli printTopology
State = HEALTHY
 xx(lyq-s2xx)    /dc1/rack1
 xx(lyq-s4xx)    /dc1/rack2
 xx(lyq-s3xx)    /dc2/rack1
 xx(lyq-s1xx)    /dc2/rack2

If the user configures a topology mapping whose depth does not match the schema, node registration will fail with an error similar to the following:

2019-12-18 07:56:58,579 WARN org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9861, call Call#9916 Retry#0 org.apache.hadoop.ozone.protocol.StorageContainerDatanodeProtocol.submitRequest from xx.xx.xx.xx:40568
org.apache.hadoop.hdds.scm.net.NetworkTopology$InvalidTopologyException: Failed to add /dc2/rack2/0d98dfab-9d34-46c3-93fd-6b64b65ff543: Its path depth is not 3
	at org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.add(NetworkTopologyImpl.java:100)
	at org.apache.hadoop.hdds.scm.node.SCMNodeManager.register(SCMNodeManager.java:263)
	at org.apache.hadoop.hdds.scm.server.SCMDatanodeProtocolServer.register(SCMDatanodeProtocolServer.java:225)
	at org.apache.hadoop.ozone.protocolPB.StorageContainerDatanodeProtocolServerSideTranslatorPB.register(StorageContainerDatanodeProtocolServerSideTranslatorPB.java:69)
	at org.apache.hadoop.ozone.protocolPB.StorageContainerDatanodeProtocolServerSideTranslatorPB.processMessage(StorageContainerDatanodeProtocolServerSideTranslatorPB.java:104)
	at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
	at org.apache.hadoop.ozone.protocolPB.StorageContainerDatanodeProtocolServerSideTranslatorPB.submitRequest(StorageContainerDatanodeProtocolServerSideTranslatorPB.java:77)
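The check behind this exception can be sketched as follows (a simplified reimplementation for illustration, not the NetworkTopologyImpl source; the expected depth would come from the loaded schema):

```python
# Simplified version of the depth check that raises the exception above:
# a node's full topology path must match the depth the schema expects.
class InvalidTopologyError(Exception):
    pass

def validate_node_path(path, expected_depth):
    """Reject a registration path whose depth differs from the schema's."""
    depth = len(path.strip("/").split("/"))
    if depth != expected_depth:
        raise InvalidTopologyError(
            "Failed to add %s: Its path depth is not %d" % (path, expected_depth))
    return True
```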

Applications of Topology Awareness in Ozone


After Ozone topology awareness is configured and in use, in what areas does it take effect? Mainly the following two:

  • First, when doing Container replication, nodes that satisfy the topology-awareness rules are selected for Container-level replication.
  • Second, when a Container Pipeline is chosen, the nodes that make up the Pipeline must satisfy the topology-awareness rules.
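As a toy illustration of the kind of rule these two cases enforce (our own sketch, not Ozone's actual placement policy classes), the following picks three nodes for a pipeline so that the replicas span exactly two racks where possible:

```python
from itertools import combinations

# Toy topology-aware selection (hypothetical helpers, not Ozone code):
# choose 3 nodes that do not all share one rack, preferring the classic
# two-on-one-rack, one-on-another spread.
def rack_of(node_path):
    # "/dc1/rack1/n1" -> "/dc1/rack1"
    return node_path.rsplit("/", 1)[0]

def choose_pipeline(nodes, size=3):
    for combo in combinations(nodes, size):
        if len({rack_of(n) for n in combo}) == 2:  # ideal spread: 2 racks
            return list(combo)
    # fall back to any combination spanning more than one rack
    for combo in combinations(nodes, size):
        if len({rack_of(n) for n in combo}) > 1:
            return list(combo)
    return None
```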

That is all for today's discussion of Topology Awareness within the Ozone system: similar to Topology Awareness in HDFS, but a great deal more flexible.

References


[1]. Support Topology Awareness for Ozone. https://issues.apache.org/jira/browse/HDDS-698


Origin blog.csdn.net/Androidlushangderen/article/details/103658295