Why Should HBase RegionServer & Hadoop DataNode Colocate?

Some basic background information first: HBase is a distributed NoSQL database whose slave (worker) nodes are called "RegionServers"; all data reading, writing, and scanning workloads run on these RegionServers. On the other hand, as a member of the Hadoop family, HBase does not reinvent data storage: it works on HDFS directly. More precisely, the underlying data storage of a RegionServer is a "DataNode"; see the following diagram.

[Diagram: a RegionServer layered on top of HDFS, with its reads and writes served by a colocated DataNode]

The diagram above was drawn by me. Not 100% of a RegionServer's data is read locally, but HBase keeps the local fraction as high as it can; you can also search for this yourself. Here is a reference document on HBase architecture, and the first chart in it is worth a look: An In-Depth Look at the HBase Architecture
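To see this locality for yourself, here is a minimal sketch using the plain HDFS client API, assuming a hypothetical HFile path under /hbase/data/... (the class and path names are made up for illustration). HDFS reports which DataNodes hold a replica of each block; if the local hostname appears in that list, the RegionServer on this node can read the block without crossing the network.

```java
import java.net.InetAddress;
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckBlockLocality {
    public static void main(String[] args) throws Exception {
        // Picks up core-site.xml / hdfs-site.xml from the classpath.
        FileSystem fs = FileSystem.get(new Configuration());

        // Hypothetical path to one HFile of table 't1', column family 'cf'.
        Path hfile = new Path("/hbase/data/default/t1/region-id/cf/hfile-name");

        FileStatus status = fs.getFileStatus(hfile);
        BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
        String localHost = InetAddress.getLocalHost().getHostName();

        for (BlockLocation block : blocks) {
            boolean local = Arrays.asList(block.getHosts()).contains(localHost);
            System.out.printf("offset=%d hosts=%s local=%b%n",
                    block.getOffset(), Arrays.toString(block.getHosts()), local);
        }
        fs.close();
    }
}
```

Short-circuit reads build on exactly this situation: when a replica is on the local node, the client can read the block file straight from local disk instead of going through the network path.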

"RegionServer" and "DataNode" are not physical concepts; they are just services, and any server running them can be called a "RegionServer" or a "DataNode". To draw an analogy between HBase and MySQL (or any other RDBMS): the RegionServer corresponds to the MySQL process, and the DataNode corresponds to ext4, NTFS, or any other file system. All physical files storing HBase data live on DataNodes. So it is easy to see that in an HDFS/HBase cluster, each physical node has to install and start both a RegionServer and a DataNode; in other words, RegionServer and DataNode always co-exist on all slave nodes of a Hadoop cluster. This is also the infrastructure architecture that SOH follows. Please also check out this thread on the same topic: Should the HBase region server and Hadoop data node be on the same machine?
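Those "physical files" are easy to inspect. Here is a minimal sketch, assuming hbase.rootdir points at /hbase on an HDFS you can reach (a common but not universal setup); it simply lists what HBase keeps there, which is exactly the data the DataNodes hold on disk.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListHBaseRootDir {
    public static void main(String[] args) throws Exception {
        // Picks up core-site.xml / hdfs-site.xml from the classpath.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Assumed value of hbase.rootdir; check hbase-site.xml for the real one.
        Path hbaseRoot = new Path("/hbase");

        for (FileStatus status : fs.listStatus(hbaseRoot)) {
            System.out.println(status.getPath() + "  " + status.getLen() + " bytes");
        }
        fs.close();
    }
}
```

Table data, write-ahead logs, and metadata all show up as ordinary HDFS paths: HBase brings the database layer, HDFS brings the disks.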

If an existing Hadoop cluster with dozens of nodes, or perhaps more than a hundred, wants to allocate several instances as dedicated HBase nodes, it will run into the following trouble:

[Diagram: dedicated HBase RegionServer nodes separated from the DataNodes, so reads and writes cross the network]

The diagram makes the trouble clear: data storage is not colocated with data processing, so every read, write, and scan has to go over the network, across multiple servers.

This is exactly like installing MySQL on a server but making it read and write its data not from a local disk, but from a REMOTE SHARED FOLDER on another server. We all know how slow it is to open a file from a remote shared folder, so imagine how terrible it would be to read and write data at the 10 TB level this way. And worse, it has to be real time.

It seems there are 2 possible improvements:

  1. Let the RegionServer nodes be DataNodes too.

  2. Let all DataNodes be RegionServers too.

Option 1 does not really help. Say the existing Hadoop cluster has 100 instances: 95 are pure DataNodes and 5 are DataNode + RegionServer. Then only about 5% of the data can be read and written locally; the other 95% still comes from remote instances, because, to avoid data skew, HDFS has to distribute data evenly across all 100 DataNodes.
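To put rough numbers on this, here is a back-of-the-envelope sketch, assuming blocks and their replicas are spread uniformly over all DataNodes (real HDFS placement also considers racks, so treat the figures as illustrative only):

```java
public class LocalityEstimate {
    public static void main(String[] args) {
        int totalDataNodes = 100;   // size of the existing cluster (from the example above)
        int regionServerNodes = 5;  // nodes that also run a RegionServer
        int replication = 3;        // typical HDFS replication factor

        // Share of all stored data that sits on RegionServer nodes at all.
        double shareOnRsNodes = (double) regionServerNodes / totalDataNodes;

        // Chance that a block served by one particular RegionServer happens to have
        // a replica on that same node (uniform placement, one replica per node).
        double pLocalReplica = (double) replication / totalDataNodes;

        System.out.printf("Data stored on RegionServer nodes: %.0f%%%n", shareOnRsNodes * 100);
        System.out.printf("Blocks a given RegionServer can read locally: ~%.0f%%%n",
                pLocalReplica * 100);
    }
}
```

Even counting all three replicas, a given RegionServer can expect to find only a few percent of the blocks it serves on its own disks.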

Option 2 is, first of all, a big action and a big decision for the existing Hadoop cluster, since it is an architecture-level change. And even if it could be done, the actual result would still be disappointing. Here is the reason:

For almost the same reason that a RegionServer needs to co-exist with a DataNode, a YARN NodeManager also always co-exists with a DataNode: the NodeManager is in charge of running M/R jobs, and it also tries its best to read and write data locally. Jobs running on YARN can have their hardware resources allocated and coordinated by YARN, but HBase cannot. From the resource-allocation point of view, HBase and YARN are competitors, so we rarely see HBase and YARN installed on the same instances in a production environment. This is not the fault of HBase or YARN; they do totally different jobs, and it is hard to allocate appropriate, balanced resources for both of them.
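As a rough illustration of that last point, the sketch below uses hypothetical numbers to show the static carve-up a colocated node would force on you: the RegionServer heap (HBASE_HEAPSIZE in hbase-env.sh) and the NodeManager container budget (yarn.nodemanager.resource.memory-mb) are both fixed up front, so neither side can borrow memory from the other when its own workload spikes.

```java
public class StaticMemorySplit {
    public static void main(String[] args) {
        int nodeMemoryMb = 128 * 1024;       // hypothetical 128 GB node
        int osAndDataNodeMb = 16 * 1024;     // reserve for OS, DataNode, and other daemons
        int regionServerHeapMb = 48 * 1024;  // fixed RegionServer heap (HBASE_HEAPSIZE)

        // Whatever is left becomes the NodeManager's container budget.
        int yarnContainerBudgetMb = nodeMemoryMb - osAndDataNodeMb - regionServerHeapMb;

        System.out.println("RegionServer heap (static): " + regionServerHeapMb + " MB");
        System.out.println("yarn.nodemanager.resource.memory-mb: " + yarnContainerBudgetMb + " MB");
        // Whichever split you pick, one side is over-provisioned whenever the other is busy.
    }
}
```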

