[Translation] Selecting the Right Hardware for Your New Hadoop Cluster (Part 3)

Continued from the previous post: https://my.oschina.net/u/234661/blog/855913

Other Considerations

It is important to remember that the Hadoop ecosystem is designed with a parallel environment in mind. When purchasing processors, we do not recommend getting the highest-GHz chips, which draw high wattage (130W+). This causes two problems: higher power consumption and greater heat output. The mid-range models tend to offer the best bang for the buck in terms of GHz, price, and core count.

When we encounter applications that produce large amounts of intermediate data (output on the same order of magnitude as the amount read in), we recommend two ports on a single Ethernet card, or two channel-bonded Ethernet cards, to provide 2 Gbps per machine. Bonded 2 Gbps is tolerable for up to about 12TB of data per node. Once you move above 12TB per node, you will want to move to bonded 4 Gbps (4 x 1 Gbps). Alternatively, for customers that have already moved to 10 Gigabit Ethernet or InfiniBand, those solutions can be used to address network-bound workloads. Confirm that your operating system and BIOS are compatible if you are considering switching to 10 Gigabit Ethernet.
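As a rough illustration of the thresholds above, here is a small Python sketch that maps per-node data capacity to a bonded-Ethernet recommendation. The 12TB cut-off and the 2/4 Gbps figures come from the article; the function name and structure are my own.

```python
def recommend_network(data_per_node_tb: float) -> str:
    """Return a rough network recommendation for a shuffle-heavy workload."""
    if data_per_node_tb <= 12:
        return "2 Gbps bonded (2 x 1 GbE ports)"
    return "4 Gbps bonded (4 x 1 GbE), or 10 GbE / InfiniBand if available"

if __name__ == "__main__":
    for tb in (8, 12, 24):
        print(f"{tb} TB per node -> {recommend_network(tb)}")
```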

When computing memory requirements, remember that Java uses up to 10 percent of it for managing the virtual machine. We recommend configuring Hadoop with strict heap-size restrictions in order to avoid swapping memory to disk. Swapping greatly impacts MapReduce job performance and can be avoided by configuring machines with more RAM, as well as by setting appropriate kernel parameters on most Linux distributions.
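The arithmetic behind that advice can be sketched as follows. This is an illustrative calculation under assumed numbers (a 4GB OS reservation and the ~10% JVM management overhead mentioned above), not a Cloudera formula; the kernel parameter typically tuned on Linux to discourage swapping is vm.swappiness.

```python
def usable_task_heap_gb(ram_gb: float, os_reserved_gb: float = 4.0,
                        jvm_overhead: float = 0.10) -> float:
    """Total heap (GB) that can be handed to Hadoop JVMs without swapping."""
    available = ram_gb - os_reserved_gb
    # Each JVM needs heap * (1 + overhead) of real memory, so divide it out.
    return available / (1.0 + jvm_overhead)

print(round(usable_task_heap_gb(64), 1))   # e.g. a 64 GB node -> ~54.5 GB of heap
```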

It is also important to optimize RAM for the memory channel width. For example, when using dual-channel memory, each machine should be configured with pairs of DIMMs. With triple-channel memory, each machine should have triplets of DIMMs. Similarly, quad-channel memory should be populated in groups of four DIMMs.
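A minimal sketch of that channel-matching rule, assuming you simply round the DIMM count up to a multiple of the channel count so every channel stays populated:

```python
import math

def dimm_count(target_gb: int, dimm_gb: int, channels: int) -> int:
    """Smallest DIMM count reaching target_gb that fills channels evenly."""
    needed = math.ceil(target_gb / dimm_gb)
    # Round up to the next multiple of the channel count (pairs, triplets, ...).
    return math.ceil(needed / channels) * channels

print(dimm_count(96, 16, 3))   # triple-channel, 16 GB DIMMs -> 6 DIMMs (96 GB)
```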

Not Just MapReduce

Hadoop is far bigger than HDFS and MapReduce; it is an all-encompassing data platform. For that reason, CDH includes many different ecosystem products (and, in fact, is rarely used solely for MapReduce). Additional software components to consider when sizing your cluster include Apache HBase, Cloudera Impala, and Cloudera Search. They should all be run on the DataNodes to maintain data locality.

HBase is a reliable, column-oriented data store that provides consistent, low-latency, random read/write access. Cloudera Search addresses the need for full-text search on content stored in CDH, simplifying access for new types of users while also opening the door to new types of data storage inside Hadoop. Cloudera Search is based on Apache Lucene/Solr Cloud and Apache Tika, and it extends valuable functionality and flexibility for search through its wider integration with CDH. The Apache-licensed Impala project brings scalable parallel database technology to Hadoop, enabling users to issue low-latency SQL queries against data stored in HDFS and HBase without requiring data movement or transformation.

HBase users should be aware of heap-size limits due to garbage collector (GC) timeouts; other JVM-based column stores face the same issue. Thus, we recommend a maximum of roughly 16GB of heap per RegionServer. HBase does not require many other resources to run on top of Hadoop, but to maintain real-time SLAs you should use schedulers, such as the Fair and Capacity Schedulers, along with Linux cgroups.
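As a trivial guard for that guideline, here is a sketch (my own helper, not an HBase setting) that clamps a proposed RegionServer heap to the ~16GB ceiling; in practice the heap itself is configured through the RegionServer JVM's -Xmx option.

```python
REGIONSERVER_HEAP_CAP_GB = 16

def region_server_heap_gb(requested_gb: int) -> int:
    """Return the heap to configure, never exceeding the recommended cap."""
    return min(requested_gb, REGIONSERVER_HEAP_CAP_GB)

print(region_server_heap_gb(24))   # -> 16
```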

Impala uses memory for most of its functionality, consuming up to 80 percent of available RAM resources under default configurations, so we recommend at least 96GB of RAM per node. Users who run Impala alongside MapReduce should consult our recommendations in "Configuring Impala and MapReduce for Multi-tenant Performance." It is also possible to specify a per-process or per-query memory limit for Impala.
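To make the 80-percent figure concrete, here is a sketch of the sizing arithmetic (the helper names are mine); the per-process limit mentioned above is usually applied via impalad's --mem_limit setting.

```python
def impala_default_use_gb(ram_gb: float) -> float:
    """Memory Impala could consume by default (~80% of node RAM)."""
    return 0.8 * ram_gb

def impala_mem_limit_gb(ram_gb: float, other_services_gb: float) -> float:
    """A per-process limit that leaves room for co-located services."""
    return max(ram_gb - other_services_gb, 0.0)

print(round(impala_default_use_gb(96), 1))   # -> 76.8 GB under defaults
print(impala_mem_limit_gb(96, 40))           # -> 56.0 GB if 40 GB is reserved
```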

Search is the most interesting component to size. The recommended sizing exercise is to purchase one node, install Solr and Lucene, and load your documents. Once the documents are indexed and searched in the desired manner, scalability comes into play. Keep loading documents until the indexing and query latency exceed the values acceptable to the project; this gives you a baseline for the maximum number of documents per node based on available resources, and a baseline node count that does not include the desired replication factor.
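The node-count baseline described above reduces to simple arithmetic. The sketch below (names are mine) assumes you have already measured how many documents a single test node can index and serve within acceptable latency:

```python
import math

def search_node_count(total_docs: int, docs_per_node: int,
                      replication: int = 1) -> int:
    """Nodes needed to hold total_docs at the measured per-node baseline."""
    base_nodes = math.ceil(total_docs / docs_per_node)
    return base_nodes * replication

print(search_node_count(300_000_000, 50_000_000, replication=2))   # -> 12
```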

Conclusion

Purchasing appropriate hardware for a Hadoop cluster requires benchmarking and careful planning to fully understand the workload. However, Hadoop clusters are commonly heterogeneous, and Cloudera recommends deploying initial hardware with balanced specifications when getting started. It is important to remember that when using multiple ecosystem components, resource usage will vary, and focusing on resource management will be your key to success.

We encourage you to chime in about your experience configuring production Hadoop clusters in the comments!

Kevin O’Dell is a Systems Engineer at Cloudera.

Source: my.oschina.net/u/234661/blog/856011