How to deploy a CDH big data environment like a boss (with personal opinions)

Translated from: How-to: Deploy Apache Hadoop Clusters Like a Boss
source article link: https://blog.cloudera.com/how-to-deploy-apache-hadoop-clusters-like-a-boss/

Introduction

       Although this is a blog post from 2015, I still refer to it for the basic considerations when setting up a big data environment, and I personally think it is very good. I couldn't find a translated version on Baidu, so I decided to write one based on my own experience; for the authoritative details, see the original English version at the link above.
       This article does not walk through building a big data environment step by step. Instead, it focuses on some important system and hardware optimizations that can save you from having to fix problems later. It is suitable for readers who have just finished building a cluster or are about to.

Learn how to set up a Hadoop cluster in a way that maximizes your chances of success in production and minimizes ongoing, long-term adjustments.

       Previously, we published some recommendations on selecting new hardware for Apache Hadoop deployments. That article covered some important ideas about cluster planning and deployment, such as workload analysis and general recommendations on CPU, disk, and memory allocation. In this article, we provide some best practices and guidelines for the next part of the implementation process: configuring the machines once they arrive. Between these two articles, you will have a good starting point for taking Hadoop into production at scale.
       Specifically, we will cover some of the important decisions you must make to ensure that the network, disks, and hosts are configured properly. We will also explain how to lay out disks and services so that they are used efficiently and problems are minimized as the data set grows.

Networking: May all your SYNs be forgiven

       Host name resolution: DNS and FQDNs
       Hadoop Java processes such as the DataNode obtain the host name of the host they are running on, and then perform a lookup to determine its IP address. They then use this IP to determine the canonical name stored in DNS or locally in /etc/hosts. Each host must be able to perform a forward lookup on its own host name and a reverse lookup on its own IP address. In addition, every host in the cluster needs to be able to resolve every other host. You can use the Linux host command to verify that forward and reverse lookups are configured correctly.

$ host `hostname`
bp101.cloudera.com has address 10.20.195.121
$ host 10.20.195.121
121.195.20.10.in-addr.arpa domain name pointer bp101.cloudera.com

Cloudera Manager uses a simple Python command to test name resolution:

$ python -c 'import socket; print socket.getfqdn(), socket.gethostbyname(socket.getfqdn())'

       Although it is tempting to rely on /etc/hosts for this step, we recommend using DNS instead. DNS is less error-prone than the hosts file and makes changes easier to roll out. Host names should be set to fully qualified domain names (FQDNs). Note that using FQDNs is required for Kerberos and important for enabling security features such as TLS encryption. You can verify with the following command:

$ hostname --fqdn
bp101.cloudera.com

       If you do use the /etc/hosts file, make sure the entries are laid out in the proper order (FQDN first, then aliases):

192.168.2.1 bp101.cloudera.com bp101 master1
192.168.2.2 bp102.cloudera.com bp102 master2

Name server cache

       Hadoop makes extensive use of network-based services such as DNS, NIS, and LDAP. To help weather network hiccups, reduce pressure on shared infrastructure, and improve name resolution latency, it can be helpful to enable the name server cache daemon (nscd). nscd caches the results of both local and remote calls in memory, often avoiding a network round trip. In most cases you can enable nscd, let it do its work, and then leave it alone. If you are running Red Hat SSSD, you will need to modify the nscd configuration: with SSSD enabled, do not use nscd to cache passwd, group, or netgroup information.
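For example, a minimal sketch of /etc/nscd.conf on a host where SSSD handles users and groups might look like the following (the exact file layout varies by distribution; this is an illustration, not a value from the original article):

enable-cache    passwd      no     # let SSSD handle users
enable-cache    group       no     # let SSSD handle groups
enable-cache    netgroup    no     # let SSSD handle netgroups
enable-cache    hosts       yes    # keep caching DNS lookups

$ systemctl restart nscd
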
NIC link aggregation (LINK AGGREGATION)
       Also called NIC bonding or NIC teaming, this refers to combining network interfaces to improve throughput or redundancy. The exact settings will depend on your environment.
       There are many different ways to bond interfaces. Generally we recommend bonding for throughput rather than availability, but this trade-off will depend heavily on the number of interfaces and internal network policies. NIC bonding is one of the leading sources of misconfiguration that Cloudera Support sees. We generally recommend bringing up the cluster and verifying that everything works before enabling bonding; that makes it much easier to isolate any issues you run into. A sample bonding configuration is sketched below.
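As a concrete illustration, here is a minimal sketch of an LACP (mode 802.3ad) bond on RHEL/CentOS using the classic network scripts; the interface names, IP address, and bonding options are assumptions for the example, not values from the original article:

# /etc/sysconfig/network-scripts/ifcfg-bond0 (hypothetical)
DEVICE=bond0
TYPE=Bond
BONDING_MASTER=yes
BOOTPROTO=none
IPADDR=192.168.2.11
PREFIX=24
ONBOOT=yes
BONDING_OPTS="mode=802.3ad miimon=100"

# /etc/sysconfig/network-scripts/ifcfg-eth0 (repeat for each slave interface)
DEVICE=eth0
TYPE=Ethernet
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes

Mode 802.3ad requires matching LACP configuration on the switch; after restarting the network, you can verify the bond with cat /proc/net/bonding/bond0.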

Personal opinion: If you run your own server room, don't worry too much about the cost of an extra switch port or two. Link aggregation is still very effective for throughput, especially now that data volumes keep growing; don't let switch bandwidth throttle the 10 GbE ports on your physical machines.

Virtual Local Area Network (VLAN)

       VLANs are not required, but from a network perspective they can make things easier. We recommend moving to dedicated switching infrastructure for production deployments, as much for the benefit of other traffic on the network as anything else. Then make sure all Hadoop traffic is on one VLAN for ease of troubleshooting and isolation.

Personal opinion: If you run your own IDC, remember to be rack-aware when placing hosts, spread the master nodes across racks as much as possible, and make the VLAN generously sized. Within a single cluster there is really no need to subdivide it too finely.

Operating System (OS)

       Cloudera Manager does a good job of identifying known and common problems in operating system configuration, but please double check the following:

Firewall (IPTABLES)

       Some customers disable IPTABLES completely in their initial cluster setup. From a management point of view this makes things easier, of course, but it also brings some risk. Depending on how sensitive the data in your cluster is, you may want to keep IPTABLES enabled. Hadoop's many ecosystem components communicate over a lot of ports, but our documentation will help you configure them.
Official CDH port reference: https://docs.cloudera.com/documentation/manager/5-0-x/Cloudera-Manager-Installation-Guide/cm5ig_config_ports.html

Personal opinion: iptables, or the default firewalld on CentOS 7, should still be left on; better safe than sorry. If you are really lazy, you can set a policy that opens all ports to the intranet, for example as sketched below.
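For example, on CentOS 7 with firewalld, a lazy-but-workable sketch is to trust the cluster's intranet subnet (the subnet below is just an example; adjust it for your own network):

$ firewall-cmd --permanent --zone=trusted --add-source=192.168.2.0/24
$ firewall-cmd --reload
$ firewall-cmd --zone=trusted --list-sources    # verify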

SELINUX

       Building an SELinux policy that governs all the different components of the Hadoop ecosystem is a challenge, so most of our customers run with SELinux disabled. If you are interested in running SELinux, make sure you are on a supported OS version, and we recommend only enabling permissive mode initially so that you can capture the output and use it to define a policy that meets your needs.
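A minimal sketch of switching to permissive mode, so violations are logged instead of blocked (standard SELinux commands, nothing CDH-specific):

$ setenforce 0                                                             # permissive until the next reboot
$ sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config    # persist across reboots
$ getenforce                                                               # should now report Permissive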

Personal opinion: SELinux is too much trouble; just turn it off and disable it.

Swap memory (SWAPPINESS)

       The traditional recommendation for worker nodes is to set swappiness (vm.swappiness) to 0. However, this behavior has changed in newer kernels, and we now recommend setting it to 1. (This article has more details.)

$ sysctl vm.swappiness=1
$ echo "vm.swappiness = 1" >> /etc/sysctl.conf

System LIMITS settings

       The default file handle limit (i.e. ulimit) of 1024 on most distributions is probably not set high enough. Cloudera Manager will take care of this, but if you are not running Cloudera Manager, be aware of this fact. Cloudera Manager will not alter users' limits outside of Hadoop's default limits. Nevertheless, raising the global limit to 64k is still beneficial.
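A sketch of raising the global limits to 64k in /etc/security/limits.conf (the values simply follow the 64k suggestion above; some distributions also keep overrides in /etc/security/limits.d/):

*    soft    nofile    65536
*    hard    nofile    65536
*    soft    nproc     65536
*    hard    nproc     65536

Log out and back in, then check with ulimit -n.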

Transparent huge page (THP)

       Most Linux platforms supported by CDH 5 include a feature called transparent huge page compaction, which interacts poorly with Hadoop workloads and can seriously degrade performance. Red Hat claims the bug was fixed in versions after 6.4, but residual effects can still cause performance problems. We recommend disabling defragmentation until further testing can be done.
The defrag_file_pathname for different systems is:
Red Hat / CentOS: /sys/kernel/mm/redhat_transparent_hugepage/defrag
Ubuntu / Debian, OEL, SLES: /sys/kernel/mm/transparent_hugepage/defrag

$ echo 'never' > defrag_file_pathname

Remember to also add this command to /etc/rc.local, otherwise the setting will be lost after a reboot, which would be embarrassing.
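A sketch of persisting the setting (use the defrag_file_pathname for your distribution from the list above):

$ cat >> /etc/rc.local <<'EOF'
echo never > /sys/kernel/mm/transparent_hugepage/defrag
EOF
$ chmod +x /etc/rc.local    # on CentOS 7, rc.local must be executable to run at boot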


       Time: every server must have the NTP service installed, running, and verified, and the time zone should be consistent across the cluster.
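A minimal sketch for CentOS 7 (chrony is an equally valid alternative):

$ yum install -y ntp
$ systemctl enable ntpd
$ systemctl start ntpd
$ ntpq -p    # verify that the peers are reachable and the offset is small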

Storage

       Properly configuring the cluster's storage is one of the most important initial steps. Failure to do this properly will cause pain, because changing the configuration can be devastating and usually requires a complete redo of the current storage layer.

Operating system, log drive and data drive

       A typical 2U machine comes with 16 to 24 drive bays for dedicated data drives, plus some drives (usually two) dedicated to the operating system and logs. Hadoop was designed with a simple principle in mind: "hardware fails." As such, it can withstand disk, node, and even rack failures. (This principle really starts to shine at massive scale, but let's face it: if you are reading this blog, you are probably not at Google or Facebook.)
       Even at normal human scale (fewer than 4,000 nodes), Hadoop still survives hardware failures like a boss, but it is reasonable to add some extra redundancy to reduce the impact of those failures. As a general guideline, we recommend RAID-1 (mirroring) for the OS drives (that is, the disks holding the system root partition) so that a data node keeps ticking a little longer if it loses an OS drive. Although this step is not absolutely necessary, in smaller clusters the loss of a node can mean a noticeable reduction in computing capacity.
       The other drives should be configured as JBOD ("just a bunch of disks"), each with an individually mounted ext4 partition, on systems running RHEL 6+, Debian 7.x, or SLES 11+. On some hardware profiles where a RAID controller is mandatory for that particular machine, individual RAID-0 volumes must be used instead; this has the same effect as mounting the drives as individual spindles.
       There are some mount options that can be useful. These are covered well elsewhere (in the book Hadoop Operations and by Alex Moundalexis), so they only get a brief mention here.

Root directory reserved space requirements

       By default, both ext3 and ext4 reserve 5% of the blocks on a given filesystem for the root user. HDFS data directories do not need this reservation, and you can reduce it to zero either when creating the partition (with mkfs) or afterwards (with tune2fs).

Personal opinion: I don't recommend reducing it to 0; it isn't necessary. If space is allocated sensibly, the root filesystem actually needs very little; if the non-master nodes don't store data and only hold programs, even an ordinary 300 GB disk is more than enough. Out of respect for the original, the commands are reproduced below.

$ mkfs.ext4 -m 0 /dev/sdb1
$ tune2fs -m 0 /dev/sdb1

File system access time configuration

       The Linux filesystem maintains metadata that records the last time each file was accessed, so even reads result in writes to disk. This timestamp is called atime (access time) and should be disabled on drives configured for Hadoop. It is set through a mount option in /etc/fstab:

/dev/sdb1 /data1    ext4    defaults,noatime       0 0

After the configuration is complete, it can take effect without a reboot; just remount:

mount -o remount /data1
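To confirm that the option took effect, check the active mount options:

$ mount | grep /data1    # the options listed should now include noatime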

Directory permissions

       This is only a minor point, but before mounting the data drives you should consider changing the permissions of the mount-point directories to 700. Then, if a drive becomes unmounted, processes writing to those directories will not fill up the OS partition instead.
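For example (the mount point names are illustrative):

$ chmod 700 /data1 /data2 /data3    # run before mounting the data drives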

LVM, RAID, or JBOD (three ways of presenting the disks)

       We are often asked whether JBOD, RAID, or LVM configurations are needed. The entire Hadoop ecosystem was created with JBOD configurations in mind. HDFS is an immutable filesystem designed for large files and long sequential reads. This goal works well with standalone SATA drives, which deliver their best performance with sequential reads. In short, RAID is usually used to add redundancy to existing systems, whereas HDFS already has redundancy built in. In fact, using a RAID system with Hadoop can negatively affect performance.
       Both RAID-5 and RAID-6 add parity bits to the RAID stripes. These parity bits must be written and read during standard operations and add significant overhead. Standalone SATA drives write and read sequentially without worrying about parity bits, because there are none. By contrast, HDFS takes advantage of having multiple separate mount points and can tolerate individual drives/volumes failing before the node fails, which is part of how HDFS achieves its parallel I/O. Setting the drives up in a RAID-5 or RAID-6 array creates a single array, or a few very large arrays, of mount points depending on the drive configuration. These RAID arrays undermine the data protection that HDFS provides natively, slow down sequential reads, and hurt the data locality of Map tasks.
RAID arrays also affect other systems that expect a large number of mount points. For example, Impala spins up a thread per spindle in the system, and it performs better in a JBOD environment than with one large RAID group. For the same reasons, configuring the Hadoop drives under LVM is neither necessary nor recommended.

Personal opinion: In short, apart from RAID-1 for the system disks, use JBOD wherever possible. If that is not possible, use single-disk RAID-0 volumes, and never LVM!!!

Hybrid hardware deployment

       Many customers purchase new hardware regularly; as data volumes and workloads grow, it makes sense to add new generations of computing resources. For environments with heterogeneous disk, memory, or CPU configurations, Cloudera Manager provides role groups. Administrators can use role groups to specify memory, YARN container, and Cgroup settings per node or per group of nodes.
       Although Hadoop can certainly run on mixed hardware specs, we recommend keeping the worker node configurations as uniform as possible. In a distributed computing environment, workloads are spread across nodes, and local data access is what you want to optimize for. Nodes with fewer computing resources can become bottlenecks, and running on mixed hardware configurations can lead to wider variance in SLA windows. There are several things to consider:

  • Mixed hard disk configurations - By default, HDFS places blocks across all directories specified in dfs.data.dir in a round-robin fashion. For example, if a node has six 1.2TB drives and six 600GB drives, the smaller drives will fill up faster, leading to a capacity imbalance. Using the "available space" placement policy requires extra configuration (a hedged config sketch follows this list). In this situation, I/O-bound workloads can suffer because you may end up writing to only a subset of the disks. Understand in advance what deploying drives this way means. Also, if you deploy nodes with more total storage, remember that HDFS balances by percentage of capacity used.
  • Mixed memory configurations - Mixing the amount of available memory across worker nodes can cause problems, because it does require additional configuration.
  • Mixed CPU configurations - Same idea; jobs can be limited by the slowest CPU, which effectively negates the benefit of running newer/more cores.
    It is important to understand the points above, but remember that Cloudera Manager helps you allocate resources to different hosts, allowing you to manage and optimize the configuration easily.
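For the "available space" placement policy mentioned above, here is a hedged sketch of the relevant hdfs-site.xml properties (these are standard HDFS settings; the threshold value is only an example, and with Cloudera Manager you would normally set them through the DataNode configuration rather than editing the file by hand):

<property>
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold</name>
  <value>10737418240</value> <!-- volumes within 10 GB of each other are treated as balanced -->
</property>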

    Personal opinion: Differences in hard disk configuration are not a big problem, especially while overall storage usage stays below roughly 85%. Differences in memory frequency and CPU power do create some variance, but there is no need to be picky about it; YARN jobs mostly just need to run smoothly, and as long as the system itself is healthy, a single slower core is not a big deal.

Configuring Cloudera Manager like a boss (Cloudera Manager Like A Boss)

       We strongly recommend using Cloudera Manager to manage your Hadoop cluster. Cloudera Manager provides many valuable features to make life easier. The Cloudera Manager documentation is very clear about this, but in order to eliminate any ambiguity, here are the high-level steps for a production-ready Hadoop deployment using Cloudera Manager.
1. Set up an external database and pre-create the schemas required for the deployment. (MySQL example below; remember to change the passwords.)
create database amon DEFAULT CHARACTER SET utf8;
grant all on amon.* TO 'amon'@'%' IDENTIFIED BY 'amon_password';
create database rman DEFAULT CHARACTER SET utf8;
grant all on rman.* TO 'rman'@'%' IDENTIFIED BY 'rman_password';
create database metastore DEFAULT CHARACTER SET utf8;
grant all on metastore.* TO 'metastore'@'%' IDENTIFIED BY 'metastore_password';
create database nav DEFAULT CHARACTER SET utf8;
grant all on nav.* TO 'nav'@'%' IDENTIFIED BY 'nav_password';
create database sentry DEFAULT CHARACTER SET utf8;
grant all on sentry.* TO 'sentry'@'%' IDENTIFIED BY 'sentry_password';
(Please change the passwords in the examples above!)

2. Install cloudera-manager-server and cloudera-manager-daemons packages according to the documentation.

yum install cloudera-manager-server cloudera-manager-daemons

The original article only gave the install command, which is a bit lazy. For details, it is better to read the official installation documentation first; I think it is quite thorough.
Bonus link: https://docs.cloudera.com/documentation/enterprise/release-notes/topics/rg_release_notes.html

3. Run the scm_prepare_database.sh script appropriate for your database type.

/usr/share/cmf/schema/scm_prepare_database.sh mysql -h cm-db-host.cloudera.com -utemp -ptemp --scm-host cm-db-host.cloudera.com scm scm scm

Note: This was written earlier, now the new version of CDH does not necessarily use this script

4. Start Cloudera Manager Service and follow the wizard from then on.
This is the easiest way to install Cloudera Manager, and it will get you to a production-ready deployment in about 20 minutes.
Note: The "wizard" here is the web wizard available after starting the server, on port 7180. For the specific installation steps, see the link above.
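A minimal sketch of step 4 on CentOS 7 (on older init systems use service cloudera-scm-server start instead):

$ systemctl start cloudera-scm-server
$ systemctl status cloudera-scm-server   # wait until the server reports it is up
# then open http://<cm-server-host>:7180 in a browser and follow the wizard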

Assigning roles: service layout guidelines

       For Cloudera Manager-based deployments, the following figure provides a reasonable way to deploy service roles across clusters in most configurations.
[Figure: recommended service role layout across cluster hosts (see original article)]

       In larger clusters (more than 50 nodes), it may be necessary to move to five management nodes and use dedicated nodes for the ResourceManager and NameNode pairs. It is also not uncommon to use external databases for Cloudera Manager, the Hive Metastore, and so on, and additional HiveServer2 or Hive Metastore services can be deployed as well.
       We recommend 128GB of memory per management node and 256-512GB per worker node. Memory is relatively cheap, and as computing engines increasingly rely on in-memory execution, the extra memory will be put to good use.
       For a deeper dive, the following diagram describes the appropriate disk mapping to various service storage components.
[Figures: disk mapping of storage components for management and worker nodes (see original article)]

       We specify LVM for the Cloudera Manager databases here, but RAID-0 is also an option.
Note: I don't agree with this point. The database is the last thing you want to risk; use RAID-1.
[Figure: additional disk mapping diagram (see original article)]

Conclusion
       Once you have the proper knowledge, setting up a Hadoop cluster is relatively simple. Take the extra time to buy the right infrastructure and configure it correctly from the start. Following the guidelines above gives you the best chance of success in your Hadoop deployment, and you avoid troublesome configuration debt so you can focus your time on solving real business problems, like a boss.

Personal opinion: Much of this article is worth referencing, and it is very good. Most people's clusters will never reach thousands of nodes; you can use this as a reference when making your own configuration choices.
