CDH + Kylin Trilogy, Part 1: Preparation

This article is the first in the "CDH + Kylin Trilogy". The series consists of three articles:

  1. Preparation: before setting up the CDH + Kylin environment, prepare all hardware and software resources
  2. Deployment and settings: deploy CDH and Kylin, then configure them
  3. Kylin in practice: run the official Kylin demo on the environment that was built

The content covered by the whole trilogy is shown in the figure below:
[Figure: overview of the trilogy's content]
Next, let's start with the most basic part: the preparations.

About CDH and Kylin

  1. Kylin depends on services such as Hadoop, Hive, and HBase, so it is convenient to use CDH to deploy these applications centrally. The following figure, taken from the official Kylin site, shows that CDH is supported:
    [Figure: Kylin's official statement of CDH support]
  2. The official site says that CDH 6.0 is supported, but in actual deployment Kylin 2.6 had problems starting in a CDH 6.0.1 environment. After some experimenting, Kylin 2.6 + CDH 5.16 turned out to run normally, so this series uses that combination.

Deployment method

Ansible is a commonly used operations tool that can greatly simplify the whole deployment process, and it is used here to complete the deployment. If you are not familiar with Ansible, please refer to "Ansible2.4 Installation and Experience". The deployment is shown in the following figure: the scripts are run on a computer with Ansible installed, and Ansible connects remotely to a CentOS 7.7 server to complete the deployment:
[Figure: the Ansible computer connecting remotely to the CentOS 7.7 server]

Hardware preparation

  1. A computer that can run Ansible. I use a MacBook Pro, and I have also verified that the deployment completes successfully from CentOS.
  2. A computer running CentOS 7.7, used to run all the services such as HDFS, Hive, HBase, Spark, and Kylin (in the follow-up articles, "CDH server" refers to this computer). Deploying all services on one machine is only suitable for learning and development. In my tests, this computer needs a CPU with at least two cores and no less than 16 GB of memory. If you want to deploy CDH across multiple computers, it is recommended to modify the Ansible scripts so that the services are deployed separately.
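
The minimums above can be checked on the CDH server before going any further. The following is only a sketch: the function name meets_minimum is made up for illustration, and the 15 GiB threshold is an assumption, chosen because a machine with 16 GB installed normally reports slightly less than that as MemTotal.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the given core count and memory size
# (in kB, as reported by /proc/meminfo) meet the minimums stated above.
meets_minimum() {
  local cores=$1 mem_kb=$2
  # Assumption: accept 15 GiB or more, since a "16 GB" machine reports a
  # little less than 16 GiB as MemTotal.
  local min_kb=$((15 * 1024 * 1024))
  [ "$cores" -ge 2 ] && [ "$mem_kb" -ge "$min_kb" ]
}

# Usage on the CDH server (Linux only, not run automatically):
#   if meets_minimum "$(nproc)" "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"; then
#     echo "hardware looks sufficient"
#   else
#     echo "below the recommended minimum"
#   fi
```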

CDH server settings

You need to log in to the CDH server to make the following settings:

  1. Check that the /etc/hostname file is correct, as shown below:
    [Figure: contents of /etc/hostname]
  2. Modify the /etc/hosts file to map your own IP address to the hostname, as shown in the red box below. This step turns out to be very important: if it is skipped, the deployment may get stuck at the "allocation" stage, with the agent log showing that the parcel download progress stays at zero percent:
    [Figure: /etc/hosts with the IP address and hostname entry highlighted]
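
Both settings can also be verified from the command line. The sketch below assumes a hypothetical helper named hosts_entry_ip, which prints the IP address that a hosts file assigns to a given hostname; empty output means the entry is missing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the IP address mapped to a hostname in a
# hosts-format file (default: /etc/hosts). Prints nothing if there is no entry.
hosts_entry_ip() {
  local name=$1 file=${2:-/etc/hosts}
  awk -v h="$name" '
    /^[[:space:]]*#/ { next }   # skip comment lines
    { for (i = 2; i <= NF; i++) if ($i == h) { print $1; exit } }
  ' "$file"
}

# Usage on the CDH server (not run automatically):
#   hosts_entry_ip "$(cat /etc/hostname)"
# The output should be the server's own IP address, not an empty line.
```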

Download files (Ansible computer)

A total of 13 files need to be prepared for this walkthrough. They are listed in the table below:

| No. | File name | Description |
| --- | --- | --- |
| 1 | jdk-8u191-linux-x64.tar.gz | JDK installation package for Linux |
| 2 | mysql-connector-java-5.1.34.jar | JDBC driver for MySQL |
| 3 | cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm | CM server installation package |
| 4 | cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm | CM daemons installation package |
| 5 | cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm | CM agent installation package |
| 6 | CDH-5.16.2-1.cdh5.16.2.p0.8-el7.parcel | Offline installation package for the CDH applications |
| 7 | CDH-5.16.2-1.cdh5.16.2.p0.8-el7.parcel.sha | Checksum of the CDH offline installation package |
| 8 | apache-kylin-2.6.4-bin-cdh57.tar.gz | Kylin installation package (the build for CDH) |
| 9 | hosts | Remote host configuration used by Ansible; it records the connection information of the CDH server |
| 10 | ansible.cfg | Configuration used by Ansible |
| 11 | cm6-cdh5-kylin264-single-install.yml | Ansible script used to deploy CDH |
| 12 | cdh-single-start.yml | Ansible script used to start CDH for the first time |
| 13 | vars.yml | Variables used by the scripts are set here, such as the CDH package name and the flink file name, for easy maintenance |

The download addresses of the files are as follows:

  1. jdk-8u191-linux-x64.tar.gz: available from Oracle's official website. In addition, I have packaged jdk-8u191-linux-x64.tar.gz and mysql-connector-java-5.1.34.jar together and uploaded them to CSDN, so both can be downloaded at once: https://download.csdn.net/download/boling_cavalry/12098987

  2. mysql-connector-java-5.1.34.jar: available from the Maven central repository, and also included in the CSDN package mentioned in item 1: https://download.csdn.net/download/boling_cavalry/12098987

  3. cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm: https://archive.cloudera.com/cm6/6.3.1/redhat7/yum/RPMS/x86_64/cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm

  4. cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm: https://archive.cloudera.com/cm6/6.3.1/redhat7/yum/RPMS/x86_64/cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm

  5. cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm: https://archive.cloudera.com/cm6/6.3.1/redhat7/yum/RPMS/x86_64/cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm

  6. CDH-5.16.2-1.cdh5.16.2.p0.8-el7.parcel: https://archive.cloudera.com/cdh5/parcels/5.16.2/CDH-5.16.2-1.cdh5.16.2.p0.8-el7.parcel

  7. CDH-5.16.2-1.cdh5.16.2.p0.8-el7.parcel.sha: https://archive.cloudera.com/cdh5/parcels/5.16.2/CDH-5.16.2-1.cdh5.16.2.p0.8-el7.parcel.sha1 (after downloading, change the extension from .sha1 to .sha)

  8. apache-kylin-2.6.4-bin-cdh57.tar.gz: https://archive.apache.org/dist/kylin/apache-kylin-2.6.4/apache-kylin-2.6.4-bin-cdh57.tar.gz

  9. hosts, ansible.cfg, cm6-cdh5-kylin264-single-install.yml, cdh-single-start.yml, vars.yml: these five files are stored in my GitHub repository at https://github.com/zq2599/blog_demos. The repository contains multiple folders; the files above are in the folder named ansible-cm6-cdh5-kylin264-single, as shown in the red box below:
    [Figure: the ansible-cm6-cdh5-kylin264-single folder in the GitHub repository]
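
For files 3 to 8, the downloads and the .sha1 to .sha rename could be scripted roughly as follows. This is a sketch, not part of the original scripts: the function names list_urls, target_name, and download_all are made up for illustration, the URLs are simply copied from the list above (whether they are still publicly reachable is not checked), and the JDK and the MySQL driver are left out because their sources may require a login.

```shell
#!/usr/bin/env bash
# URLs copied from the list above.
CM_BASE="https://archive.cloudera.com/cm6/6.3.1/redhat7/yum/RPMS/x86_64"
CM_VER="6.3.1-1466458.el7.x86_64"
PARCEL="CDH-5.16.2-1.cdh5.16.2.p0.8-el7.parcel"

# Print the six download URLs (files 3 to 8), one per line.
list_urls() {
  local pkg
  for pkg in server daemons agent; do
    echo "$CM_BASE/cloudera-manager-$pkg-$CM_VER.rpm"
  done
  echo "https://archive.cloudera.com/cdh5/parcels/5.16.2/$PARCEL"
  echo "https://archive.cloudera.com/cdh5/parcels/5.16.2/$PARCEL.sha1"
  echo "https://archive.apache.org/dist/kylin/apache-kylin-2.6.4/apache-kylin-2.6.4-bin-cdh57.tar.gz"
}

# Map a downloaded file name to the name the deployment expects:
# only the .sha1 checksum file is renamed (to .sha).
target_name() {
  case "$1" in
    *.sha1) echo "${1%.sha1}.sha" ;;
    *)      echo "$1" ;;
  esac
}

# Download everything into the directory given as $1 (default: current directory).
download_all() {
  local dest=${1:-.} url
  mkdir -p "$dest"
  for url in $(list_urls); do
    # -f: fail on HTTP errors, -L: follow redirects
    curl -fL -o "$dest/$(target_name "$(basename "$url")")" "$url" || return 1
  done
}

# Usage (not run automatically):
#   download_all ./downloads
```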

File placement (Ansible computer)

Once the 13 files above have been downloaded, place them as follows so that the deployment can complete successfully:

  1. Create a new folder named playbooks under the home directory: mkdir ~/playbooks
  2. Put these five files into the playbooks folder: hosts, ansible.cfg, cm6-cdh5-kylin264-single-install.yml, cdh-single-start.yml, vars.yml
  3. Create a new subfolder named cdh6 in the playbooks folder.
  4. Put the remaining eight files into the cdh6 folder: jdk-8u191-linux-x64.tar.gz, mysql-connector-java-5.1.34.jar, cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm, cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm, cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm, CDH-5.16.2-1.cdh5.16.2.p0.8-el7.parcel, CDH-5.16.2-1.cdh5.16.2.p0.8-el7.parcel.sha, apache-kylin-2.6.4-bin-cdh57.tar.gz
  5. After placement, the directories and files look as shown in the figure below. A reminder: the playbooks folder must be placed in the home directory (that is, ~/):
    [Figure: layout of ~/playbooks and ~/playbooks/cdh6 after placement]
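
Since a misplaced file only shows up later as a failed deployment, it may be worth checking the layout first. The sketch below uses a hypothetical helper named missing_files that prints every expected file not found under the playbooks folder; no output means all 13 files are in place.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each expected file that is missing under the
# given playbooks directory (default: ~/playbooks), one per line.
missing_files() {
  local root=${1:-$HOME/playbooks} f
  for f in \
    hosts \
    ansible.cfg \
    cm6-cdh5-kylin264-single-install.yml \
    cdh-single-start.yml \
    vars.yml \
    cdh6/jdk-8u191-linux-x64.tar.gz \
    cdh6/mysql-connector-java-5.1.34.jar \
    cdh6/cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm \
    cdh6/cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm \
    cdh6/cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm \
    cdh6/CDH-5.16.2-1.cdh5.16.2.p0.8-el7.parcel \
    cdh6/CDH-5.16.2-1.cdh5.16.2.p0.8-el7.parcel.sha \
    cdh6/apache-kylin-2.6.4-bin-cdh57.tar.gz
  do
    [ -f "$root/$f" ] || echo "$f"
  done
}

# Usage (not run automatically):
#   missing_files            # checks ~/playbooks
#   missing_files /some/dir  # checks another location
```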

Ansible parameter settings (Ansible computer)

Setting the Ansible parameters is simple: configure the access parameters of the CDH server, including the IP address, login account, and password. To do this, modify the ~/playbooks/hosts file as shown below; the values to change are deskmini, ansible_host, ansible_port, ansible_user, and ansible_password:

[cdh_group]
deskmini ansible_host=192.168.50.134 ansible_port=22 ansible_user=root ansible_password=888888
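
To catch typos before running the playbooks, the values can be read back from the inventory. This is a sketch only: inventory_var is a made-up helper, and it assumes the group header and the host entry are on separate lines, as in a standard INI-style Ansible inventory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of one key=value variable on the line
# of the given host in an INI-style inventory file.
inventory_var() {
  local file=$1 host=$2 key=$3
  awk -v h="$host" -v k="$key" '
    $1 == h {
      for (i = 2; i <= NF; i++) {
        split($i, kv, "=")
        # print everything after "key=" so values containing "=" survive
        if (kv[1] == k) { print substr($i, length(k) + 2); exit }
      }
    }
  ' "$file"
}

# Usage (not run automatically):
#   inventory_var ~/playbooks/hosts deskmini ansible_host
#   inventory_var ~/playbooks/hosts deskmini ansible_port
# With Ansible installed, connectivity can then be tried with the ping module:
#   ansible cdh_group -i ~/playbooks/hosts -m ping
```

Note that password-based SSH login from Ansible generally requires sshpass on the Ansible computer.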

At this point, all preparations are complete. In the next article we will carry out the following operations:

  1. Deploy CDH and Kylin
  2. Start CDH
  3. Set up CDH and install YARN, HDFS, etc. online
  4. Adjust HDFS and Yarn parameters
  5. Modify Spark settings (otherwise Kylin will fail to start)
  6. Start Kylin

You are welcome to follow my WeChat official account: programmer Xinchen


Origin blog.csdn.net/boling_cavalry/article/details/105449630