Flink on Yarn Trilogy Part Two: Deployment and Setup

This article is the second in the "Flink on Yarn Trilogy" series. The previous article, "Flink on Yarn Trilogy Part One: Preparation", prepared the required machines and files, so we can now deploy CDH and Flink.

Links to the full series

  1. "Flink on Yarn Trilogy Part One: Preparation"
  2. "Flink on Yarn Trilogy Part Two: Deployment and Setup"
  3. "Flink on Yarn Trilogy Part Three: Submit Flink Tasks"

Run the Ansible script to deploy CDH and Flink (on the Ansible machine)

  1. Go to the ~/playbooks directory on the Ansible machine. After the preparation in the previous article, it should contain the following:
    [Screenshot: contents of the ~/playbooks directory]
  2. Check that Ansible can operate on the CDH server remotely. Run ansible deskmini -a "free -m"; if everything is working, it prints the CDH server's memory information, as shown below:
    [Screenshot: output of the free -m command]
  3. Start the deployment with this command: ansible-playbook cm6-cdh5-flink1.7-single-install.yml
  4. The deployment includes time-consuming steps such as online installation and file transfer, so please be patient (it takes about half an hour). If it exits with an error partway through (for example, because of a network problem), simply run the same command again; Ansible keeps the operations idempotent, so repeating them is safe.
  5. A successful deployment looks like this:
    [Screenshot: output of a successful deployment]
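
Step 4 says a failed run only needs to be repeated. As a minimal sketch of that advice, the helper below re-runs a command until it succeeds or a maximum number of attempts is reached. The function name run_with_retry and the attempt limit are my own assumptions for illustration; they are not part of Ansible or of the playbooks from the previous article.

```shell
# Sketch: re-run a command until it succeeds, up to a maximum number of attempts.
# run_with_retry is a hypothetical helper, not an Ansible feature.
run_with_retry() {
  local max_attempts="$1"   # how many times to try before giving up
  shift                     # the remaining arguments are the command to run
  local attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    echo "attempt ${attempt}/${max_attempts}: $*" >&2
    if "$@"; then
      return 0              # the command succeeded, stop retrying
    fi
    attempt=$((attempt + 1))
  done
  echo "giving up after ${max_attempts} attempts" >&2
  return 1
}

# Example usage on the Ansible machine (commented out so that sourcing this
# file does not start a deployment; the limit of 3 is an arbitrary choice):
#   cd ~/playbooks
#   ansible deskmini -a "free -m" &&
#     run_with_retry 3 ansible-playbook cm6-cdh5-flink1.7-single-install.yml
```

Retrying blindly is only reasonable because the playbook is safe to repeat; for an error that is not transient, such as a wrong host name, read the Ansible output instead of retrying.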

Restart the CDH server

The deployment modified the SELinux and swap settings, and these changes only take effect after the operating system restarts, so please restart the CDH server now.

Run the Ansible script to start the CDH service (on the Ansible machine)

  1. Wait for the CDH server to finish restarting.
  2. Log in to the Ansible machine and go to the ~/playbooks directory.
  3. Run the script that initializes the database and starts CDH: ansible-playbook cdh-single-start.yml
  4. When startup is complete, the following information is printed:
    [Screenshot: output of the startup playbook]
  5. Log in to the CDH server over ssh and run this command to watch the CDH service start: tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log. When you see the content marked with the red box below, startup has finished and you can log in with a browser:
    [Screenshot: server log with the startup-complete lines in a red box]
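
Instead of watching tail -f by hand, the wait in step 5 can be scripted. The sketch below polls a log file until a given string appears or a timeout expires. The helper name wait_for_log_line is hypothetical, and the string "Started Jetty server" in the usage example is my assumption about what the red box highlights; substitute whatever line your own log shows when startup finishes.

```shell
# Sketch: poll a log file until a fixed string appears or a timeout expires.
# wait_for_log_line is a hypothetical helper written for this article.
wait_for_log_line() {
  local log_file="$1"       # path of the log file to watch
  local pattern="$2"        # fixed string that signals "startup finished"
  local timeout_secs="$3"   # give up after roughly this many seconds
  local waited=0
  while :; do
    # -q: no output, -s: stay silent if the file does not exist yet,
    # -F: treat the pattern as a fixed string rather than a regex
    if grep -qsF -- "$pattern" "$log_file"; then
      return 0
    fi
    if [ "$waited" -ge "$timeout_secs" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
}

# Example usage on the CDH server (the pattern is an assumption, see above):
#   wait_for_log_line /var/log/cloudera-scm-server/cloudera-scm-server.log \
#     "Started Jetty server" 600 && echo "ready, open the browser"
```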

Settings (browser operation)

Now that the CDH service has been started, you can operate it through the browser:

  1. Open http://192.168.50.134:7180 in a browser. The login page appears, as shown below; the account and password are both admin:
    [Screenshot: login page]
  2. Keep clicking through to the next step. On the edition selection page, choose the 60-day trial edition:
    [Screenshot: edition selection page]
  3. On the host selection page, you can see the CDH server (deskmini):
    [Screenshot: host selection page]
  4. On the CDH version selection page, select 5.16.2-1, marked with the red box below:
    [Screenshot: CDH version selection page]
  5. Next is the parcel installation page. Because the offline parcel package was uploaded in advance, the download progress jumps to 100% immediately. Then wait for distribution, unpacking, and activation to complete:
    [Screenshot: parcel installation page]
  6. Next come some recommended operations, which can be skipped here, as marked by the red box below:
    [Screenshot: recommended operations page]
  7. Next is the service selection page. I chose a custom set of services and selected three: HDFS, YARN, and ZooKeeper, which are enough to run Flink:
    [Screenshot: service selection page]
  8. On the page for assigning hosts, select the CDH server:
    [Screenshot: host assignment page]
  9. Next is the database settings page. What you enter must match the figure below: the host name is localhost; the database, user, and password for Activity Monitor are all amon; and the database, user, and password for Reports Manager are all rman. These values are hard-coded in the Ansible script, so what you enter here must match them:
    [Screenshot: database settings page]
  10. On the parameter settings page, choose values that suit your own disks. I have plenty of space under /home, so I changed the storage locations to the /home directory:
    [Screenshot: parameter settings page]
  11. Wait for the services to start:
    [Screenshot: services starting]
  12. All services have finished starting:
    [Screenshot: all services started]

YARN settings

The default YARN parameters are very conservative, so a few settings must be changed before Flink tasks can run successfully:

  1. Click the entry marked with the red box below to open the YARN management page:
    [Screenshot: link to the YARN management page]
  2. As shown below, check the value of yarn.nodemanager.resource.cpu-vcores. It must be greater than 1; otherwise, YARN will not allocate resources to run a Flink task after it is submitted. (If your CDH server is a virtual machine with a single-core CPU, this parameter is set to 1. The fix is to give the virtual machine more CPU cores and then raise this parameter.)
    [Screenshot: yarn.nodemanager.resource.cpu-vcores setting]
  3. yarn.scheduler.minimum-allocation-mb: the minimum memory a single container can request. I set it to 1G.
  4. yarn.scheduler.maximum-allocation-mb: the maximum memory a single container can request. I set it to 8G.
  5. yarn.nodemanager.resource.memory-mb: the maximum memory available on the node. I set it to 8G.
  6. The three memory values above are based on my CDH server, which has 32G of memory; adjust them to your own hardware.
  7. After changing the settings, restart the YARN service, as shown below:
    [Screenshot: restarting the YARN service]

At this point, deployment and setup are complete, and the Flink on Yarn environment is ready to use. In the next article, we will submit Flink tasks to this environment to try out Flink on Yarn.
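
The YARN values above can be sanity-checked with a little arithmetic before restarting the service. The sketch below takes the four parameters as plain numbers (memory in MB) and prints how many minimum-size containers fit on the node. The helper name check_yarn_resources is hypothetical; only the "vcores greater than 1" rule comes from this article, while the two memory comparisons are my own rule-of-thumb checks, and the vcore count of 4 in the example is an assumed value.

```shell
# Sketch: sanity-check YARN resource settings before submitting Flink tasks.
# check_yarn_resources is a hypothetical helper; all memory values are in MB.
check_yarn_resources() {
  local vcores="$1"    # yarn.nodemanager.resource.cpu-vcores
  local min_mb="$2"    # yarn.scheduler.minimum-allocation-mb
  local max_mb="$3"    # yarn.scheduler.maximum-allocation-mb
  local node_mb="$4"   # yarn.nodemanager.resource.memory-mb

  # Rule from the article: with a value of 1, Flink tasks get no resources.
  if [ "$vcores" -le 1 ]; then
    echo "cpu-vcores must be greater than 1 (got ${vcores})" >&2
    return 1
  fi
  # Assumed rule of thumb: the minimum must not exceed the maximum.
  if [ "$min_mb" -gt "$max_mb" ]; then
    echo "minimum allocation exceeds maximum allocation" >&2
    return 1
  fi
  # Assumed rule of thumb: a maximum-size container should fit on the node.
  if [ "$max_mb" -gt "$node_mb" ]; then
    echo "a maximum-size container would not fit on the node" >&2
    return 1
  fi
  # Upper bound on minimum-size containers that fit in the node's memory.
  echo $((node_mb / min_mb))
}

# Example with the memory values used above (1G, 8G, 8G) and 4 assumed vcores:
#   check_yarn_resources 4 1024 8192 8192    # prints 8
```

With these settings the node can hold at most eight 1G containers, or a single 8G container, which helps when sizing the JobManager and TaskManager memory of a Flink job.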

You are welcome to follow my WeChat public account: programmer Xinchen

Origin blog.csdn.net/boling_cavalry/article/details/105356347