How to migrate Flink tasks to real-time computing

This article, shared by Alibaba technical expert Jing Lining (Yantian), explains how to migrate Flink tasks to real-time computing Flink. It covers the following topics:
How to migrate
Multiple Jars
Configuration files
State reuse

Users typically submit jobs offline with the flink run command, which causes several problems: the same configuration changes from one version to the next; you cannot switch quickly between versions; and a job cannot be restored from a previous run.

So how do you migrate from offline to online? The following four parts walk through the process.

1. How to migrate: from flink run to the stream computing platform

The following figure shows how an offline command maps to the online platform. First, open the VVP UI and fill in the basic configuration: job name, Jar URI, main arguments, and parallelism. You can also click Advanced Configuration to configure further settings.
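
As a rough illustration of that mapping, the sketch below splits a `flink run` command into the fields of the basic configuration form. The options `-c`/`--class`, `-p`/`--parallelism`, and `-d`/`--detached` are standard `flink run` options; the function name and the dictionary keys (`job_name`, `jar_uri`, and so on) are hypothetical names chosen to mirror the form labels, not a platform API.

```python
import shlex

# `flink run` options that take a value, mapped to hypothetical field names
# that mirror the labels on the basic configuration form.
VALUE_OPTIONS = {
    "-c": "entry_class", "--class": "entry_class",
    "-p": "parallelism", "--parallelism": "parallelism",
}
# Flags with no form equivalent: the platform manages the job lifecycle itself.
FLAG_OPTIONS = {"-d", "--detached"}


def flink_run_to_form(command: str, job_name: str) -> dict:
    """Split a `flink run` command into the fields of the basic configuration."""
    tokens = shlex.split(command)
    if tokens[:2] != ["flink", "run"]:
        raise ValueError("expected a command starting with 'flink run'")
    form = {"job_name": job_name, "jar_uri": None, "entry_class": None,
            "main_args": [], "parallelism": 1}
    i = 2
    while i < len(tokens):
        token = tokens[i]
        if form["jar_uri"] is not None:
            # Everything after the Jar belongs to the program itself.
            form["main_args"].append(token)
            i += 1
        elif token in VALUE_OPTIONS:
            if i + 1 >= len(tokens):
                raise ValueError(f"missing value for {token}")
            form[VALUE_OPTIONS[token]] = tokens[i + 1]
            i += 2
        elif token in FLAG_OPTIONS:
            i += 1
        elif token.startswith("-"):
            raise ValueError(f"option not handled by this sketch: {token}")
        else:
            # First positional token is the Jar.
            form["jar_uri"] = token
            i += 1
    form["parallelism"] = int(form["parallelism"])
    return form
```

On the platform, the local Jar path would be replaced by the URI of the uploaded Jar, and the remaining values would be typed into the corresponding form fields.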

For example, the behavior settings in the advanced configuration have three parts: upgrade strategy, initial state, and recovery strategy. The upgrade strategy is usually set to Stateless, the initial state to Running, and the recovery strategy to Latest State.

Selecting Stateless as the upgrade strategy means that when you edit and save the configuration of a running job, the original job is stopped directly and an updated job is then started. If you choose Stateful instead, editing and saving the configuration of a running job first triggers a Savepoint; the new configuration is then loaded and a new job is started from that Savepoint.

Setting the recovery strategy to LatestSavepoint means that when a suspended job is restarted, it starts from the latest Savepoint.
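
The behavior of these settings can be sketched as a small model. This is only an illustration of the rules described above, not platform code: the `Snapshot` class and both function names are hypothetical, and the handling of LatestState (taking the newest snapshot of either kind, Checkpoint or Savepoint) is an assumption, since the article only spells out LatestSavepoint.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Snapshot:
    kind: str        # "savepoint" or "checkpoint"
    created_at: int  # any increasing timestamp
    path: str


def upgrade_steps(upgrade_strategy: str) -> List[str]:
    """Steps taken when the configuration of a running job is edited and saved."""
    if upgrade_strategy == "Stateless":
        return ["stop the running job",
                "start an updated job"]
    if upgrade_strategy == "Stateful":
        return ["take a savepoint of the running job",
                "load the new configuration",
                "start a new job from that savepoint"]
    raise ValueError(f"unknown upgrade strategy: {upgrade_strategy}")


def restore_point(recovery_strategy: str,
                  snapshots: List[Snapshot]) -> Optional[Snapshot]:
    """Pick the snapshot a restarted job would resume from, or None if there is none."""
    if recovery_strategy == "LatestSavepoint":
        candidates = [s for s in snapshots if s.kind == "savepoint"]
    elif recovery_strategy == "LatestState":
        # Assumption: any snapshot qualifies, whichever is newest.
        candidates = list(snapshots)
    else:
        raise ValueError(f"unknown recovery strategy: {recovery_strategy}")
    return max(candidates, key=lambda s: s.created_at, default=None)
```

Under that assumption, a job with an older Savepoint and a newer Checkpoint would resume from the Savepoint under LatestSavepoint and from the Checkpoint under LatestState.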

Those are the more important parts of the Flink configuration. The figure below shows the remaining part, where you can set the Checkpoint interval, followed by the resource configuration. In the log configuration section you can choose where logs are saved, which makes troubleshooting easier if the job runs into problems later.

After the Flink job is configured and started, if an exception occurs, you can check the running status and the problem in the job events. You can also open the JobManager in the Flink UI to view the logs.
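
For readers used to setting these values by hand, the sketch below builds the equivalent entries using the standard Flink configuration keys `execution.checkpointing.interval` and `state.checkpoints.dir`. It assumes the platform passes such keys through to Flink unchanged; the function names and the bucket path in the usage note are hypothetical.

```python
def checkpoint_settings(interval_seconds: int, checkpoint_dir: str) -> dict:
    """Build Flink configuration entries for periodic Checkpoints."""
    if interval_seconds <= 0:
        raise ValueError("the Checkpoint interval must be positive")
    return {
        # Flink parses durations such as "60s".
        "execution.checkpointing.interval": f"{interval_seconds}s",
        # Where completed Checkpoints are stored.
        "state.checkpoints.dir": checkpoint_dir,
    }


def render(settings: dict) -> str:
    """Render the entries as `key: value` lines, one per setting."""
    return "\n".join(f"{key}: {value}" for key, value in sorted(settings.items()))
```

Calling `render(checkpoint_settings(60, "oss://my-bucket/checkpoints"))` yields two `key: value` lines that can be compared against what the configuration page shows.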

2. With multiple Jars, how do users add other dependency Jars?

Some users have custom dependencies that cannot be handled by building a fat Jar. For example, user A has a main Jar plus several other Jars that are used in different scenarios. In that case, first upload the Jar packages on the resource management page; once uploaded, the files can be used on the page.

After uploading, go to the advanced configuration, find "Additional Dependent Jar", and select the Jar package you just uploaded from the drop-down list.

3. How do users define jobs through configuration files?

Upload the file in resource management, then choose to add a dependency in the advanced configuration and select the file you need. If the main class needs to read a file in its startup function, first add the file as a dependency, then read it as the prompt indicates.

The two methods above are the ways to supply user files to a job. Jars and other files alike can be stored here.
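
A minimal sketch of the reading step, assuming the attached file is a simple `key=value` properties file and that its location is handed to the program as an argument rather than hard-coded. It is shown in Python for brevity; a Java main method would follow the same pattern. The function name is hypothetical.

```python
from pathlib import Path


def load_properties(path: str) -> dict:
    """Parse a simple key=value file, skipping blank lines and # comments."""
    props = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split at the first "=" only, so values may contain "=" themselves.
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {raw!r}")
        props[key.strip()] = value.strip()
    return props
```

The entry point would call `load_properties` once at startup with whatever path the platform reports for the attached file, and use the resulting dictionary to build the job.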

4. How to reuse the original state to speed up job recovery

If you find a problem while a job is running, you usually stop it and restart it once the problem is fixed. When creating the job, you need to specify the OSS location that Checkpoints are written to. With Checkpoints enabled, later runs can be restored directly from a Checkpoint.

First confirm that the recovery strategy in the advanced configuration is set to LatestState or LatestSavepoint. With that in place, pause the job (do not stop it); pausing triggers a Savepoint. When you click Start again, the job resumes from this Savepoint.

Open the Flink UI and click the job snapshots. There you can view the Flink Checkpoint metrics, which show the number of restores and the path of the last restore, so you can confirm that the job was restored from the latest State.

How do you copy a job while it is running? On the job control page, click Checkpoint, find the historical snapshot, and click "copy job from this snapshot" at the far right. This copies the current job, including the Jar package and configuration of the running job. After you click Start, the job snapshots show that the new job was copied and started from the most recent Savepoint of the previous job.
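
For comparison, the offline counterpart of this pause-and-start cycle is a pair of Flink CLI calls that have to be assembled by hand. The sketch below builds them as argument lists; `flink stop --savepointPath` and `flink run --fromSavepoint` are standard CLI options, while the function names and all values in the usage lines (job ID placeholder, bucket, Jar path) are hypothetical.

```python
import shlex
from typing import List


def stop_with_savepoint(job_id: str, savepoint_dir: str) -> List[str]:
    """Offline counterpart of Pause: write a Savepoint, then stop the job."""
    return ["flink", "stop", "--savepointPath", savepoint_dir, job_id]


def run_from_savepoint(savepoint_path: str, jar: str,
                       main_args: List[str]) -> List[str]:
    """Offline counterpart of Start: resubmit the Jar, restoring from the Savepoint."""
    return ["flink", "run", "--detached",
            "--fromSavepoint", savepoint_path, jar, *main_args]


# Hypothetical values, for illustration only.
pause = stop_with_savepoint("<job-id>", "oss://my-bucket/savepoints")
resume = run_from_savepoint("oss://my-bucket/savepoints/<savepoint>",
                            "/opt/jobs/wordcount.jar", ["--input", "/data/in"])
print(shlex.join(pause))
print(shlex.join(resume))
```

Offline, the Savepoint path returned by the first command has to be copied into the second by hand; on the platform, the recovery strategy picks it up automatically.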

Author: Alibaba technical expert Jing Lining (Yantian)

This article is the original content of Alibaba Cloud and may not be reproduced without permission.
