Flink 1.17 tutorial: cluster setup, running modes (standalone/YARN/K8s), and the History Server

cluster role

img

cluster start

If it is deployed locally and accessed locally, no configuration is required, and it can be started directly.

If it is deployed on a server and needs to be accessed remotely, you need to modify `conf/flink-conf.yaml` and change `localhost` in the REST entries to the server's IP address or `0.0.0.0`.
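A minimal sketch of that change. The key names (`rest.address`, `rest.bind-address`) are the Flink 1.17 defaults; the demo below edits a scratch copy so it can be run anywhere, whereas on a real install you would edit `conf/flink-conf.yaml` in place:

```shell
# Demo on a scratch copy; on a real install, edit conf/flink-conf.yaml instead.
cat > /tmp/flink-conf-demo.yaml <<'EOF'
rest.address: localhost
rest.bind-address: localhost
EOF

# Bind the REST endpoint (WebUI) to all interfaces instead of localhost
sed -i 's/^rest\.address: localhost/rest.address: 0.0.0.0/' /tmp/flink-conf-demo.yaml
sed -i 's/^rest\.bind-address: localhost/rest.bind-address: 0.0.0.0/' /tmp/flink-conf-demo.yaml

grep '^rest' /tmp/flink-conf-demo.yaml
```

For a multi-node cluster you would also set `jobmanager.rpc.address` to the JobManager host and list the TaskManager hosts in `conf/workers`.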

| Node | hadoop102 | hadoop103 | hadoop104 |
|------|-----------|-----------|-----------|
| Role | JobManager, TaskManager | TaskManager | TaskManager |
[atguigu@node001 module]$ cd flink
[atguigu@node001 flink]$ cd flink-1.17.0/
[atguigu@node001 flink-1.17.0]$ bin/start-cluster.sh
Starting cluster.
Starting standalonesession daemon on host node001.
Starting taskexecutor daemon on host node001.
Starting taskexecutor daemon on host node002.
Starting taskexecutor daemon on host node003.
[atguigu@node001 flink-1.17.0]$ jpsall 
================ node001 ================
3408 Jps
2938 StandaloneSessionClusterEntrypoint
3276 TaskManagerRunner
================ node002 ================
2852 TaskManagerRunner
2932 Jps
================ node003 ================
2864 TaskManagerRunner
2944 Jps
[atguigu@node001 flink-1.17.0]$ 

img

WebUI submit job

Maven plug-in configuration for building the jar package:

<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>3.2.4</version>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>
                    <configuration>
                        <artifactSet>
                            <excludes>
                                <exclude>com.google.code.findbugs:jsr305</exclude>
                                <exclude>org.slf4j:*</exclude>
                                <exclude>log4j:*</exclude>
                            </excludes>
                        </artifactSet>
                        <filters>
                            <filter>
                                <!-- Do not copy the signatures in the META-INF folder.
                                Otherwise, this might cause SecurityExceptions when using the JAR. -->
                                <artifact>*:*</artifact>
                                <excludes>
                                    <exclude>META-INF/*.SF</exclude>
                                    <exclude>META-INF/*.DSA</exclude>
                                    <exclude>META-INF/*.RSA</exclude>
                                </excludes>
                            </filter>
                        </filters>
                        <transformers combine.children="append">
                            <transformer
                                    implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer">
                            </transformer>
                        </transformers>
                    </configuration>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
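With the shade plugin bound to the `package` phase as above, producing the fat jar is the standard Maven invocation (run from the project root; the artifact name shown is the one used later in this tutorial):

```shell
mvn clean package
# The shaded jar is written to target/, e.g. target/FlinkTutorial-1.17-1.0-SNAPSHOT.jar
```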

img

com.atguigu.wc.WordCountStreamUnboundedDemo

img

img

img

img

Submit job from command line

bin/flink run -m node001:8081 -c com.atguigu.wc.WordCountStreamUnboundedDemo ../jar/FlinkTutorial-1.17-1.0-SNAPSHOT.jar

Connection successful
Last login: Fri Jun 16 14:44:01 2023 from 192.168.10.1
[atguigu@node001 ~]$ cd /opt/module/flink/flink-1.17.0/

[atguigu@node001 flink-1.17.0]$ cd bin

[atguigu@node001 bin]$ ./start-cluster.sh 
Starting cluster.
Starting standalonesession daemon on host node001.
Starting taskexecutor daemon on host node001.
Starting taskexecutor daemon on host node002.
Starting taskexecutor daemon on host node003.

[atguigu@node001 bin]$ jpsall 
================ node001 ================
2723 TaskManagerRunner
2855 Jps
2380 StandaloneSessionClusterEntrypoint
================ node002 ================
2294 TaskManagerRunner
2367 Jps
================ node003 ================
2292 TaskManagerRunner
2330 Jps

[atguigu@node001 bin]$ cd ..
         
[atguigu@node001 flink-1.17.0]$ bin/flink run -m node001:8081 -c com.atguigu.wc.WordCountStreamUnboundedDemo ../jar/FlinkTutorial-1.17-1.0-SNAPSHOT.jar 
Job has been submitted with JobID 59ae9d6532523b0c48cdb8b6c9105356

img

img

img

Deployment Mode Introduction

In some application scenarios, there may be specific requirements for how cluster resources are allocated and occupied. Flink provides different deployment modes for these scenarios, mainly the following three: Session Mode, Per-Job Mode, and Application Mode.

The main differences between them are the life cycle of the cluster, how resources are allocated, and where the application's main() method is executed: on the client (Client) or on the JobManager.

Standalone mode of operation

Standalone mode runs independently, without relying on any external resource management platform. Of course, independence comes at a price: if resources are insufficient or a failure occurs, there is no automatic scaling or reallocation of resources, and everything must be handled manually. Standalone mode is therefore generally used only for development, testing, or scenarios with very few jobs.

Short version script:

bin/standalone-job.sh start --job-classname com.atguigu.wc.WordCountStreamUnboundedDemo

bin/taskmanager.sh start

bin/taskmanager.sh stop

bin/standalone-job.sh stop

Detailed display version:

[atguigu@node001 ~]$ cd /opt/module/flink/flink-1.17.0/bin

[atguigu@node001 bin]$ ./stop-cluster.sh 
Stopping taskexecutor daemon (pid: 2723) on host node001.
Stopping taskexecutor daemon (pid: 2294) on host node002.
Stopping taskexecutor daemon (pid: 2292) on host node003.
Stopping standalonesession daemon (pid: 2380) on host node001.

[atguigu@node001 bin]$ jpsall 
================ node001 ================
5120 Jps
================ node002 ================
3212 Jps
================ node003 ================
3159 Jps

[atguigu@node001 bin]$ ls
bash-java-utils.jar  flink             historyserver.sh          kubernetes-session.sh      sql-client.sh      start-cluster.sh           stop-zookeeper-quorum.sh  zookeeper.sh
config.sh            flink-console.sh  jobmanager.sh             kubernetes-taskmanager.sh  sql-gateway.sh     start-zookeeper-quorum.sh  taskmanager.sh
find-flink-home.sh   flink-daemon.sh   kubernetes-jobmanager.sh  pyflink-shell.sh           standalone-job.sh  stop-cluster.sh            yarn-session.sh

[atguigu@node001 bin]$ cd ../lib/

[atguigu@node001 lib]$ ls
flink-cep-1.17.0.jar              flink-dist-1.17.0.jar        flink-table-api-java-uber-1.17.0.jar   FlinkTutorial-1.17-1.0-SNAPSHOT.jar  log4j-core-2.17.1.jar
flink-connector-files-1.17.0.jar  flink-json-1.17.0.jar        flink-table-planner-loader-1.17.0.jar  log4j-1.2-api-2.17.1.jar             log4j-slf4j-impl-2.17.1.jar
flink-csv-1.17.0.jar              flink-scala_2.12-1.17.0.jar  flink-table-runtime-1.17.0.jar         log4j-api-2.17.1.jar

[atguigu@node001 lib]$ cd ../

[atguigu@node001 flink-1.17.0]$ bin/standalone-job.sh start --job-classname com.atguigu.wc.WordCountStreamUnboundedDemo
Starting standalonejob daemon on host node001.

[atguigu@node001 flink-1.17.0]$ jpsall 
================ node001 ================
5491 StandaloneApplicationClusterEntryPoint
5583 Jps
================ node002 ================
3326 Jps
================ node003 ================
3307 Jps

[atguigu@node001 flink-1.17.0]$ bin/taskmanager.sh 
Usage: taskmanager.sh (start|start-foreground|stop|stop-all)

[atguigu@node001 flink-1.17.0]$ bin/taskmanager.sh start
Starting taskexecutor daemon on host node001.

[atguigu@node001 flink-1.17.0]$ jpsall 
================ node001 ================
5491 StandaloneApplicationClusterEntryPoint
5995 Jps
5903 TaskManagerRunner
================ node002 ================
3363 Jps
================ node003 ================
3350 Jps

[atguigu@node001 flink-1.17.0]$ bin/taskmanager.sh stop
Stopping taskexecutor daemon (pid: 5903) on host node001.

[atguigu@node001 flink-1.17.0]$ bin/standalone-job.sh stop
No standalonejob daemon (pid: 5491) is running anymore on node001.

[atguigu@node001 flink-1.17.0]$ xcall jps
=============== node001 ===============
6682 Jps
=============== node002 ===============
3429 Jps
=============== node003 ===============
3419 Jps

YARN operation mode_environment preparation

The deployment process on YARN is: the client submits the Flink application to YARN's ResourceManager, which requests containers from the NodeManagers. In these containers, Flink deploys JobManager and TaskManager instances, thereby starting the cluster. Flink dynamically allocates TaskManager resources according to the number of slots required by the jobs running on the JobManager.
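Before any of the YARN modes will work, the Flink client must be able to find the Hadoop configuration and classpath. A typical environment setup looks like the following; the install paths are assumptions matching this tutorial's layout, so adjust them to your environment:

```shell
# Assumed install paths; adjust to your environment.
export HADOOP_HOME=/opt/module/hadoop/hadoop-3.1.3
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
# Flink discovers the Hadoop dependencies through this variable:
export HADOOP_CLASSPATH=$(hadoop classpath)
```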

[atguigu@node001 flink-1.17.0]$ source /etc/profile.d/my_env.sh 
[atguigu@node001 flink-1.17.0]$ myhadoop.sh s
Input Args Error...
[atguigu@node001 flink-1.17.0]$ myhadoop.sh start
 ================ Starting the Hadoop cluster ================
 ---------------- Starting HDFS ----------------
Starting namenodes on [node001]
Starting datanodes
Starting secondary namenodes [node003]
 --------------- Starting YARN ---------------
Starting resourcemanager
Starting nodemanagers
 --------------- Starting historyserver ---------------
[atguigu@node001 flink-1.17.0]$ jpsall 
================ node001 ================
9200 JobHistoryServer
8416 NameNode
8580 DataNode
9284 Jps
8983 NodeManager
================ node002 ================
3892 ResourceManager
3690 DataNode
4365 Jps
4015 NodeManager
================ node003 ================
3680 DataNode
3778 SecondaryNameNode
3911 NodeManager
4044 Jps
[atguigu@node001 flink-1.17.0]$ 

YARN running mode_session mode

The following command starts an Apache Flink session on YARN. The options and parameters mean:

  • yarn-session.sh: the script Apache Flink provides to start a session on YARN.
  • -d: starts the session in detached mode; the session runs in the background and the script returns immediately.
  • -nm test: sets the name of the session, here "test".

Taken together, the command starts a Flink session named "test" on YARN in detached mode. Once started, the session runs in the background and the command-line prompt returns immediately, so you can continue with other operations.

[atguigu@node001 bin]$ ./yarn-session.sh --help
[atguigu@node001 bin]$ ./yarn-session.sh
[atguigu@node001 bin]$ ./yarn-session.sh -d -nm test
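Once the detached session is up, jobs can be submitted to it with a plain `flink run`; the client locates the session through the `/tmp/.yarn-properties-*` file the session wrote. A sketch of submitting and later stopping the session (the application id below is a hypothetical placeholder):

```shell
# Submit a job into the running session
bin/flink run -c com.atguigu.wc.WordCountStreamUnboundedDemo ../jar/FlinkTutorial-1.17-1.0-SNAPSHOT.jar

# Stop the whole session when done (application id is a placeholder)
echo "stop" | bin/yarn-session.sh -id application_XXXXXXXXXXXXX_0001
# or, more bluntly, kill the YARN application:
yarn application -kill application_XXXXXXXXXXXXX_0001
```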

img

img

img

img

YARN operation mode_session mode stop

Single job mode deployment

In the YARN environment, since an external platform handles resource scheduling, we can also submit a single job directly to YARN, starting a Flink cluster just for that job.

img

Stop job:

img

YARN running mode_single job mode

Single job mode deployment

(1) Execute the command to submit the job

img
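The per-job submission shown in the screenshots has the following shape; the `-t yarn-per-job` target and `-d` (detached) flag are the standard Flink 1.17 CLI options, and the ids in the cancel command are hypothetical placeholders:

```shell
# Submit a single job, starting a dedicated cluster on YARN (per-job mode)
bin/flink run -d -t yarn-per-job -c com.atguigu.wc.WordCountStreamUnboundedDemo ../jar/FlinkTutorial-1.17-1.0-SNAPSHOT.jar

# Cancel the job; both ids are placeholders for the values YARN and Flink printed
bin/flink cancel -t yarn-per-job -Dyarn.application.id=application_XXXXXXXXXXXXX_0002 <jobId>
```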

YARN operation mode_application mode

Application mode is also very simple; similar to per-job mode, you directly execute the `flink run-application` command. For example:

bin/flink run-application -t yarn-application -c com.atguigu.wc.WordCountStreamUnboundedDemo ./FlinkTutorial-1.17-1.0-SNAPSHOT.jar 
[atguigu@node001 flink-1.17.0]$ bin/flink run-application -t yarn-application -c com.atguigu.wc.WordCountStreamUnboundedDemo ./FlinkTutorial-1.17-1.0-SNAPSHOT.jar 

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/module/flink/flink-1.17.0/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/hadoop/hadoop-3.1.3/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2023-06-19 14:31:05,693 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                [] - Found Yarn properties file under /tmp/.yarn-properties-atguigu.
2023-06-19 14:31:05,693 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                [] - Found Yarn properties file under /tmp/.yarn-properties-atguigu.
2023-06-19 14:31:06,142 WARN  org.apache.flink.yarn.configuration.YarnLogConfigUtil        [] - The configuration directory ('/opt/module/flink/flink-1.17.0/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2023-06-19 14:31:06,632 INFO  org.apache.hadoop.yarn.client.RMProxy                        [] - Connecting to ResourceManager at node002/192.168.10.102:8032
2023-06-19 14:31:07,195 INFO  org.apache.flink.yarn.YarnClusterDescriptor                  [] - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar

imgimg

The case where the jar is stored in HDFS:

[atguigu@node001 flink-1.17.0]$ bin/flink run-application -t yarn-application -Dyarn.provided.lib.dirs="hdfs://node001:8020/flink-dist" -c com.atguigu.wc.WordCountStreamUnboundedDemo hdfs://node001:8020/flink-jars/FlinkTutorial-1.17-1.0-SNAPSHOT.jar
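For that variant to work, the Flink distribution jars and the job jar must be uploaded to HDFS first. A sketch of the preparation, with paths matching the command above (run from the Flink home directory):

```shell
# Upload the Flink lib/ and plugins/ jars referenced by yarn.provided.lib.dirs
hadoop fs -mkdir -p /flink-dist
hadoop fs -put lib/ /flink-dist
hadoop fs -put plugins/ /flink-dist

# Upload the application jar itself
hadoop fs -mkdir -p /flink-jars
hadoop fs -put ./FlinkTutorial-1.17-1.0-SNAPSHOT.jar /flink-jars
```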

K8S running mode

Containerized deployment is a popular technology in the industry today, and running from Docker images makes it easier for users to manage and maintain applications. The most popular container management tool is Kubernetes (k8s), and recent versions of Flink also support the k8s deployment mode. The basic principle is similar to YARN; for specific configuration, please refer to the official documentation, so we will not go into further detail here.

History Server

Once the cluster running a Flink job has stopped, you can only check the logs on YARN or the local disk; the Web UI that was available while the job was running can no longer be viewed, making it hard to know exactly what happened at the moment the job died. Without metrics monitoring, problems can only be analyzed and located through logs, so being able to restore the previous Web UI makes it much easier to find and locate some problems.

Flink provides a History Server to query the statistics of completed jobs after the corresponding Flink cluster has been shut down. Normally, the WebUI statistics can only be viewed while a job is running; through the History Server we can query the statistics of completed jobs, whether they exited normally or abnormally.

Additionally, it exposes a REST API that accepts HTTP requests and responds with JSON data. After a Flink job stops, the JobManager archives the statistics of the completed job, and the History Server process can then query them after the fact, for example the last checkpoint and the configuration the job ran with.
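The History Server is wired up through a few entries in `conf/flink-conf.yaml`. A minimal sketch follows; the option names are standard Flink configuration keys, while the HDFS paths are assumptions following this tutorial's node001 layout (the archive directory must exist in HDFS before starting):

```yaml
# Directory where the JobManager archives completed jobs
jobmanager.archive.fs.dir: hdfs://node001:8020/logs/flink-job

# Where the History Server looks for archives, and how often (ms) it rescans
historyserver.archive.fs.dir: hdfs://node001:8020/logs/flink-job
historyserver.archive.fs.refresh-interval: 5000

# Address and port of the History Server WebUI (8082 is the default port)
historyserver.web.address: node001
historyserver.web.port: 8082
```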

img

Start the historyserver:

[atguigu@node001 flink-1.17.0]$ bin/historyserver.sh start

Starting historyserver daemon on host node001.

[atguigu@node001 flink-1.17.0]$ bin/flink run -t yarn-per-job -d -c com.atguigu.wc.WordCountStreamUnboundedDemo ../jar/FlinkTutorial-1.17-1.0-SNAPSHOT.jar

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/module/flink/flink-1.17.0/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]

img


links:

https://upward.blog.csdn.net/article/details/131215329

Origin: blog.csdn.net/a772304419/article/details/132624554