cluster role
cluster start
If Flink is deployed and accessed locally, no configuration is required and the cluster can be started directly.
If it is deployed on a server and needs to be accessed remotely, edit the Flink configuration file (conf/flink-conf.yaml in Flink 1.17) and change localhost to the server's IP address or 0.0.0.0.
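As a minimal sketch of that change, the shell function below rewrites bind-related entries that still point at localhost. The helper name set_bind_address and the choice of keys are assumptions made for illustration; check which keys your own configuration file contains before applying anything like this.

```shell
# Sketch: point Flink's bind addresses away from localhost.
# set_bind_address is a hypothetical helper, not part of Flink.
set_bind_address() {
  conf_file=$1   # e.g. conf/flink-conf.yaml
  address=$2     # server IP address, or 0.0.0.0 to listen on all interfaces

  # Keep an untouched copy of the file next to it.
  cp "$conf_file" "$conf_file.orig" || return 1

  # Rewrite only lines of the form "<key>: localhost" for the listed keys.
  sed -E "s/^(rest\.bind-address|jobmanager\.bind-host|taskmanager\.bind-host): localhost[[:space:]]*\$/\1: ${address}/" \
    "$conf_file.orig" > "$conf_file"
}

# Usage (run from the Flink home directory, then restart the cluster):
#   set_bind_address conf/flink-conf.yaml 0.0.0.0
```

The function leaves every other line unchanged, so it can be run on a copy of the file first to inspect the result.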
node server | hadoop102 | hadoop103 | hadoop104 |
---|---|---|---|
Role | JobManager, TaskManager | TaskManager | TaskManager |
[atguigu@node001 module]$ cd flink
[atguigu@node001 flink]$ cd flink-1.17.0/
[atguigu@node001 flink-1.17.0]$ bin/start-cluster.sh
Starting cluster.
Starting standalonesession daemon on host node001.
Starting taskexecutor daemon on host node001.
Starting taskexecutor daemon on host node002.
Starting taskexecutor daemon on host node003.
[atguigu@node001 flink-1.17.0]$ jpsall
================ node001 ================
3408 Jps
2938 StandaloneSessionClusterEntrypoint
3276 TaskManagerRunner
================ node002 ================
2852 TaskManagerRunner
2932 Jps
================ node003 ================
2864 TaskManagerRunner
2944 Jps
[atguigu@node001 flink-1.17.0]$
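The process check in the transcript above can also be scripted. The sketch below reads jps output on standard input and reports whether the expected daemons are listed; has_flink_daemons is a hypothetical helper name, and jpsall in the transcript appears to be a site-specific wrapper, so plain jps is used here.

```shell
# Sketch: verify that the expected Flink daemons appear in jps output.
# has_flink_daemons is a hypothetical helper; it reads jps output on stdin.
has_flink_daemons() {
  jps_output=$(cat)
  for name in "$@"; do
    # jps prints "<pid> <main class>"; compare the class name in column 2.
    if ! printf '%s\n' "$jps_output" |
         awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
      echo "missing: $name"
      return 1
    fi
  done
  echo "ok"
}

# Usage on the JobManager host (requires a JDK for jps):
#   jps | has_flink_daemons StandaloneSessionClusterEntrypoint TaskManagerRunner
```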
Submit a job from the Web UI
Maven plugin configuration for packaging the JAR
<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>3.2.4</version>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>
                    <configuration>
                        <artifactSet>
                            <excludes>
                                <exclude>com.google.code.findbugs:jsr305</exclude>
                                <exclude>org.slf4j:*</exclude>
                                <exclude>log4j:*</exclude>
                            </excludes>
                        </artifactSet>
                        <filters>
                            <filter>
                                <!-- Do not copy the signatures in the META-INF folder.
                                Otherwise, this might cause SecurityExceptions when using the JAR. -->
                                <artifact>*:*</artifact>
                                <excludes>
                                    <exclude>META-INF/*.SF</exclude>
                                    <exclude>META-INF/*.DSA</exclude>
                                    <exclude>META-INF/*.RSA</exclude>
                                </excludes>
                            </filter>
                        </filters>
                        <transformers combine.children="append">
                            <transformer
                                    implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer">
                            </transformer>
                        </transformers>
                    </configuration>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
Entry class to specify when submitting the JAR in the Web UI: com.atguigu.wc.WordCountStreamUnboundedDemo
Submit job from command line
bin/flink run -m node001:8081 -c com.atguigu.wc.WordCountStreamUnboundedDemo ../jar/FlinkTutorial-1.17-1.0-SNAPSHOT.jar
Connection successful
Last login: Fri Jun 16 14:44:01 2023 from 192.168.10.1
[atguigu@node001 ~]$ cd /opt/module/flink/flink-1.17.0/
[atguigu@node001 flink-1.17.0]$ cd bin
[atguigu@node001 bin]$ ./start-cluster.sh
Starting cluster.
Starting standalonesession daemon on host node001.
Starting taskexecutor daemon on host node001.
Starting taskexecutor daemon on host node002.
Starting taskexecutor daemon on host node003.
[atguigu@node001 bin]$ jpsall
================ node001 ================
2723 TaskManagerRunner
2855 Jps
2380 StandaloneSessionClusterEntrypoint
================ node002 ================
2294 TaskManagerRunner
2367 Jps
================ node003 ================
2292 TaskManagerRunner
2330 Jps
[atguigu@node001 bin]$ cd ..
[atguigu@node001 flink-1.17.0]$ bin/flink run -m node001:8081 -c com.atguigu.wc.WordCountStreamUnboundedDemo ../jar/FlinkTutorial-1.17-1.0-SNAPSHOT.jar
Job has been submitted with JobID 59ae9d6532523b0c48cdb8b6c9105356
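Once the client prints the JobID, it can be used to manage the job from the command line. The sketch below extracts the ID from the submission message; extract_job_id is a hypothetical helper, while flink list and flink cancel are standard Flink CLI actions. The usage lines assume the session cluster from the transcript is still running on node001:8081.

```shell
# Sketch: pull the JobID out of the client's submission message.
# extract_job_id is a hypothetical helper; it reads the client output on stdin.
extract_job_id() {
  sed -n 's/^Job has been submitted with JobID \([0-9a-f]\{32\}\)$/\1/p'
}

# Usage (-d submits in detached mode so the client returns immediately):
#   job_id=$(bin/flink run -d -m node001:8081 \
#       -c com.atguigu.wc.WordCountStreamUnboundedDemo \
#       ../jar/FlinkTutorial-1.17-1.0-SNAPSHOT.jar | extract_job_id)
#   bin/flink list -m node001:8081               # show running and scheduled jobs
#   bin/flink cancel -m node001:8081 "$job_id"   # cancel the submitted job
```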
Deployment Mode Introduction
In some scenarios there are specific requirements for how cluster resources are allocated and occupied. Flink provides three deployment modes to cover these scenarios: Session Mode, Per-Job Mode, and Application Mode.
They differ mainly in the life cycle of the cluster, in how resources are allocated, and in where the application's main method is executed: on the client or on the JobManager.
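To make the difference concrete, the sketch below maps each mode to the submission command used for it elsewhere in this document, using the YARN targets. The names submit_job and FLINK_BIN are hypothetical and exist only for illustration; the session branch assumes a session cluster is already running and reachable by the client.

```shell
# Sketch: how the submission command differs per deployment mode on YARN.
# submit_job and FLINK_BIN are hypothetical names used only for illustration.
submit_job() {
  mode=$1
  main_class=$2
  jar=$3
  flink=${FLINK_BIN:-bin/flink}
  case "$mode" in
    session)
      # Reuses an already running cluster; main() runs on the client.
      "$flink" run -c "$main_class" "$jar" ;;
    per-job)
      # Starts one cluster for this job; main() still runs on the client.
      "$flink" run -t yarn-per-job -c "$main_class" "$jar" ;;
    application)
      # Starts one cluster for this application; main() runs on the JobManager.
      "$flink" run-application -t yarn-application -c "$main_class" "$jar" ;;
    *)
      echo "unknown mode: $mode" >&2
      return 1 ;;
  esac
}

# Usage:
#   submit_job application com.atguigu.wc.WordCountStreamUnboundedDemo ./FlinkTutorial-1.17-1.0-SNAPSHOT.jar
```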
Standalone mode of operation
Standalone mode runs on its own without relying on any external resource management platform. This independence comes at a price: if resources are insufficient or a failure occurs, there is no automatic scaling or reallocation of resources, and the situation must be handled manually. Standalone mode is therefore generally used only for development and testing, or in scenarios with very few jobs.
Short version of the commands (application mode on a standalone cluster; the job JAR is in the lib/ directory, as in the listing further down):
bin/standalone-job.sh start --job-classname com.atguigu.wc.WordCountStreamUnboundedDemo
bin/taskmanager.sh start
bin/taskmanager.sh stop
bin/standalone-job.sh stop
Detailed version with output:
[atguigu@node001 ~]$ cd /opt/module/flink/flink-1.17.0/bin
[atguigu@node001 bin]$ ./stop-cluster.sh
Stopping taskexecutor daemon (pid: 2723) on host node001.
Stopping taskexecutor daemon (pid: 2294) on host node002.
Stopping taskexecutor daemon (pid: 2292) on host node003.
Stopping standalonesession daemon (pid: 2380) on host node001.
[atguigu@node001 bin]$ jpsall
================ node001 ================
5120 Jps
================ node002 ================
3212 Jps
================ node003 ================
3159 Jps
[atguigu@node001 bin]$ ls
bash-java-utils.jar flink historyserver.sh kubernetes-session.sh sql-client.sh start-cluster.sh stop-zookeeper-quorum.sh zookeeper.sh
config.sh flink-console.sh jobmanager.sh kubernetes-taskmanager.sh sql-gateway.sh start-zookeeper-quorum.sh taskmanager.sh
find-flink-home.sh flink-daemon.sh kubernetes-jobmanager.sh pyflink-shell.sh standalone-job.sh stop-cluster.sh yarn-session.sh
[atguigu@node001 bin]$ cd ../lib/
[atguigu@node001 lib]$ ls
flink-cep-1.17.0.jar flink-dist-1.17.0.jar flink-table-api-java-uber-1.17.0.jar FlinkTutorial-1.17-1.0-SNAPSHOT.jar log4j-core-2.17.1.jar
flink-connector-files-1.17.0.jar flink-json-1.17.0.jar flink-table-planner-loader-1.17.0.jar log4j-1.2-api-2.17.1.jar log4j-slf4j-impl-2.17.1.jar
flink-csv-1.17.0.jar flink-scala_2.12-1.17.0.jar flink-table-runtime-1.17.0.jar log4j-api-2.17.1.jar
[atguigu@node001 lib]$ cd ../
[atguigu@node001 flink-1.17.0]$ bin/standalone-job.sh start --job-classname com.atguigu.wc.WordCountStreamUnboundedDemo
Starting standalonejob daemon on host node001.
[atguigu@node001 flink-1.17.0]$ jpsall
================ node001 ================
5491 StandaloneApplicationClusterEntryPoint
5583 Jps
================ node002 ================
3326 Jps
================ node003 ================
3307 Jps
[atguigu@node001 flink-1.17.0]$ bin/taskmanager.sh
Usage: taskmanager.sh (start|start-foreground|stop|stop-all)
[atguigu@node001 flink-1.17.0]$ bin/taskmanager.sh start
Starting taskexecutor daemon on host node001.
[atguigu@node001 flink-1.17.0]$ jpsall
================ node001 ================
5491 StandaloneApplicationClusterEntryPoint
5995 Jps
5903 TaskManagerRunner
================ node002 ================
3363 Jps
================ node003 ================
3350 Jps
[atguigu@node001 flink-1.17.0]$ bin/taskmanager.sh stop
Stopping taskexecutor daemon (pid: 5903) on host node001.
[atguigu@node001 flink-1.17.0]$ bin/standalone-job.sh stop
No standalonejob daemon (pid: 5491) is running anymore on node001.
[atguigu@node001 flink-1.17.0]$ xcall jps
=============== node001 ===============
6682 Jps
=============== node002 ===============
3429 Jps
=============== node003 ===============
3419 Jps
YARN running mode: environment preparation
Deployment on YARN works as follows: the client submits the Flink application to YARN's ResourceManager, which requests containers from YARN's NodeManagers. Flink deploys JobManager and TaskManager instances in these containers, thereby starting the cluster. Flink then dynamically allocates TaskManager resources according to the number of slots required by the jobs running on the JobManager.
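For the client to reach YARN, Flink needs to find the Hadoop configuration and jars through environment variables. The transcript below sources /etc/profile.d/my_env.sh, whose contents are not shown in this document, so the following is only a sketch of what such a file commonly contains. The install path is taken from the log output later in this document and is an assumption about the layout.

```shell
# Sketch of the Hadoop-related environment that Flink on YARN relies on.
# The install path is an assumption taken from the logs; adjust it to your layout.
export HADOOP_HOME=/opt/module/hadoop/hadoop-3.1.3
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop"

# Flink locates the Hadoop jars through HADOOP_CLASSPATH.
# Set it only when the hadoop command is actually available on this host.
if command -v hadoop >/dev/null 2>&1; then
  HADOOP_CLASSPATH=$(hadoop classpath)
  export HADOOP_CLASSPATH
fi
```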
[atguigu@node001 flink-1.17.0]$ source /etc/profile.d/my_env.sh
[atguigu@node001 flink-1.17.0]$ myhadoop.sh s
Input Args Error...
[atguigu@node001 flink-1.17.0]$ myhadoop.sh start
================ Starting hadoop cluster ================
---------------- Starting hdfs ----------------
Starting namenodes on [node001]
Starting datanodes
Starting secondary namenodes [node003]
--------------- Starting yarn ---------------
Starting resourcemanager
Starting nodemanagers
--------------- Starting historyserver ---------------
[atguigu@node001 flink-1.17.0]$ jpsall
================ node001 ================
9200 JobHistoryServer
8416 NameNode
8580 DataNode
9284 Jps
8983 NodeManager
================ node002 ================
3892 ResourceManager
3690 DataNode
4365 Jps
4015 NodeManager
================ node003 ================
3680 DataNode
3778 SecondaryNameNode
3911 NodeManager
4044 Jps
[atguigu@node001 flink-1.17.0]$
YARN running mode: session mode
The command yarn-session.sh -d -nm test starts an Apache Flink session on YARN. The meaning of the script and of each option is as follows:
yarn-session.sh
: The script provided by Apache Flink to start a session on YARN.
-d
: Starts the session in detached mode. In detached mode, the session runs in the background and the script returns immediately.
-nm test
: Specifies the name of the session. In this example, the name is set to "test".
Taken together, the command starts an Apache Flink session named "test" on YARN in detached mode. Once started, the session runs in the background and the command line prompt returns immediately, so you can continue with other work.
[atguigu@node001 bin]$ ./yarn-session.sh --help
[atguigu@node001 bin]$ ./yarn-session.sh
[atguigu@node001 bin]$ ./yarn-session.sh -d -nm test
YARN running mode: stopping a session
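One way to stop a detached session is to attach to it by its YARN application ID and send it the stop command, which is the approach yarn-session.sh itself suggests when it starts. The sketch below wraps that in a function; stop_yarn_session and YARN_SESSION_BIN are hypothetical names, and the application ID has to be taken from the session's startup output or from the YARN Web UI.

```shell
# Sketch: stop a detached Flink YARN session by its YARN application ID.
# stop_yarn_session and YARN_SESSION_BIN are hypothetical names for illustration.
stop_yarn_session() {
  app_id=$1
  case "$app_id" in
    application_*_*) ;;   # YARN IDs look like application_<clusterTimestamp>_<sequence>
    *)
      echo "not a YARN application ID: $app_id" >&2
      return 1 ;;
  esac
  # Attach to the running session and send it "stop" on standard input.
  echo "stop" | "${YARN_SESSION_BIN:-bin/yarn-session.sh}" -id "$app_id"
}

# Usage (the ID shown is a placeholder):
#   stop_yarn_session application_XXXXX_XXX
# Alternative using the YARN CLI directly:
#   yarn application -kill application_XXXXX_XXX
```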
Single job mode deployment
On YARN, because an external platform handles resource scheduling, we can also submit a single job directly to YARN, which starts a Flink cluster dedicated to that job.
Stop job:
YARN running mode: single job mode
Single job mode deployment
(1) Execute the command to submit the job
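The sketch below shows the submit, list, and cancel commands for single job (per-job) mode on YARN. The function names and the FLINK_BIN variable are hypothetical and only wrap the Flink CLI; the -t yarn-per-job target and the -Dyarn.application.id option are Flink CLI options. Cancelling the job in this mode also stops its cluster.

```shell
# Sketch: submit, list and cancel a job in YARN per-job mode.
# The function names and FLINK_BIN are hypothetical wrappers around the Flink CLI.
flink_cmd() { "${FLINK_BIN:-bin/flink}" "$@"; }

# args: <main class> <jar>; -d submits in detached mode
submit_per_job() {
  flink_cmd run -d -t yarn-per-job -c "$1" "$2"
}

# args: <YARN application id>
list_per_job() {
  flink_cmd list -t yarn-per-job -Dyarn.application.id="$1"
}

# args: <YARN application id> <Flink job id>
cancel_per_job() {
  flink_cmd cancel -t yarn-per-job -Dyarn.application.id="$1" "$2"
}

# Usage (IDs are placeholders):
#   submit_per_job com.atguigu.wc.WordCountStreamUnboundedDemo ../jar/FlinkTutorial-1.17-1.0-SNAPSHOT.jar
#   list_per_job application_XXXX_YY
#   cancel_per_job application_XXXX_YY <jobId>
```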
YARN running mode: application mode
Application mode is just as simple and works much like single job mode: run the flink run-application command directly. For example:
bin/flink run-application -t yarn-application -c com.atguigu.wc.WordCountStreamUnboundedDemo ./FlinkTutorial-1.17-1.0-SNAPSHOT.jar
[atguigu@node001 flink-1.17.0]$ bin/flink run-application -t yarn-application -c com.atguigu.wc.WordCountStreamUnboundedDemo ./FlinkTutorial-1.17-1.0-SNAPSHOT.jar
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/module/flink/flink-1.17.0/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/module/hadoop/hadoop-3.1.3/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2023-06-19 14:31:05,693 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli [] - Found Yarn properties file under /tmp/.yarn-properties-atguigu.
2023-06-19 14:31:05,693 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli [] - Found Yarn properties file under /tmp/.yarn-properties-atguigu.
2023-06-19 14:31:06,142 WARN org.apache.flink.yarn.configuration.YarnLogConfigUtil [] - The configuration directory ('/opt/module/flink/flink-1.17.0/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2023-06-19 14:31:06,632 INFO org.apache.hadoop.yarn.client.RMProxy [] - Connecting to ResourceManager at node002/192.168.10.102:8032
2023-06-19 14:31:07,195 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
When the JARs are stored on HDFS:
[atguigu@node001 flink-1.17.0]$ bin/flink run-application -t yarn-application -Dyarn.provided.lib.dirs="hdfs://node001:8020/flink-dist" -c com.atguigu.wc.WordCountStreamUnboundedDemo hdfs://node001:8020/flink-jars/FlinkTutorial-1.17-1.0-SNAPSHOT.jar
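The command above assumes that Flink's own libraries and the job JAR have already been uploaded to HDFS. A minimal sketch of that preparation step follows; upload_flink_to_hdfs and HADOOP_BIN are hypothetical names, and the /flink-dist and /flink-jars directories simply mirror the paths used in the command.

```shell
# Sketch: stage Flink's libraries and the job JAR on HDFS.
# upload_flink_to_hdfs and HADOOP_BIN are hypothetical names for illustration.
upload_flink_to_hdfs() {
  flink_home=$1   # e.g. /opt/module/flink/flink-1.17.0
  job_jar=$2      # e.g. FlinkTutorial-1.17-1.0-SNAPSHOT.jar
  hadoop=${HADOOP_BIN:-hadoop}

  # Create the target directories, then copy lib/ and plugins/ and the job JAR.
  "$hadoop" fs -mkdir -p /flink-dist /flink-jars &&
  "$hadoop" fs -put "$flink_home/lib" "$flink_home/plugins" /flink-dist &&
  "$hadoop" fs -put "$job_jar" /flink-jars
}

# Usage:
#   upload_flink_to_hdfs /opt/module/flink/flink-1.17.0 FlinkTutorial-1.17-1.0-SNAPSHOT.jar
```

Keeping the Flink libraries on HDFS means the client does not have to upload them again on every submission.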
K8S running mode
Containerized deployment is widely used in the industry, and running on Docker images makes applications easier to manage and maintain. The most popular container management tool is Kubernetes (k8s), and recent Flink versions also support deployment on k8s. The basic principle is similar to that of YARN. For the specific configuration, refer to the official documentation; it is not covered in detail here.
History Server
Once the cluster running a Flink job is stopped, only the logs on YARN or on the local disk remain, and the Web UI as it was before the job failed can no longer be viewed, so it is hard to know exactly what happened at the moment of failure. Without metrics monitoring, problems can only be analyzed and located through logs. If the earlier Web UI could be restored, some problems could be found and located through it.
Flink provides a History Server for querying the statistics of completed jobs after the corresponding Flink cluster has been shut down. Normally the Web UI statistics are available only while the job is running. Through the History Server, we can query the statistics of completed jobs, whether they exited normally or abnormally.
The History Server also exposes a REST API that accepts HTTP requests and responds with JSON data. After a Flink job stops, the JobManager archives the statistics of the completed job, and the History Server process can then serve them, for example the last checkpoint and the configuration in effect while the job was running.
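Before the History Server is useful, the JobManager has to be told where to archive finished jobs and the History Server where to read them. The sketch below appends such settings to a configuration file. The configuration keys are Flink's; the helper name configure_history_server, the host node001, the port 8082, and the HDFS directory in the usage lines are assumptions chosen to match the cluster in this document.

```shell
# Sketch: append History Server settings to a Flink configuration file.
# configure_history_server is a hypothetical helper; host and port are assumptions.
configure_history_server() {
  conf_file=$1     # e.g. conf/flink-conf.yaml
  archive_dir=$2   # e.g. hdfs://node001:8020/logs/flink-job
  cat >> "$conf_file" <<EOF
# Where the JobManager archives finished jobs
jobmanager.archive.fs.dir: $archive_dir
# Where the History Server looks for archives, and how often (milliseconds)
historyserver.archive.fs.dir: $archive_dir
historyserver.archive.fs.refresh-interval: 5000
# Address and port of the History Server web UI and REST API
historyserver.web.address: node001
historyserver.web.port: 8082
EOF
}

# Usage (the archive directory must exist on HDFS before jobs are archived):
#   hadoop fs -mkdir -p /logs/flink-job
#   configure_history_server conf/flink-conf.yaml hdfs://node001:8020/logs/flink-job
# After starting the History Server, completed jobs can be queried over REST:
#   curl http://node001:8082/jobs/overview
```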
Start the historyserver:
[atguigu@node001 flink-1.17.0]$ bin/historyserver.sh start
Starting historyserver daemon on host node001.
[atguigu@node001 flink-1.17.0]$ bin/flink run -t yarn-per-job -d -c com.atguigu.wc.WordCountStreamUnboundedDemo ../jar/FlinkTutorial-1.17-1.0-SNAPSHOT.jar
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/module/flink/flink-1.17.0/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
links: