[Introduction to Flink] Flink installation and deployment on CentOS (standalone mode)


Basic concepts

Running a Flink application is actually quite simple, but before running one it is worth understanding the components of the Flink runtime, because they affect how a Flink application is configured. Figure 1 shows a data processing program written with the DataStream API. Operators in the DAG that cannot be chained together are separated into different tasks; a task is the smallest unit of resource scheduling in Flink.


Figure 1. Parallel Dataflows

As shown in Figure 2 below, a running Flink setup consists of two types of processes:

JobManager (also known as JobMaster): coordinates the distributed execution of tasks, including scheduling tasks, coordinating checkpoint creation, and coordinating the recovery of each task from a checkpoint on job failover.

TaskManager (also known as Worker): executes the tasks of a dataflow, including allocating memory buffers and transmitting data streams.


Figure 2. Flink Runtime architecture diagram

As shown in Figure 3, a Task Slot is the smallest resource allocation unit within a TaskManager. The number of Task Slots in a TaskManager determines how many tasks it can process concurrently. Note that one Task Slot can execute multiple Operators, and these Operators are generally chained together.

Figure 3. Process

Installation and deployment

First install and deploy Java 8 on CentOS.
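Flink 1.12 runs on Java 8 (or 11), so it is worth checking the JDK before continuing. A quick sketch (the yum package name in the comment is one common choice, not the only one):

```shell
# Verify that a JDK is available on the PATH; Flink 1.12 runs on Java 8 or 11.
if command -v java >/dev/null 2>&1; then
  java -version
else
  # On CentOS, one option is: sudo yum install -y java-1.8.0-openjdk
  echo "Java not found - install a JDK 8 first"
fi
```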

Download the flink installation package

cd /opt/tools

# Download Flink 1.12.0
wget http://mirrors.estointernet.in/apache/flink/flink-1.12.0/flink-1.12.0-bin-scala_2.11.tgz

# Extract the archive
tar -zxvf flink-1.12.0-bin-scala_2.11.tgz -C /opt/modules/

Run Flink in standalone mode

The easiest way to run Flink applications is to run in standalone mode.
Start the cluster:

./bin/start-cluster.sh

Open http://127.0.0.1:8081/ to see Flink's web interface.
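The web UI is served by a REST API on the same port, which is convenient for checking the cluster from scripts. A minimal sketch, assuming the default rest port 8081:

```shell
# Ask the REST API for a cluster overview (TaskManagers, slots, jobs).
curl -s http://127.0.0.1:8081/overview || echo "cluster not reachable on port 8081"
```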

Try to submit the Word Count task:

./bin/flink run examples/streaming/WordCount.jar

You can explore the information shown in the web UI on your own. For example, look at the TaskManager's stdout log to see the result of the WordCount computation.
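The same stdout output can also be read from the command line, a sketch assuming you are in the Flink home directory and the cluster has been started as above:

```shell
# The TaskManager's stdout (including WordCount's output) lands in the
# .out file of the taskexecutor process under the log directory.
tail -n 20 log/flink-*-taskexecutor-*.out 2>/dev/null || echo "no taskexecutor .out file found"
```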
We can also specify our own local file as input through the "--input" parameter, and an output file through "--output", and then execute:

./bin/flink run examples/streaming/WordCount.jar --input ${your_source_file} --output ${your_sink_file}

Stop the cluster:

./bin/stop-cluster.sh

Common configuration introduction

conf/slaves
conf/slaves is used to configure where TaskManagers are deployed. With the default configuration, only one TaskManager process is started; to add another TaskManager process, simply add another line containing "localhost" to the file.
You can also add a new TaskManager directly with the "./bin/taskmanager.sh start" command:
./bin/taskmanager.sh start|start-foreground|stop|stop-all
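For example, to add one more TaskManager to a running cluster and verify it (a sketch; jps ships with the JDK and must be run from the Flink home directory):

```shell
# Start one additional TaskManager process...
./bin/taskmanager.sh start
# ...and confirm that another TaskManagerRunner JVM appeared.
jps | grep TaskManagerRunner || echo "no TaskManagerRunner found"
```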
conf/flink-conf.yaml
conf/flink-conf.yaml is used to configure the runtime parameters of the JobManager and TaskManagers. Commonly used settings are:

# The total process memory size for the JobManager.
#
# Note this accounts for all memory usage within the JobManager process, including JVM metaspace and other overhead.
jobmanager.memory.process.size: 1600m


# The total process memory size for the TaskManager.
#
# Note this accounts for all memory usage within the TaskManager process, including JVM metaspace and other overhead.
taskmanager.memory.process.size: 1728m

# To exclude JVM metaspace and overhead, please, use total Flink memory size instead of 'taskmanager.memory.process.size'.
# It is not recommended to set both 'taskmanager.memory.process.size' and Flink memory.
# taskmanager.memory.flink.size: 1280m

# The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline.
taskmanager.numberOfTaskSlots: 1

# The parallelism used for programs that did not specify and other parallelism.
parallelism.default: 1
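These are plain key/value pairs, so they can also be changed from a script. A sketch against a throwaway copy (in a real setup you would edit conf/flink-conf.yaml itself, then restart the cluster for the change to take effect):

```shell
# Create a small stand-in for flink-conf.yaml...
cat > /tmp/flink-conf-demo.yaml <<'EOF'
taskmanager.numberOfTaskSlots: 1
parallelism.default: 1
EOF
# ...and raise the slot count to 4 in place.
sed -i 's/^taskmanager.numberOfTaskSlots: .*/taskmanager.numberOfTaskSlots: 4/' /tmp/flink-conf-demo.yaml
grep numberOfTaskSlots /tmp/flink-conf-demo.yaml   # -> taskmanager.numberOfTaskSlots: 4
```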

Log viewing and configuration

The startup logs of the JobManager and the TaskManagers can be found in the log subdirectory of the Flink distribution directory. The files prefixed with "flink-${user}-standalonesession-${id}-${hostname}" correspond to the JobManager's output. There are three files:

  • flink-${user}-standalonesession-${id}-${hostname}.log: Log output in the code

  • flink-${user}-standalonesession-${id}-${hostname}.out: Stdout output during process execution

  • flink-${user}-standalonesession-${id}-${hostname}-gc.log: JVM GC log

The files prefixed with "flink-${user}-taskexecutor-${id}-${hostname}" in the log directory correspond to the TaskManager's output, and likewise comprise three files matching the JobManager's output.

The log configuration file is in the conf subdirectory of the Flink binary directory, where:

  • log4j-cli.properties: Log configuration used when using the Flink command line, such as executing the "flink run" command

  • log4j-yarn-session.properties: Log configuration used by the command line when starting a session with yarn-session.sh

  • log4j.properties: Log configuration used on the JobManager and TaskManager, in both Standalone and Yarn modes

Each of these three "log4j.*properties" files has a corresponding "logback.*xml" file. If you want to use Logback instead, you only need to delete the corresponding "log4j.*properties" file. The correspondence is as follows:
log4j-cli.properties -> logback-console.xml

log4j-yarn-session.properties -> logback-yarn.xml

log4j.properties -> logback.xml
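The switch amounts to a couple of file operations. A sketch in a scratch directory (the real files live under conf/ in the Flink directory); moving the log4j files aside instead of deleting them keeps the change reversible:

```shell
# Recreate the six config files in a scratch directory for demonstration.
demo=/tmp/flink-logging-demo
mkdir -p "$demo" && cd "$demo"
touch log4j-cli.properties log4j-yarn-session.properties log4j.properties \
      logback-console.xml logback-yarn.xml logback.xml
# Move the log4j configs out of the way so only the Logback files remain.
mkdir -p log4j-backup
mv log4j-cli.properties log4j-yarn-session.properties log4j.properties log4j-backup/
ls   # only the logback-*.xml files remain active
```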

It should be noted that the "${id}" in both "flink-${user}-standalonesession-${id}-${hostname}" and "flink-${user}-taskexecutor-${id}-${hostname}" indicates this process's startup order among all processes of that role (JobManager or TaskManager) on the machine; it starts from 0 by default.

Explore further

Try executing the "./bin/start-cluster.sh" command repeatedly, then look at the web page (or run the jps command) to see what happens. You can read the startup script to analyze the reason. Then repeatedly execute "./bin/stop-cluster.sh" and observe what happens after each execution.

[fuyun@bigdata-training tools]$ /opt/modules/flink-1.12.0/bin/start-cluster.sh 
Starting cluster.
Starting standalonesession daemon on host bigdata-training.fuyun.com.
Starting taskexecutor daemon on host bigdata-training.fuyun.com.
[fuyun@bigdata-training tools]$ 
[fuyun@bigdata-training tools]$ 
[fuyun@bigdata-training tools]$ /opt/modules/flink-1.12.0/bin/start-cluster.sh 
Starting cluster.
[INFO] 1 instance(s) of standalonesession are already running on bigdata-training.fuyun.com.
Starting standalonesession daemon on host bigdata-training.fuyun.com.
[INFO] 1 instance(s) of taskexecutor are already running on bigdata-training.fuyun.com.
Starting taskexecutor daemon on host bigdata-training.fuyun.com.
[fuyun@bigdata-training tools]$ 
[fuyun@bigdata-training tools]$ 
[fuyun@bigdata-training tools]$ /opt/modules/flink-1.12.0/bin/start-cluster.sh 
Starting cluster.
[INFO] 1 instance(s) of standalonesession are already running on bigdata-training.fuyun.com.
Starting standalonesession daemon on host bigdata-training.fuyun.com.
[INFO] 2 instance(s) of taskexecutor are already running on bigdata-training.fuyun.com.
Starting taskexecutor daemon on host bigdata-training.fuyun.com.
[fuyun@bigdata-training tools]$ 
[fuyun@bigdata-training tools]$ /opt/modules/flink-1.12.0/bin/stop-cluster.sh 
Stopping taskexecutor daemon (pid: 15368) on host bigdata-training.fuyun.com.
No standalonesession daemon (pid: 15053) is running anymore on bigdata-training.fuyun.com.
[fuyun@bigdata-training tools]$ 
[fuyun@bigdata-training tools]$ /opt/modules/flink-1.12.0/bin/stop-cluster.sh 
Stopping taskexecutor daemon (pid: 14508) on host bigdata-training.fuyun.com.
No standalonesession daemon (pid: 14132) is running anymore on bigdata-training.fuyun.com.

On repeated startup, the script reports how many standalonesession and taskexecutor instances are already running. Each start adds one more taskexecutor, but never a second standalonesession; there is always exactly one. Correspondingly, jps shows an additional TaskManagerRunner process after each start, while there is always a single StandaloneSessionClusterEntrypoint process. On repeated stops, the script reports which taskexecutor and standalonesession pids it is stopping.

[fuyun@bigdata-training flink-1.12.0]$ jps
13284 StandaloneSessionClusterEntrypoint
39348 DataNode
38374 NodeManager
6007 JobHistoryServer
39289 NameNode
13595 TaskManagerRunner
14508 TaskManagerRunner
16364 QuorumPeerMain
2894 RunJar
14671 Jps
38319 ResourceManager
[fuyun@bigdata-training flink-1.12.0]$ jps
13284 StandaloneSessionClusterEntrypoint
39348 DataNode
38374 NodeManager
6007 JobHistoryServer
15368 TaskManagerRunner
39289 NameNode
13595 TaskManagerRunner
14508 TaskManagerRunner
16364 QuorumPeerMain
2894 RunJar
15439 Jps
38319 ResourceManager

Looking at the files in the log directory, the id after standalonesession and taskexecutor increases with each start:

[fuyun@bigdata-training flink-1.12.0]$ ls log/
flink-fuyun-standalonesession-0-bigdata-training.fuyun.com.log  flink-fuyun-standalonesession-2-bigdata-training.fuyun.com.log  flink-fuyun-taskexecutor-1-bigdata-training.fuyun.com.log
flink-fuyun-standalonesession-0-bigdata-training.fuyun.com.out  flink-fuyun-standalonesession-2-bigdata-training.fuyun.com.out  flink-fuyun-taskexecutor-1-bigdata-training.fuyun.com.out
flink-fuyun-standalonesession-1-bigdata-training.fuyun.com.log  flink-fuyun-taskexecutor-0-bigdata-training.fuyun.com.log       flink-fuyun-taskexecutor-2-bigdata-training.fuyun.com.log
flink-fuyun-standalonesession-1-bigdata-training.fuyun.com.out  flink-fuyun-taskexecutor-0-bigdata-training.fuyun.com.out       flink-fuyun-taskexecutor-2-bigdata-training.fuyun.com.out

If you repeat the start-and-stop cycle, you will find that a numeric suffix (e.g. ".1") is appended to the standalonesession and taskexecutor log files, and the number increases with each repetition:

[fuyun@bigdata-training flink-1.12.0]$ ls log/
flink-fuyun-standalonesession-0-bigdata-training.fuyun.com.log    flink-fuyun-standalonesession-0-bigdata-training.fuyun.com.out  flink-fuyun-taskexecutor-0-bigdata-training.fuyun.com.log.1
flink-fuyun-standalonesession-0-bigdata-training.fuyun.com.log.1  flink-fuyun-taskexecutor-0-bigdata-training.fuyun.com.log       flink-fuyun-taskexecutor-0-bigdata-training.fuyun.com.out


Origin blog.csdn.net/lz6363/article/details/112468441