Flink 1.17 Tutorial: Relationship between Task Slots and Parallelism

Task Slots

In Apache Flink, a task slot is a unit of resources in a TaskManager that can be used to run parallel tasks. Each task slot represents a fixed share of the TaskManager's resources and can be viewed as a processing unit that executes one parallel part of a job.

In layman's terms, a task slot can be imagined as a workbench, and all workbenches can be in use at the same time. The number of task slots therefore determines how many tasks can be executed in parallel.

The role and application scenarios of task slots:

  1. Parallel execution: Task slots allow multiple tasks to be executed in parallel at the same time. Each task slot can independently execute a task or operator, thereby effectively utilizing computing resources and improving job parallelism and overall processing capability.
  2. Task allocation and load balancing: Task slots can be used to allocate different tasks or operators to different resources. By properly allocating task slots, load balancing can be achieved to ensure that tasks are evenly distributed among available resources, avoiding resource waste and bottlenecks.
  3. Fault tolerance and high availability: Task slots also support fault tolerance and high availability. If a task or operator running in a task slot fails, the system can reschedule it onto another available task slot, which allows the job to recover and keep running.

In general, task slots are the resource units in which Apache Flink executes tasks in parallel. They raise the parallelism and overall processing capacity of jobs, and they underpin task distribution, load balancing, fault tolerance, and high availability.

Slot sharing groups

[Figure: parallel subtasks of different operators sharing slots]

By default, Flink allows subtasks to share slots. If we keep the parallelism of the sink task at 1 and set the global parallelism to 6 when the job is submitted, then the first two task nodes each have 6 parallel subtasks, and the entire stream processing program has 6 + 6 + 1 = 13 subtasks. As shown in the figure above, as long as they belong to the same job, parallel subtasks of different task nodes (operators) can run in the same slot. The six parallel subtasks of the first task node, source→map, must be assigned to six different slots, while each parallel subtask of the second task node, keyBy/window/apply, can share a slot with a subtask of the first task node.
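The arithmetic in this example can be checked with a small sketch in plain Java, with no Flink dependency. The class and method names are hypothetical, and the model assumes that every task node stays in the default slot sharing group, so the number of slots needed equals the largest parallelism among the task nodes.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Flink: models the counting rule described above.
final class SubtaskCount {

    // Total subtasks = sum of the parallelism of every task node.
    static int totalSubtasks(int... parallelisms) {
        return Arrays.stream(parallelisms).sum();
    }

    // Assumption: all task nodes are in one slot sharing group, so a slot can hold
    // one subtask of each node and the job needs max(parallelism) slots.
    static int slotsNeeded(int... parallelisms) {
        return Arrays.stream(parallelisms).max().orElse(0);
    }

    public static void main(String[] args) {
        // source->map (6), keyBy/window/apply (6), sink (1)
        int[] p = {6, 6, 1};
        System.out.println("subtasks = " + totalSubtasks(p)); // subtasks = 13
        System.out.println("slots = " + slotsNeeded(p));      // slots = 6
    }
}
```

Under these assumptions the 13 subtasks fit into 6 slots, because the slot count follows the widest task node rather than the total number of subtasks.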

package com.atguigu.wc;
 
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;
 
/**
 * WordCount with the DataStream API: reads from a socket (unbounded stream).
 *
 * @author
 * @version 1.0
 */
public class SlotSharingGroupDemo {

    public static void main(String[] args) throws Exception {

        // 1. Create the execution environment.
        // StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // When running in IDEA, this variant also serves the web UI; it is generally used for local testing
        // and requires the flink-runtime-web dependency.
        StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(new Configuration());

        // When running in IDEA without setting the parallelism, the default is the number of CPU threads of the machine.
        env.setParallelism(1);

        // 2. Read data from a socket.
        DataStreamSource<String> socketDS = env.socketTextStream("hadoop102", 7777);

        // 3. Process the data: split, transform, group, aggregate.
        SingleOutputStreamOperator<Tuple2<String, Integer>> sum = socketDS
                .flatMap(
                        (String value, Collector<String> out) -> {
                            String[] words = value.split(" ");
                            for (String word : words) {
                                out.collect(word);
                            }
                        }
                )
                .returns(Types.STRING)
                .map(word -> Tuple2.of(word, 1))
                .returns(Types.TUPLE(Types.STRING, Types.INT))
                // Place map in slot sharing group "aaa"; the source and flatMap stay in "default".
                .slotSharingGroup("aaa")
                .keyBy(value -> value.f0)
                .sum(1);

        // 4. Print the result.
        sum.print();

        // 5. Execute the job.
        env.execute();
    }
}
 
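/**
 1. Slot features:
    1) Memory is divided evenly among slots and isolated between them; CPU is not isolated.
    2) Slots can be shared:
          Within the same job, subtasks of different operators can share one slot and run in it at the same time,
          provided that they belong to the same slot sharing group. The default group is "default".
 2. The relationship between the number of slots and parallelism:
    1) The slot count is a static concept: it is the upper limit on concurrency.
       Parallelism is a dynamic concept: it is how many slots a running job actually occupies.
    2) Requirement: a job can run only if the number of slots >= the job parallelism (the maximum operator parallelism).
       Note: in YARN mode, resources are requested dynamically:
         --> number of TaskManagers requested = job parallelism / slots per TaskManager, rounded up
       For example, a session starts with 0 TaskManagers and 0 slots:
         --> submit a job with parallelism 10
            --> 10/3, rounded up, means 4 TaskManagers are requested
            --> 10 slots are used and 2 slots remain
 */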
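To see what slotSharingGroup("aaa") changes in the demo above, here is a minimal sketch in plain Java. It is a model, not Flink code: the class and method names are hypothetical. It assumes that the operators after map (the keyed sum and the print sink) inherit group "aaa" from their input, which is how Flink assigns the group when all inputs of an operator are in the same group, and that the slots a job needs are the sum, over all sharing groups, of the largest parallelism in each group.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Flink: estimates the slots a job needs
// when its operators are spread over several slot sharing groups.
final class SlotSharingGroups {

    // groups[i] is the sharing group of operator i, parallelisms[i] its parallelism.
    static int slotsNeeded(String[] groups, int[] parallelisms) {
        Map<String, Integer> maxPerGroup = new HashMap<>();
        for (int i = 0; i < groups.length; i++) {
            // Keep the largest parallelism seen so far in each group.
            maxPerGroup.merge(groups[i], parallelisms[i], Integer::max);
        }
        // Different groups never share a slot, so the per-group maxima add up.
        return maxPerGroup.values().stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        // Operators of the demo: socket source, flatMap, map, sum, print
        String[] groups = {"default", "default", "aaa", "aaa", "aaa"};
        int[] parallelisms = {1, 1, 1, 1, 1};
        System.out.println(slotsNeeded(groups, parallelisms)); // 2
    }
}
```

Under these assumptions the demo occupies 2 slots at parallelism 1, whereas the same job without the slotSharingGroup call would fit in 1 slot.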

The relationship between slots and parallelism, with a demonstration

The relationship between task slots and parallelism

Both task slots and parallelism relate to the parallel execution of programs, but they are different concepts. Put simply, the number of task slots is a static concept: it is the concurrent execution capacity of a TaskManager, configured through the taskmanager.numberOfTaskSlots parameter. Parallelism is a dynamic concept: it is the concurrency actually used when a program runs, and its default is configured through the parallelism.default parameter.

1. Slot features:
   1) Memory is divided evenly among slots and isolated between them; CPU is not isolated.
   2) Slots can be shared: within the same job, subtasks of different operators can share one slot and run in it at the same time, provided that they belong to the same slot sharing group. The default group is "default".
2. The relationship between the number of slots and parallelism:
   1) The slot count is a static concept: it is the upper limit on concurrency. Parallelism is a dynamic concept: it is how many slots a running job actually occupies.
   2) Requirement: a job can run only if the number of slots >= the job parallelism (the maximum parallelism among its operators).
      Note: in YARN mode, resources are requested dynamically:
      --> number of TaskManagers requested = job parallelism / slots per TaskManager, rounded up
      For example, a session starts with 0 TaskManagers and 0 slots:
      --> submit a job with parallelism 10
      --> with 3 slots per TaskManager, 10/3 rounded up means 4 TaskManagers are requested
      --> 12 slots are provided; 10 are used and 2 remain
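The YARN example can be reproduced with integer arithmetic. The following is a sketch in plain Java with hypothetical class and method names; it only models the rounding rule stated above and does not call any Flink or YARN API.

```java
// Hypothetical helper, not part of Flink: models the TaskManager request rule in YARN mode.
final class YarnTaskManagerMath {

    // TaskManagers requested = ceil(jobParallelism / slotsPerTaskManager), in integer math.
    static int taskManagersNeeded(int jobParallelism, int slotsPerTaskManager) {
        return (jobParallelism + slotsPerTaskManager - 1) / slotsPerTaskManager;
    }

    // Slots that are provided by the requested TaskManagers but left unused by the job.
    static int idleSlots(int jobParallelism, int slotsPerTaskManager) {
        int provided = taskManagersNeeded(jobParallelism, slotsPerTaskManager) * slotsPerTaskManager;
        return provided - jobParallelism;
    }

    public static void main(String[] args) {
        // Job parallelism 10, 3 slots per TaskManager, as in the example above.
        System.out.println(taskManagersNeeded(10, 3)); // 4
        System.out.println(idleSlots(10, 3));          // 2
    }
}
```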

Parallelism refers to the number of parallel subtasks of an operator that execute simultaneously while a job runs. It determines how many partitions of the data stream the operator can process at the same time. Parallelism can be set at the operator level, at the execution environment level, at the client level (when the job is submitted), or at the system level (parallelism.default).
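When parallelism is configured in more than one place, Flink uses the most specific setting: operator level first, then execution environment level, then client level (the value given at submission), then the system default from parallelism.default. The sketch below models only this precedence in plain Java. The class and method names are hypothetical and it does not use the Flink API.

```java
import java.util.OptionalInt;

// Hypothetical helper, not part of Flink: models which parallelism setting wins.
final class ParallelismPrecedence {

    // Returns the first value that is set, from most to least specific.
    static int effective(OptionalInt operatorLevel,
                         OptionalInt environmentLevel,
                         OptionalInt clientLevel,
                         int systemDefault) {
        if (operatorLevel.isPresent()) {
            return operatorLevel.getAsInt();
        }
        if (environmentLevel.isPresent()) {
            return environmentLevel.getAsInt();
        }
        if (clientLevel.isPresent()) {
            return clientLevel.getAsInt();
        }
        return systemDefault;
    }

    public static void main(String[] args) {
        // Environment set to 1 in code, job submitted with parallelism 6, system default 2:
        // the environment-level value wins for operators without their own setting.
        System.out.println(effective(OptionalInt.empty(), OptionalInt.of(1), OptionalInt.of(6), 2)); // 1
        // Nothing set anywhere except the system default.
        System.out.println(effective(OptionalInt.empty(), OptionalInt.empty(), OptionalInt.empty(), 2)); // 2
    }
}
```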

A task slot is the execution resource unit in which tasks or operators run in parallel. Each task slot is a fixed share of the resources of a TaskManager.

The relationship is as follows:

  1. Parallelism and number of task slots: The parallelism a job can use is bounded by the number of task slots. If the job parallelism is greater than the number of available task slots, the job cannot obtain enough slots and cannot run (in YARN mode, additional TaskManagers are requested instead). If the parallelism is less than or equal to the number of task slots, each parallel subtask of an operator runs in its own slot, and any remaining slots stay idle.
  2. Load balancing: Slot sharing helps balance load. Because one slot can hold one subtask of each operator in the job, resource-intensive and lightweight subtasks are mixed within slots instead of the heavy subtasks being concentrated in a few of them, which improves resource utilization.
  3. Fault tolerance: The number of task slots is also related to fault tolerance. If a task or operator running in a task slot fails, the system can reschedule it onto another available task slot, provided that enough slots are available, so the job can recover and keep running.

To sum up, task slots provide the execution resources for parallel subtasks, while parallelism determines how many subtasks of each operator run simultaneously. The number of task slots limits the parallelism a job can use and affects load balancing and fault recovery.

Origin blog.csdn.net/a772304419/article/details/132626521