Paper translation - STUN: Reinforcement-Learning-Based Optimization of Kernel Scheduler Parameters 3 (3)

Continued from the previous article: Translation and reading of the paper - STUN: Reinforcement-Learning-Based Optimization of Kernel Scheduler Parameters 3 (2)


3. Background (knowledge)

3.3 Variable optimization

This section presents the scheduler policy and parameters, the variables that STUN optimizes through reinforcement learning.

3.3.1 Scheduler policy

The Linux kernel currently defines five scheduler policies: NORMAL (CFS), FIFO, RR, BATCH, and IDLE. STUN can use the schedtool utility available on Linux to change these policies without restarting the server.

  • SCHED_NORMAL (CFS)

This is the default scheduler policy of the Linux kernel. CFS aims to maximize overall CPU utilization and to give every task a fair share of CPU resources. CFS keeps a per-CPU run queue in which tasks are ordered by virtual runtime in a red-black tree and run in that order.

  • SCHED_FIFO

This is a fixed-priority scheduling policy. Each task runs with a priority value from 1 to 99, and the task with the highest priority preempts the CPU and runs first.

  • SCHED_RR

This policy operates basically the same as SCHED_FIFO, except that each task has a time quantum, the maximum time it may run at a stretch. When the time quantum expires, the CPU switches to the next task in round-robin fashion.

  • SCHED_BATCH

This policy is suitable for batch jobs. Because tasks are preempted by other tasks less often, they can run longer than under other policies and make better use of hardware caches; however, this policy does not work well for interactive tasks.

  • SCHED_IDLE

This policy operates non-interactively, similar to SCHED_BATCH. However, unlike SCHED_BATCH, SCHED_IDLE tasks are executed when other processes are idle.
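The paper uses schedtool to switch policies. As a rough illustration of the same idea, the sketch below uses the scheduling calls in Python's standard os module on Linux. The helper names current_policy and switch_policy are made up for this example and are not part of STUN. Note that user space calls SCHED_NORMAL by the name SCHED_OTHER.

```python
import os

# Policy constants exposed by Python's os module on Linux.
# hasattr() guards against platforms where a constant is missing.
POLICY_NAMES = {
    getattr(os, name): name
    for name in ("SCHED_OTHER", "SCHED_FIFO", "SCHED_RR", "SCHED_BATCH", "SCHED_IDLE")
    if hasattr(os, name)
}


def current_policy(pid: int = 0) -> str:
    """Return the scheduling policy name of a process (pid 0 = the caller)."""
    policy = os.sched_getscheduler(pid)
    return POLICY_NAMES.get(policy, f"UNKNOWN({policy})")


def switch_policy(pid: int, policy: int, priority: int = 0) -> bool:
    """Try to change the policy of a process; return False if the kernel refuses.

    SCHED_FIFO and SCHED_RR take a priority from 1 to 99 and usually need
    root privileges; the other policies take priority 0.
    """
    try:
        os.sched_setscheduler(pid, policy, os.sched_param(priority))
        return True
    except OSError:  # includes PermissionError
        return False


if __name__ == "__main__":
    print("current policy:", current_policy())
```

For example, switch_policy(0, os.SCHED_BATCH) would ask the kernel to move the calling process to SCHED_BATCH, taking effect immediately and without a restart.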

3.3.2 Linux kernel scheduler parameters

The Linux kernel provides 14 scheduler parameters for optimization. In this study, we optimized 9 parameters and excluded 5 parameters that did not affect performance. Table 1 shows the range of parameter values that can be changed and the default values for the Linux kernel.

These parameters can be changed without rebooting the machine by using the sysctl command available on Linux. The meaning of each parameter is as follows:

  • sched_latency_ns

The target preemption latency for CPU-bound tasks. Increasing this parameter increases the time slice for CPU-bound tasks.

  • sched_migration_cost_ns

This is the amount of time after its last execution during which a task is considered "cache hot" for migration decisions. Hot tasks are less likely to be migrated to another CPU, so increasing this value reduces task migration. If CPU idle time is higher than expected while there are runnable processes, lowering this value is recommended; if tasks bounce between CPUs or nodes too often, it is better to increase it.

  • sched_min_granularity_ns

The minimum preemption granularity for CPU-bound tasks. This parameter is closely related to sched_latency_ns.

  • sched_nr_migrate

This parameter controls the number of tasks that can be migrated across processors for load balancing purposes. Because load balancing iterates over the run queue with interrupts disabled (softirq), it can cause IRQ latency penalties for real-time tasks. Therefore, increasing this value may improve the performance of large SCHED_OTHER threads at the cost of higher IRQ latency for real-time tasks.

  • sched_rr_timeslice_ms

This parameter adjusts the time quantum (time slice) used by the SCHED_RR policy.

  • sched_rt_runtime_us

This is the amount of CPU time that real-time tasks may use within each sched_rt_period_us. Setting the value to −1 disables RT bandwidth enforcement. By default, real-time tasks may consume 95% of each one-second period, leaving only 5% (0.05 seconds) for SCHED_OTHER tasks.

  • sched_rt_period_us

This parameter is the period over which real-time task bandwidth is measured and enforced.

  • sched_cfs_bandwidth_slice_us

When using CFS bandwidth control, this parameter controls the amount of runtime (bandwidth) transferred from the task's control group bandwidth pool to the run queue. Smaller values allow global bandwidth to be shared between tasks in a fine-grained manner, while larger values reduce transfer overhead.

  • sched_wakeup_granularity_ns

This is the wake-up preemption granularity. Increasing this value reduces wake-up preemption and thus reduces interference with compute-bound tasks; lowering it can improve wake-up latency and throughput for latency-critical tasks, especially when short-duty-cycle load components must compete with CPU-bound components.
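The relationship between sched_rt_runtime_us and sched_rt_period_us is easy to check numerically. The following is a minimal sketch, not code from the paper: read_param and rt_share are hypothetical helper names, and it assumes the parameters appear as files under /proc/sys/kernel. Some kernels do not expose every parameter there, so the reader returns None for a missing file.

```python
from pathlib import Path
from typing import Optional

# Assumed location of the sysctl files; pass another directory to override.
PROC_DIR = Path("/proc/sys/kernel")


def read_param(name: str, base: Path = PROC_DIR) -> Optional[int]:
    """Read an integer scheduler parameter, or None if it is not exposed."""
    try:
        return int((base / name).read_text().strip())
    except (OSError, ValueError):
        return None


def rt_share(runtime_us: int, period_us: int) -> float:
    """Fraction of each period that real-time tasks may consume.

    A runtime of -1 disables RT bandwidth enforcement, i.e. a share of 1.0.
    """
    if runtime_us < 0:
        return 1.0
    return runtime_us / period_us


if __name__ == "__main__":
    runtime = read_param("sched_rt_runtime_us")
    period = read_param("sched_rt_period_us")
    if runtime is not None and period:
        print(f"RT tasks may use {rt_share(runtime, period):.0%} of each period")
```

With a runtime of 950000 µs and a period of 1000000 µs, rt_share gives 0.95, which matches the 95% figure quoted above.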

Knowledge supplement

The paper's explanations of the scheduling parameters are hard to follow, so here is a plainer explanation.

Directory: the parameters are located under /proc/sys/kernel.

  • sched_latency_ns

Explanation 1: sysctl_sched_latency is the period within which every process in a run queue runs once.

Explanation 2: A scheduling period (sched_latency_ns) is set, with the goal that each process gets a chance to run at least once within it. In other words, no process waits for the CPU longer than this scheduling period. The CPU time in the period is divided among all processes; because processes have different priorities (nice values), the division is weighted.

  • sched_migration_cost

sysctl_sched_migration_cost: this variable is used to judge whether a process is still hot. If the running time of the process (now - p->se.exec_start) is less than this value, the kernel assumes its code is still in the cache, so the process is still hot and is not considered for migration.

  • sched_min_granularity_ns

Explanation 1: sysctl_sched_min_granularity is the minimum running time of a process, which prevents overly frequent switching. For interactive systems (such as desktops), this value can be set smaller to ensure faster response to interactions.

Explanation 2: CFS sets a minimum amount of CPU time for a process. A process that has run on the CPU for less than this time cannot be moved off the CPU. If there are too many processes, each CPU time slice would become too small. If the slice would be less than sched_min_granularity_ns, then sched_min_granularity_ns is used as the time slice instead, and the scheduling period no longer follows sched_latency_ns but becomes sched_min_granularity_ns multiplied by the number of processes.

  • sched_nr_migrate

sysctl_sched_nr_migrate: when load balancing across multiple CPUs, this is how many processes can be moved to another CPU at a time.

  • sched_rt_period_us / sched_rt_runtime_us

sysctl_sched_rt_period / sysctl_sched_rt_runtime: together, these two parameters limit real-time processes. Within each period of sysctl_sched_rt_period, the total time that real-time processes run cannot exceed sysctl_sched_rt_runtime.
 

  • sched_wakeup_granularity_ns

sysctl_sched_wakeup_granularity: this variable is the base value for how long a process should at least run after being woken up. It is used only to decide whether a process should preempt the current process; it does not represent the minimum time a process can execute (that is sysctl_sched_min_granularity). The smaller the value, the higher the probability of preemption.
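The interplay between sched_latency_ns and sched_min_granularity_ns described in the explanations above can be written as a small calculation. This is a simplified sketch of the rule, not the kernel's actual code. The function names are made up for illustration, the 6 ms and 0.75 ms values used in the usage lines are example inputs rather than values from Table 1, and the nice-to-weight formula is an approximation (roughly a factor of 1.25 per nice level, with nice 0 mapped to 1024), not the kernel's lookup table.

```python
from typing import List


def cfs_period_ns(nr_running: int, latency_ns: int, min_granularity_ns: int) -> int:
    """Scheduling period: sched_latency_ns, stretched when too many tasks run."""
    # Number of tasks that fit in one latency period at the minimum slice.
    nr_latency = max(1, latency_ns // min_granularity_ns)
    if nr_running > nr_latency:
        return nr_running * min_granularity_ns
    return latency_ns


def nice_to_weight(nice: int) -> float:
    """Approximate load weight for a nice value (assumption: 1.25x per level)."""
    return 1024 / (1.25 ** nice)


def time_slices_ns(nices: List[int], latency_ns: int, min_granularity_ns: int) -> List[float]:
    """Split the scheduling period among tasks in proportion to their weights."""
    period = cfs_period_ns(len(nices), latency_ns, min_granularity_ns)
    weights = [nice_to_weight(n) for n in nices]
    total = sum(weights)
    return [period * w / total for w in weights]


if __name__ == "__main__":
    LATENCY_NS = 6_000_000        # example value only
    MIN_GRANULARITY_NS = 750_000  # example value only
    print(cfs_period_ns(4, LATENCY_NS, MIN_GRANULARITY_NS))
    print(cfs_period_ns(16, LATENCY_NS, MIN_GRANULARITY_NS))
    print(time_slices_ns([0, 5], LATENCY_NS, MIN_GRANULARITY_NS))
```

With these example inputs, up to 8 runnable tasks share the 6 ms period; beyond that, the period grows to the number of tasks times the minimum granularity, so no task's slice drops below sched_min_granularity_ns when the weights are equal.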


 


Origin blog.csdn.net/phmatthaus/article/details/131430061