An article introducing Linux EAS

Energy Aware Scheduling (EAS) is a core feature of the Linux task scheduler on Android phones. It lets the scheduler predict the impact of its decisions on CPU energy consumption. Relying on the CPU's Energy Model (EM), EAS selects the most energy-efficient CPU for each thread while keeping the impact on system performance to a minimum.

EAS only runs on heterogeneous CPU topologies (such as Arm big.LITTLE), because this is where scheduling decisions have the greatest potential to save energy.

Note: The analysis in this article is based on the open-source kernel code of the OPPO Reno9 Pro+: https://github.com/oppo-source/android_kernel_oppo_sm8475

1. Key concepts

1.1 capacity

Capacity (computing power) is a basic concept in CPU scheduling. It reflects the computing power of a CPU and is a normalized value. On an Android phone, the maximum capacity of each CPU can be read from the file node /sys/devices/system/cpu/cpu*/cpu_capacity.

The maximum computing power of the CPU = capacity-dmips-mhz * cpuinfo_max_freq / 1000.

Here, "capacity-dmips-mhz" indicates how many DMIPS the CPU delivers when running at a frequency of 1 MHz. It can be obtained from the processor's device tree file.

"cpuinfo_max_freq" is the maximum frequency supported by the CPU, in kHz. The formula divides by 1000 to convert the unit from kHz to MHz.

To make capacities easy to compare and compute with, the maximum capacity of the strongest CPU in the processor is normalized to 1024.

In a processor where CPU capacity scales linearly with frequency:

CPU capacity at a given frequency point = (frequency of that point / maximum frequency of the CPU) * maximum capacity of the CPU.
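The two formulas above can be sketched in a few lines of Python. This is only an illustration: the cluster names and all device tree and frequency values below are made up, and the helper names are hypothetical, not kernel functions.

```python
# Hypothetical values for a three-cluster SoC; none of them come from a real device.
CLUSTERS = {
    "little": {"capacity_dmips_mhz": 300, "cpuinfo_max_freq_khz": 2_000_000},
    "big": {"capacity_dmips_mhz": 800, "cpuinfo_max_freq_khz": 2_500_000},
    "prime": {"capacity_dmips_mhz": 1024, "cpuinfo_max_freq_khz": 3_000_000},
}

SCHED_CAPACITY_SCALE = 1024  # capacity assigned to the strongest CPU


def raw_capacity(cluster):
    """capacity-dmips-mhz * cpuinfo_max_freq / 1000 (kHz converted to MHz)."""
    return cluster["capacity_dmips_mhz"] * cluster["cpuinfo_max_freq_khz"] // 1000


def normalized_capacities(clusters):
    """Scale every raw capacity so that the strongest CPU gets 1024."""
    raw = {name: raw_capacity(c) for name, c in clusters.items()}
    strongest = max(raw.values())
    return {name: value * SCHED_CAPACITY_SCALE // strongest
            for name, value in raw.items()}


def capacity_at_freq(max_capacity, freq_khz, max_freq_khz):
    """Capacity at one frequency point, assuming capacity is linear in frequency."""
    return max_capacity * freq_khz // max_freq_khz


caps = normalized_capacities(CLUSTERS)
print(caps)
print(capacity_at_freq(caps["little"], 1_000_000, 2_000_000))
```

With these made-up inputs the strongest cluster ends up at 1024 and the others are scaled relative to it, which is the same shape of result that cpu_capacity exposes.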

1.2 opp

An Operating Performance Point (OPP) is a voltage/frequency pair (tuple) supported by a CPU. Each operating frequency point of the CPU has a corresponding voltage. Frequency and voltage are positively correlated: the higher the frequency, the higher the required voltage.

1.3 power

With the capacity of each frequency point established, let's look at the power of each frequency point. The CPU's Energy Model module exposes file nodes from which this power can be read.

Reading the file nodes /sys/kernel/debug/energy_model/pd0/*/power gives the power (mW) of each frequency point of the small-core cluster CPUs.

The Energy Model code uses the following formula to calculate the power of each frequency point of the CPU:

P = C * V^2 * f, where C is the capacitance coefficient of the CPU (read from "dynamic-power-coefficient" in the processor's device tree file), and V and f are the voltage and frequency of an OPP.
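As a worked illustration of P = C * V^2 * f, the sketch below assumes the coefficient is expressed in uW/MHz/V^2, the voltage in mV and the frequency in kHz, and scales the result to mW. The function name and the sample numbers are hypothetical; the exact unit handling in a given kernel version may differ.

```python
def opp_power_mw(coeff_uw_per_mhz_v2, freq_khz, voltage_mv):
    """Estimate the power of one OPP in mW from P = C * V^2 * f.

    Assumed units: C in uW/MHz/V^2, V in mV, f in kHz.
    C * mV^2 * MHz is in uW * 1e6, so dividing by 1e9 yields mW.
    """
    freq_mhz = freq_khz // 1000
    return coeff_uw_per_mhz_v2 * voltage_mv * voltage_mv * freq_mhz // 1_000_000_000


# Hypothetical OPP table: (frequency in kHz, voltage in mV).
for freq_khz, voltage_mv in [(1_000_000, 800), (2_000_000, 1000)]:
    print(freq_khz, opp_power_mw(100, freq_khz, voltage_mv))
```

Because voltage rises together with frequency, doubling the frequency in this example roughly triples the power, which is why high frequency points are disproportionately expensive.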

1.4 Energy Efficiency Ratio

The lower the power/capacity value of a CPU frequency point, the better its energy efficiency ratio. For the same CPU, low frequencies are more energy efficient than high frequencies. Overall, small-core cluster CPUs are more energy efficient than large-core cluster CPUs, and large-core cluster CPUs are more energy efficient than super-large-core cluster CPUs. However, the high-frequency band of the small-core cluster is less efficient than the low-frequency band of the large-core cluster, and the high-frequency band of the large-core cluster is less efficient than the low-frequency band of the super-large-core cluster.

From the energy efficiency ratio curve in the above figure, we can clearly see the following characteristics:

  1. At the same utilization of 200, a small-core cluster CPU consumes more power than a large-core cluster CPU. Therefore, when the system load is not heavy, threads can run in the low-frequency band of the large-core CPUs instead of pushing the small-core cluster into its high-frequency band, which saves power without affecting system performance.
  2. The energy efficiency ratio of super-large-core cluster CPUs is much worse than that of large-core cluster CPUs. Unless a super-large core is really needed, avoid using it.
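The comparison in the list above can be illustrated with a small sketch. The OPP tables here are invented so that they mimic the described curve shape (a small core at high frequency costing more than a large core at low frequency); they are not measurements, and the helper names are hypothetical.

```python
# Hypothetical (capacity, power in mW) pairs per cluster, sorted by capacity.
OPPS = {
    "little": [(100, 20), (200, 80), (300, 180)],
    "big": [(200, 60), (500, 300), (700, 630)],
}


def efficiency_ratio(capacity, power_mw):
    """power/capacity: the lower the value, the better the energy efficiency."""
    return power_mw / capacity


def cheapest_cluster(util, opps):
    """Return (cluster, power) of the lowest-power OPP that can serve `util`."""
    best = None
    for name, table in opps.items():
        for capacity, power_mw in table:
            if capacity >= util:
                # First OPP that fits is the lowest one for this cluster.
                if best is None or power_mw < best[1]:
                    best = (name, power_mw)
                break
    return best


print(cheapest_cluster(100, OPPS))
print(cheapest_cluster(200, OPPS))
```

With these numbers, a utilization of 100 is cheapest on the small core, while a utilization of 200 is cheaper on the large core running at its lowest frequency point.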

2. Energy-aware thread core selection

EAS replaces the task wake-up balancing code of CFS. It uses the CPU Energy Model together with the CPU and thread load information collected by PELT/WALT to select the most power-saving CPU for the waking thread to run on.

The code flow for EAS to select the running CPU for a thread is as follows:

2.1 find_energy_efficient_cpu

find_energy_efficient_cpu() finds the most energy-efficient target CPU for the waking task. It first finds the CPU with the most spare capacity in each performance domain and treats it as a candidate CPU for the thread. It then uses the Energy Model to determine which candidate is the most energy efficient.

A performance domain generally corresponds to a CPU cluster. Placing a thread on the CPU with the most spare capacity in a performance domain helps the CPUs of that cluster keep running at the lowest frequency they need.

Because migrating a thread has a relatively high performance cost (for example, cache misses), the thread only migrates when the selected most energy-efficient CPU saves more than 6% energy compared with the CPU the thread previously ran on.

The figure below lists the core code in find_energy_efficient_cpu(), and provides detailed comments on the code.
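The selection logic described in this section can be sketched as follows. This is a simplified illustration, not the kernel code: the function signature, the `energy_fn` callback (which stands in for the energy estimate of the whole system with the task placed on a given CPU) and all numbers are hypothetical. The 6% threshold is approximated here as 1/16 of the previous energy via a right shift by 4, which is an assumption about how the comparison is implemented.

```python
def find_energy_efficient_cpu(task_util, prev_cpu, perf_domains,
                              cpu_util, cpu_cap, energy_fn):
    """Pick a CPU for a waking task (simplified sketch).

    perf_domains: list of lists of CPU ids, one list per performance domain.
    cpu_util / cpu_cap: dicts mapping CPU id to utilization / capacity.
    energy_fn(cpu): hypothetical estimate of system energy with the task on `cpu`.
    """
    prev_energy = energy_fn(prev_cpu)
    best_cpu, best_energy = prev_cpu, prev_energy

    for pd in perf_domains:
        # Candidate: the CPU with the most spare capacity that still fits the task.
        candidate, max_spare = None, -1
        for cpu in pd:
            if cpu == prev_cpu:
                continue  # prev_cpu is already the baseline
            spare = cpu_cap[cpu] - cpu_util[cpu]
            if spare >= task_util and spare > max_spare:
                candidate, max_spare = cpu, spare
        if candidate is None:
            continue
        energy = energy_fn(candidate)
        if energy < best_energy:
            best_cpu, best_energy = candidate, energy

    # Migrate only if the saving exceeds roughly 6% (1/16) of the previous energy.
    if best_cpu != prev_cpu and (prev_energy - best_energy) > (prev_energy >> 4):
        return best_cpu
    return prev_cpu


# Toy usage: two performance domains and made-up energy estimates.
domains = [[0, 1], [2, 3]]
cap = {0: 200, 1: 200, 2: 700, 3: 700}
util = {0: 150, 1: 50, 2: 100, 3: 300}
toy_energy = {1: 900, 2: 980, 3: 1000}
print(find_energy_efficient_cpu(100, 3, domains, util, cap, toy_energy.get))
```

Passing the energy estimate as a callback keeps the candidate selection separate from the Energy Model arithmetic, which mirrors the split between find_energy_efficient_cpu() and compute_energy() described in this article.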

2.2 compute_energy

compute_energy() estimates the energy consumption of performance domain pd if thread p migrates to dst_cpu. It estimates two values for the state after the migration: max_util, the utilization of the busiest CPU in pd, and sum_util, the sum of the utilization of all CPUs in pd. It then calls em_cpu_energy(), an API provided by the Energy Model, to calculate the energy consumption of pd with the thread placed there.

The figure below lists the code of compute_energy(), and provides detailed comments on the code.
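A minimal sketch of the max_util / sum_util estimation might look like this. It assumes that the per-CPU utilization values already exclude the contribution of the migrating thread and that utilization is clamped to CPU capacity; the function name and the numbers are hypothetical.

```python
def pd_util_after_migration(pd_cpus, cpu_util_without_task, cpu_cap,
                            task_util, dst_cpu):
    """Return (max_util, sum_util) of a performance domain after migration.

    cpu_util_without_task: dict of CPU id to utilization, assumed to
    exclude the migrating task. dst_cpu may lie outside this domain,
    in which case the task adds nothing here.
    """
    max_util = 0
    sum_util = 0
    for cpu in pd_cpus:
        util = cpu_util_without_task[cpu]
        if cpu == dst_cpu:
            util += task_util  # the task lands on this CPU
        util = min(util, cpu_cap[cpu])  # utilization cannot exceed capacity
        sum_util += util
        max_util = max(max_util, util)
    return max_util, sum_util


# Toy usage: a two-CPU domain, task of utilization 100 moving to CPU 1.
print(pd_util_after_migration([0, 1], {0: 150, 1: 50}, {0: 200, 1: 200}, 100, 1))
```

The pair returned here is what an energy estimate such as em_cpu_energy() takes as input: max_util drives the frequency of the whole domain, while sum_util drives the total energy.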

2.3 em_cpu_energy

em_cpu_energy() is an API provided by the Energy Model to estimate the total energy consumption of all CPUs in a performance domain. It has 4 parameters:

  1. @pd: the performance domain whose energy consumption is to be estimated.
  2. @max_util: the utilization of the busiest CPU in the performance domain, which determines the operating frequency of all CPUs in the domain.
  3. @sum_util: the sum of the utilization of all CPUs in the performance domain, used to estimate the energy consumption of the whole domain.
  4. @allowed_cpu_cap: the maximum CPU capacity currently allowed in the performance domain (it may be lower than the original value because of thermal limits).

The em_cpu_energy() operation process is as follows:

  1. Estimate the minimum operating frequency required by the CPUs in the performance domain from max_util, the utilization of the busiest CPU in the domain. Two points are worth noting: the utilization used to estimate the frequency is 1.25 times max_util, and the CPU frequency governor is expected to be schedutil or a similar governor whose frequency follows utilization.
  2. Find the lowest performance state ps that meets the frequency requirement in the CPU energy model.
  3. Estimate the energy consumption of the entire performance domain from sum_util (the sum of the utilization of all CPUs in the domain), the CPU capacity, and the cost field of the performance state ps. The formula is:

energy = ps->cost * sum_util / CPU capacity, where ps->cost = ps->power * CPU maximum frequency / ps->frequency. The cost value is precomputed when the Energy Model initializes each performance state of the CPU.

The figure below lists the code of em_cpu_energy(), and provides detailed comments on the code.
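The three steps above can be sketched as follows. This is a simplified model rather than the kernel function: the performance-state table is invented, the @allowed_cpu_cap clamp is left out, and the 1.25x headroom is written as util + util/4 using integer arithmetic, which is an assumption about the rounding.

```python
def build_perf_states(opps):
    """opps: list of (frequency in kHz, power in mW), ascending by frequency.

    Precompute cost = power * max_freq / frequency for each state.
    """
    max_freq = opps[-1][0]
    return [{"frequency": f, "power": p, "cost": p * max_freq // f}
            for f, p in opps]


def em_cpu_energy(perf_states, max_util, sum_util, cpu_capacity):
    """Estimate the energy of one performance domain (simplified sketch)."""
    max_freq = perf_states[-1]["frequency"]

    # Step 1: frequency needed for 1.25 * max_util.
    util_with_headroom = max_util + (max_util >> 2)
    required_freq = util_with_headroom * max_freq // cpu_capacity

    # Step 2: lowest performance state that satisfies the required frequency
    # (fall back to the highest state if none does).
    ps = perf_states[-1]
    for state in perf_states:
        if state["frequency"] >= required_freq:
            ps = state
            break

    # Step 3: energy = ps->cost * sum_util / CPU capacity.
    return ps["cost"] * sum_util // cpu_capacity


# Toy usage with a hypothetical three-state table and CPU capacity 300.
states = build_perf_states([(500_000, 20), (1_000_000, 80), (1_500_000, 180)])
print(em_cpu_energy(states, max_util=150, sum_util=300, cpu_capacity=300))
```

Precomputing cost once per performance state means the per-wake-up estimate reduces to a table lookup plus one multiplication and one division.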

3. EAS and load balancing

Generally speaking, EAS helps most in scenarios with light to moderate CPU utilization. Heavy CPU-bound tasks need as much CPU capacity as possible, and EAS can hardly save energy there without seriously hurting performance. To keep EAS from affecting performance, once the utilization of any CPU exceeds 80% of its capacity, the entire root domain is marked as 'overutilized' and EAS is disabled.

When the utilization of every CPU in the root domain is below 80% of its capacity, load balancing is disabled and EAS overrides the wake-up load balancing code. In this state EAS chooses the most power-saving CPU for each waking task without affecting system performance, and load balancing is disabled so that it cannot undo the placements EAS has made. This is safe as long as the system is not overutilized, because being below the 80% tipping point implies that:

  1. All CPUs have idle time, so the utilization signal used by EAS can accurately represent the "size" of various tasks in the system;
  2. All tasks are given enough CPU power, regardless of their nice value;
  3. Because there is spare CPU capacity, all tasks block/sleep regularly, so balancing them at wake-up is sufficient.

Once the utilization of any CPU crosses the 80% tipping point, at least one of the three assumptions above no longer holds. In that case, the overutilized flag of the entire root domain is set to true, EAS is disabled, and load balancing is re-enabled.

Since the notion of overutilization relies heavily on detecting idle time in the system, the CPU capacity "stolen" by scheduling classes with higher priority than CFS, as well as by IRQs, must be taken into account. Overutilization detection therefore covers not only the CPU capacity used by CFS tasks, but also the capacity used by other scheduling classes and IRQs.
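The 80% rule can be sketched as a simple integer comparison. The 1280/1024 ratio below is one way to express "utilization must stay under 80% of capacity" without floating point; the function names and sample values are hypothetical, and the utilization passed in is assumed to already include CFS, other scheduling classes and IRQ time.

```python
def fits_capacity(util, capacity):
    """True if util is below 80% of capacity (util * 1.25 < capacity)."""
    return util * 1280 < capacity * 1024


def root_domain_overutilized(cpu_util, cpu_cap):
    """The root domain is overutilized as soon as one CPU crosses the threshold.

    cpu_util: dict of CPU id to total utilization (assumed to include CFS,
    other scheduling classes and IRQs). cpu_cap: dict of CPU id to capacity.
    """
    return any(not fits_capacity(cpu_util[cpu], cpu_cap[cpu]) for cpu in cpu_util)


# Toy usage: CPU 1 sits at exactly 80% of its capacity, so EAS would be disabled.
cap = {0: 200, 1: 1000}
print(root_domain_overutilized({0: 100, 1: 799}, cap))
print(root_domain_overutilized({0: 100, 1: 800}, cap))
```

A single busy CPU is enough to flip the flag for the whole root domain, which is why EAS is so easily switched off under heavy load.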

4. Summary

EAS is only enabled when the system load is not heavy, that is, when the utilization of every CPU in the system is below 80% of its capacity. Even then, a thread only migrates when the selected most energy-efficient CPU saves more than 6% energy compared with the CPU the thread previously ran on. The conditions under which EAS actually picks the most energy-efficient CPU for a thread are therefore very strict. In heavy-load scenarios (such as games), EAS is rarely in effect, which may make this a point worth exploring when optimizing power consumption for such scenarios.

 

Origin blog.csdn.net/youzhangjing_/article/details/130489241