[Repost] Linux CPU queries

1. Under Linux, how to see the usage of each CPU:

#top -d 1

(The display refreshes every 1 second; you can see the total CPU load, as well as the id and name of the processes using the most CPU.)

Press the number 1 to switch between one summary line for all CPUs and one line per CPU (this works any time after top has started):

Cpu0: 1.0%us, 3.0%sy, 0.0%ni, 96.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

Cpu1: 0.0%us, 0.0%sy, 0.0%ni, 100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

 Here are the descriptions for us, sy, ni, id, wa, hi, si, and st:

The us column shows the percentage of CPU time spent in user mode. A high us value means user processes are consuming a lot of CPU time; if it stays above 50% for a long time, consider optimizing the user programs.

The sy column shows the percentage of CPU time spent in kernel mode. A common reference value for us + sy is 80%; if us + sy stays above 80% for a long time, the CPUs may be insufficient.

The ni column shows the percentage of CPU time used by user processes whose priority (nice value) has been changed.

The id column shows the percentage of time the CPU is idle.

The wa column shows the percentage of CPU time spent waiting for I/O. A common reference value for wa is 30%; if wa exceeds 30%, I/O wait is serious. This may be caused by a large number of random disk accesses, or by a bandwidth bottleneck in the disk or disk controller (mainly for block operations).

The hi column shows the percentage of CPU time spent servicing hardware interrupts.

The si column shows the percentage of CPU time spent servicing software interrupts (softirqs).

The st column shows steal time: the percentage of time the virtual CPU had to wait while the hypervisor serviced another virtual processor.
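The per-CPU figures that top displays come from /proc/stat, which can also be read directly. A minimal sketch, computing each CPU's idle percentage from two snapshots one second apart (field order per proc(5): user, nice, system, idle, iowait, irq, softirq, steal):

```shell
#!/bin/sh
# Print the idle percentage of each CPU over a 1-second interval,
# computed from two snapshots of /proc/stat (the same source top uses).
snapshot() { grep '^cpu[0-9]' /proc/stat; }

s1=$(snapshot); sleep 1; s2=$(snapshot)

printf '%s\n%s\n' "$s1" "$s2" | awk '
{
    total = 0
    for (i = 2; i <= NF; i++) total += $i
    if ($1 in tot) {        # second snapshot: report the delta
        dt = total - tot[$1]
        di = $5 - idle[$1]  # field 5 is the idle counter
        if (dt > 0) printf "%s: %.1f%% idle\n", $1, 100 * di / dt
    }
    tot[$1] = total
    idle[$1] = $5
}'
```

Subtracting two snapshots is what top does internally; a single read of /proc/stat only gives cumulative counters since boot.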

2. Under Linux, how to confirm multi-core or multi-CPU:

#cat /proc/cpuinfo

The machine is multi-core or multi-CPU if there are multiple entries like:

processor       : 0

......

processor       : 1
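Rather than counting the entries by eye, the count can be scripted; nproc (coreutils) and lscpu (util-linux) also report the CPU topology directly:

```shell
# Count logical processors in /proc/cpuinfo non-interactively:
grep -c '^processor' /proc/cpuinfo

# nproc prints the number of processing units available to this process:
nproc

# lscpu summarises sockets, cores per socket, and threads per core,
# which distinguishes multi-core from multi-CPU (multi-socket):
lscpu
```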

3. How to see which CPU a process is running on:

#top -d 1

Then press f to enter top's Current Fields setting page and check:

j: P = Last used cpu (SMP)

SMP stands for Symmetric Multi-Processing.

One more column, P, then appears, showing which CPU each process last ran on.

Sam found through experiments that the same process runs on different CPU cores at different times; this is handled by the Linux kernel's SMP scheduling.
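The same information is available without interactive top through the PSR column of ps, which reports the processor a process last ran on:

```shell
# List every process with the CPU it last executed on (PSR column):
ps -eo pid,psr,comm

# The same for a single process, here the current shell ($$):
ps -o pid,psr,comm -p $$
```

Running the second command repeatedly shows the migrations Sam observed: the PSR value can change between invocations.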

4. Configure the Linux Kernel to support multiple Cores:

The CONFIG_SMP option must be enabled during kernel configuration for the kernel to be SMP aware.

Processor type and features  ---> Symmetric multi-processing support

To check whether the current Linux kernel supports (or uses) SMP:

#uname -a
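Besides uname -a, the CONFIG_SMP option can be checked in the running kernel's build configuration; the file's location varies by distribution, so both common places are tried below:

```shell
# Check CONFIG_SMP in the kernel build configuration. Distributions
# usually ship it in /boot; some kernels expose it at /proc/config.gz.
grep CONFIG_SMP "/boot/config-$(uname -r)" 2>/dev/null
zgrep CONFIG_SMP /proc/config.gz 2>/dev/null

# On many builds the version string printed by uname -a contains
# the word "SMP" as well:
uname -a
```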

5. SMP load balancing for Kernel 2.6:

When tasks are created in an SMP system, these tasks are placed on a given CPU run queue. Generally speaking, we have no way of knowing when a task is short-lived and when it needs to be long-running. Therefore, the initial assignment of tasks to CPUs may not be ideal.

To keep the task load balanced among CPUs, tasks can be redistributed: moved from heavily loaded CPUs to lightly loaded ones. The Linux 2.6 scheduler provides this functionality through load balancing: every 200 ms it checks whether the CPU loads are unbalanced and, if so, migrates tasks between CPUs to rebalance them.

A side effect of this process is that the new CPU's cache is cold for the migrated tasks (requires reading data into the cache).

Remember that the CPU cache is local (on-chip) memory that provides much faster access than system memory. When a task runs on a CPU, its data is loaded into that CPU's local cache, and the cache is said to be hot for that task. If a CPU's local cache holds no data for a task, the cache is cold.

Unfortunately, keeping every CPU busy in this way means the cache is cold for migrated tasks.
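When cache warmth matters more than balance, a process can be pinned to one CPU with taskset (from util-linux), so the scheduler never migrates it. A quick sketch (the sleep command stands in for a real workload):

```shell
# Start a command pinned to CPU 0 so the scheduler never migrates it
# and its cache stays hot on that CPU:
taskset -c 0 sleep 30 &
pid=$!

# Show the affinity mask of the running process:
taskset -p "$pid"

# Re-pin an already running process to CPU 0:
taskset -pc 0 "$pid"

kill "$pid"
```

Pinning trades the scheduler's load balancing for cache locality, so it is worthwhile mainly for cache-sensitive, long-running tasks.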

6. How the application utilizes multiple Cores:

Developers can write parallelizable code into threads, which are scheduled to run concurrently by the SMP operating system.

Sam also envisages that even code which must execute sequentially can be divided into multiple nodes, each node a thread, with channels placed between the nodes. The nodes then behave like a pipeline, which also greatly improves CPU utilization.

For example, a game loop can be divided into 3 nodes:

1. Accept external input and gather data (1 ms)

2. Run the physics computation on the data (3 ms)

3. Display the result of the physics computation (2 ms)

Executed serially, one iteration takes 6 ms. But if each node is a thread and the threads run concurrently as a pipeline, the pipeline produces one result every 3 ms: the time of the slowest stage.
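The node-and-channel design maps directly onto a Unix pipeline: each stage is a separate process, the pipes are the channels, and the kernel schedules the stages concurrently, possibly on different CPU cores. A toy sketch (the stage names and data are made up for illustration):

```shell
#!/bin/sh
# Three pipeline stages, each its own process; the kernel runs them
# concurrently, so throughput is limited by the slowest stage.
acquire() {           # stage 1: accept external input (~1 ms/item)
    for i in 1 2 3; do echo "input-$i"; done
}
compute() {           # stage 2: physics computation (~3 ms/item)
    while read -r item; do echo "computed($item)"; done
}
display() {           # stage 3: render the result (~2 ms/item)
    while read -r item; do echo "frame: $item"; done
}

acquire | compute | display
```

In a real game the stages would be threads sharing memory through bounded queues rather than processes joined by pipes, but the concurrency structure is the same.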
