Introduction to common functions of Linux (2)


find_next_zero_bit

find_next_zero_bit is a function provided by the Linux kernel for finding the next bit that is 0 in a given bitmap.

The function prototype is as follows:

unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size, unsigned long offset);
  • addr is a pointer to an array of unsigned long values representing the bitmap to be searched.
  • size is the size of the bitmap, in bits.
  • offset is the starting position, i.e. the bit from which to start looking for the next 0 bit.

The function starts at position offset and searches the bitmap for the next bit that is 0. If a 0 bit is found, its index is returned; if no 0 bit is found, a value greater than or equal to size is returned. For example, searching the 8-bit bitmap 0b00000111 from offset 0 returns 3, the index of the first 0 bit.

find_next_zero_bit is widely used in bitmap operations throughout the kernel, for example in memory management and device drivers. It provides an efficient way to locate the next 0 bit in a bitmap so that the corresponding action can be performed.

In summary, find_next_zero_bit is a Linux kernel function for locating the next 0 bit in a bitmap, and it is commonly used in bitmap manipulation code.

In other words: it queries *addr, starting from bit offset, for the index of the first bit that is 0 (the lowest bit is bit 0).

Note: the minimum value of offset is 0, and the maximum useful value is size - 1 (for example, 0 ~ 255 for a 256-bit bitmap).
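
As a minimal sketch of typical usage (the bitmap name slot_map, its 256-bit size, and the slot-allocation scenario are illustrative assumptions, not part of the kernel API), the following kernel-style snippet uses find_next_zero_bit to find and claim a free slot:

#include <linux/bitmap.h>
#include <linux/bitops.h>
#include <linux/errno.h>

#define SLOT_COUNT 256                       /* hypothetical bitmap size */
static DECLARE_BITMAP(slot_map, SLOT_COUNT); /* hypothetical bitmap; all bits start at 0 */

static int alloc_slot(void)
{
    unsigned long slot;

    /* Find the first 0 (free) bit, searching from bit 0 */
    slot = find_next_zero_bit(slot_map, SLOT_COUNT, 0);
    if (slot >= SLOT_COUNT)
        return -ENOSPC;       /* no 0 bit found: return value is >= size */

    set_bit(slot, slot_map);  /* mark the slot as used */
    return (int)slot;
}

Note that find_next_zero_bit itself does not make the search-and-claim sequence atomic; real code would protect the find/set pair with a lock or use a dedicated allocator.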

virt_to_phys

virt_to_phys is a macro defined in the Linux kernel for converting a kernel virtual address into a physical address.

On 32-bit systems, virt_to_phys is essentially defined as follows:

#define virt_to_phys(vaddr)	__pa(vaddr)

On 64-bit x86 systems, virt_to_phys likewise reduces to __pa, which in turn is defined as:

#define __pa(x)	__phys_addr((unsigned long)(x))

The purpose of this macro is to convert the given virtual address vaddr into the corresponding physical address.

On 32-bit systems, the virt_to_phys macro uses __pa to perform the conversion: __pa subtracts the kernel's mapping offset (PAGE_OFFSET) from the virtual address to obtain the physical address. For example, with the typical 32-bit x86 value PAGE_OFFSET = 0xC0000000, the virtual address 0xC0100000 corresponds to the physical address 0x00100000.

On 64-bit x86 systems, __pa expands to __phys_addr(), which likewise derives the physical address by subtracting the offset of the mapping that the virtual address belongs to (the direct-mapping offset PAGE_OFFSET, or the kernel-text mapping base for addresses in that region).

These macros are defined in the kernel to handle translation between virtual and physical addresses. In scenarios that require direct manipulation of physical addresses, virt_to_phys can be used to convert a virtual address into a physical address for further processing.

In summary, virt_to_phys is a macro defined in the Linux kernel for converting a virtual address into the corresponding physical address, so that direct physical-address operations can be performed.
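
The following is a hedged sketch (the helper name virt_to_phys_demo and the log message are illustrative, not from the original) showing the typical pattern: converting the address of a kmalloc'd buffer, which lies in the kernel's direct linear mapping. virt_to_phys is only valid for such directly mapped addresses, not for vmalloc or user-space pointers.

#include <linux/slab.h>
#include <linux/printk.h>
#include <linux/errno.h>
#include <linux/io.h>

static int virt_to_phys_demo(void)   /* hypothetical helper */
{
    void *buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
    phys_addr_t pa;

    if (!buf)
        return -ENOMEM;

    /* Valid because kmalloc memory lives in the direct mapping */
    pa = virt_to_phys(buf);
    pr_info("virtual %p maps to physical %pa\n", buf, &pa);

    kfree(buf);
    return 0;
}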


NUMA

NUMA (Non-Uniform Memory Access) is a computer system architecture in which multiple processors (CPUs) and memory modules are connected by a high-speed interconnect. In a NUMA system, each processor has its own local memory (local node) and can also access the memory of other processors (remote nodes). Since accessing local memory is faster than accessing remote memory, the performance of a NUMA system depends on how memory is managed and allocated.

When making trade-offs and optimal settings for NUMA systems, here are some important considerations:

  1. Task distribution: Assigning tasks to the nearest local node minimizes memory access latency. Therefore, task scheduling and distribution strategies need to be considered so that tasks can be executed on local nodes as much as possible.

  2. Memory allocation strategy: In NUMA systems, memory allocation should consider allocating data to local nodes to reduce remote memory access. This can be achieved using the NUMA-aware memory allocation functions or libraries provided by the operating system.

  3. Data locality: Optimize algorithms and data structures to take advantage of data locality and reduce remote memory accesses. For example, organize data closely on the same node to reduce data transfer across nodes.

  4. Affinity settings: In some cases, you can ensure that a task executes on the local node by setting its affinity with the local node. This can be accomplished through tools or programming interfaces provided by the operating system.

  5. NUMA-aware scheduler: Some operating systems provide a NUMA-aware scheduler that can schedule tasks according to the characteristics of the NUMA system. Using these schedulers allows better utilization of local memory and reduces remote memory accesses.

  6. Cache coherence: Cache coherence is an important issue in multiprocessor systems. In NUMA systems, the design and configuration of cache coherence protocols need to be considered to ensure data consistency and performance.

In short, the trade-offs and optimization of NUMA systems involve task distribution, memory allocation strategy, data locality, affinity settings, scheduler selection, and cache coherence. With proper settings and optimization, the performance benefits of NUMA systems can be maximized and the impact of remote memory accesses reduced. Specific settings and optimization strategies vary with system architecture and application requirements, and need to be adjusted accordingly.

Sample code

The following is a simple C code example that demonstrates how to use NUMA-aware functions to allocate memory and set the affinity of tasks in a Linux environment:

#define _GNU_SOURCE   // needed for CPU_ZERO/CPU_SET and sched_setaffinity
#include <stdio.h>
#include <stdlib.h>
#include <numa.h>
#include <numaif.h>
#include <sched.h>

int main() {
    // Check that NUMA is available; this must be done before
    // calling any other libnuma function
    if (numa_available() < 0) {
        printf("NUMA is not available on this system.\n");
        return 1;
    }
    numa_set_strict(1);

    // Allocate NUMA-aware memory on the local node
    size_t size = 1024 * 1024 * 100;  // 100 MB
    void* memory = numa_alloc_local(size);
    if (memory == NULL) {
        printf("Failed to allocate local memory.\n");
        return 1;
    }

    // Get the number of configured CPUs
    int num_cpus = numa_num_configured_cpus();

    // Build the CPU set of node 0 and pin the task to it
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    struct bitmask* node_cpus = numa_allocate_cpumask();
    if (numa_node_to_cpus(0, node_cpus) != 0) {
        printf("Failed to query the CPUs of node 0.\n");
        return 1;
    }
    for (int cpu = 0; cpu < num_cpus; cpu++) {
        if (numa_bitmask_isbitset(node_cpus, cpu)) {
            CPU_SET(cpu, &cpuset);
        }
    }
    numa_free_cpumask(node_cpus);
    if (sched_setaffinity(0, sizeof(cpu_set_t), &cpuset) != 0) {
        printf("Failed to set CPU affinity.\n");
        return 1;
    }

    // Do the work here, accessing local memory
    // ...

    // Free the NUMA-allocated memory
    numa_free(memory, size);

    return 0;
}

In the code example above, numa_alloc_local allocates NUMA-aware memory on the local node, numa_node_to_cpus fills in the CPU bitmask of node 0, and sched_setaffinity pins the task to that node's CPU set.

Please note that the code example is for demonstration purposes only; real NUMA optimization requires more detailed setup and tuning based on the system architecture and application requirements. Make sure to check whether your system supports NUMA before using NUMA-related functions, and link against the NUMA library when compiling (using the -lnuma option).

In addition, different operating systems and programming languages may provide different NUMA-related functions and interfaces, so the specific code implementation may vary. Consult the relevant documentation of your operating system and programming language for more detailed information and sample code.

Atomic operations

In the Linux kernel, atomic operations are a mechanism for ensuring the atomicity of operations on data shared between multiple threads or processes. An atomic operation is uninterruptible: it either executes completely or not at all, never partially. The Linux kernel provides a variety of functions and macros for atomic operations; the commonly used ones include the following (a short usage sketch follows the list):

  1. The atomic_t type: atomic_t is an integer type used to implement atomic operations. Use the atomic_set function to set the initial value of an atomic_t variable and the atomic_read function to read its current value.

  2. atomic_add and atomic_sub: these two functions perform atomic addition and subtraction on an atomic_t variable. For example, atomic_add(5, &my_atomic_var) atomically adds 5 to my_atomic_var.

  3. atomic_inc and atomic_dec: these two functions perform atomic increment and decrement on an atomic_t variable. For example, atomic_inc(&my_atomic_var) atomically increments my_atomic_var by 1.

  4. atomic_cmpxchg: this function performs a compare-and-exchange operation. It takes three parameters: a pointer to an atomic_t variable, the expected old value, and the new value to set. If the current value of the atomic_t variable equals the expected old value, the new value is stored and the old value is returned; otherwise nothing is stored and the current value is returned.

  5. atomic_xchg: this function performs an exchange operation. It takes two parameters: a pointer to an atomic_t variable and the new value to set. It atomically stores the new value into the atomic_t variable and returns the old value.
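
As a minimal kernel-style sketch (the counter name and the particular sequence of values are illustrative assumptions), the following snippet exercises each of the operations listed above:

#include <linux/atomic.h>
#include <linux/printk.h>

static atomic_t counter = ATOMIC_INIT(0);   /* hypothetical counter */

static void atomic_demo(void)
{
    int old;

    atomic_set(&counter, 1);   /* value: 1 */
    atomic_inc(&counter);      /* 1 -> 2 */
    atomic_add(5, &counter);   /* 2 -> 7 */
    atomic_sub(3, &counter);   /* 7 -> 4 */
    atomic_dec(&counter);      /* 4 -> 3 */

    /* Store 10 only if the current value is still 3; the value
     * observed before the exchange (here 3) is returned. */
    old = atomic_cmpxchg(&counter, 3, 10);

    /* Unconditionally store 0 and fetch the previous value (10). */
    old = atomic_xchg(&counter, 0);

    pr_info("last old value %d, final value %d\n",
            old, atomic_read(&counter));
}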

These functions and macros provide a way to perform atomic operations in a multi-threaded or multi-process environment to ensure data consistency and correctness. When writing concurrent code, use atomic operations to avoid race conditions and data conflicts.

Please note that the above only lists some commonly used atomic operation functions and macros. The Linux kernel provides more atomic operation functions and mechanisms. You can choose the appropriate functions and macros to use according to specific needs. It is recommended to consult the relevant documentation and source code of the Linux kernel for more detailed information and example usage.
