An article explaining Linux thread programming - thread principles, thread programming, etc., with rich examples

Table of contents

Linux threads

Introduction to each API prototype

Routines using these APIs


Linux threads


Multithreading means that several threads run concurrently within a single program/process (they can be regarded as multiple tasks sharing the same memory), each performing a different task. Its main advantages:

  • Higher operating efficiency, parallel execution;

  • Multithreading is a modular programming model;

  • Compared with processes, the creation and switching overhead of threads is smaller;

  • Easy communication;

  • It can simplify the structure of the program and facilitate understanding and maintenance; higher resource utilization.

Multi-threaded application scenarios:

  1. If there are operations that need to be waited for in the program, such as network operations, file IO, etc., multi-threading can be used to fully utilize the processor resources without blocking the execution of other tasks in the program.

  2. When there are large decomposable tasks in the program, such as long-time computing tasks, multi-threads can be used to complete the tasks together and shorten the computing time.

  3. There are tasks in the program that need to be run in the background, such as some monitoring tasks and scheduled tasks, which can be completed by using multi-threads.

Thread safety and thread synchronization:

Thread safety: When multiple threads access shared data, a locking mechanism protects it. While one thread is accessing the data, other threads cannot access it until that thread has finished, so there is no data inconsistency or data pollution.

Thread-unsafe: No access protection is provided, so multiple threads may modify the data one after another and leave dirty data behind. If multiple threads read and write a shared variable at the same time, data inconsistency can occur.

Thread safety issues are mostly caused by shared, writable data such as global variables and static variables.

If every thread only reads a global or static variable and none writes it, the variable is generally thread-safe. If multiple threads may write it at the same time, thread synchronization generally needs to be considered; otherwise thread safety may be affected.

Thread synchronization: While one thread is operating on a memory address, no other thread may operate on it. Other threads must wait until that thread has completed its operation.

Thread asynchronous: While waiting on one resource, a thread does not sit idle but goes on to access other resources; this is one of the things a multi-threading mechanism makes possible.

Synchronization: Thread A requests a resource that thread B is currently using. Because of the synchronization mechanism, thread A cannot obtain it and can only wait.

Asynchronous: Thread A wants to request a resource, but this resource is being used by thread B. Because there is no synchronization mechanism, thread A can still request it, and thread A does not need to wait.

Advantages and disadvantages of thread synchronization:

Benefits: It solves the thread safety problem.

Disadvantages: The lock has to be checked on every access, which reduces efficiency.

But between safety and efficiency, the first consideration is safety.

Multi-threaded development in C on Linux uses a thread library called pthread. The API lets the caller request kernel-level (system scope) or user-level (process scope) threads through attributes passed in when a thread is created, although, as discussed later, current Linux implementations only provide system-level threads.

pthread is the POSIX standard thread library rather than part of the C language itself, and on Linux it has traditionally been shipped as a separate library, so the compile command needs -lpthread (or -pthread) to link it explicitly. Example: gcc xxx.c -lpthread -o xxx.bin.

Introduction to each API prototype

  • pthread_self() - Get the thread ID.

    /* pthread_self()——Function to get thread ID
        #include <pthread.h>
        pthread_t pthread_self(void);
        Success: Return thread number
    */
    #include <pthread.h>
    #include <stdio.h>
    int main()
    {
        pthread_t tid = pthread_self();
        printf("tid = %lu\n",(unsigned long)tid);
        return 0;
    }
  • pthread_create()——Thread creation.

    /* pthread_create()——Thread creation
        #include <pthread.h>
        int pthread_create(pthread_t *thread, const pthread_attr_t *attr,void *(*start_routine) (void *), void *arg);
        The first parameter is a pthread_t pointer that receives the thread ID of the new thread.
        The second parameter specifies the thread's attributes; pass NULL for the default attributes.
        The third parameter is a function pointer to the function the thread executes. That function returns void* and takes a single void* parameter.
        The fourth parameter is the argument passed to the thread function; pass NULL if there is none.
        Returns 0 on success; on failure it returns a positive error number (for example EAGAIN) and does not set errno.
    */
    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <errno.h>
    void *fun(void *arg) {    printf("pthread_New = %lu\n",(unsigned long)pthread_self());    return NULL;    }
    int main()
    {
        pthread_t tid1;
        int ret = pthread_create(&tid1,NULL,fun,NULL);
        /* ... Simplified, error handling omitted */
        
        /* tid_main is the main thread's ID obtained through pthread_self; tid_new is the value stored in tid1 by a successful pthread_create */
        /* That is, the printed tid_new and pthread_New values should be the same */
        printf("tid_main = %lu tid_new = %lu \n",(unsigned long)pthread_self(),(unsigned long)tid1);
        /* Thread execution order is not deterministic. Without sleep, the main thread may reach return first; the process then ends and the child thread fun may never get the chance to run */
        sleep(1);    return 0;
    }
    /*
    pthread_create does create a thread: the ID stored in tid1 by the main thread matches the ID that the child thread prints via pthread_self.
    Note that when main returns, the process ends and every thread it created ends immediately without running any further. The execution order of threads is also competitive, with no guarantee of which one runs first. Comment out the sleep call in the code above and observe the result.
    */

    Pass in parameters when creating the thread:

    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <errno.h>
    void *fun1(void *arg){    printf("%s:arg = %d Addr = %p\n",__FUNCTION__,*(int *)arg,arg);    return NULL;    }
    void *fun2(void *arg){    printf("%s:arg = %d Addr = %p\n",__FUNCTION__,(int)(long)arg,arg);    return NULL;    }
    int main()
    {
        pthread_t tid1,tid2;    int a = 50;
        int ret = pthread_create(&tid1,NULL,fun1,(void *)&a); /* Pass in address*/
        /* ... Simplified, error handling omitted */
        ret = pthread_create(&tid2,NULL,fun2,(void *)(long)a); /* Pass in value*/
        /* ... Simplified */
        sleep(1);    printf("%s:a = %d Add = %p \n",__FUNCTION__,a,&a);    return 0;
    }
  • pthread_exit / pthread_cancel and pthread_join / pthread_tryjoin_np - Exit of the thread.

    There are three exit situations for threads:

    • The first is that the process ends, and all threads in the process will also end.

    • The second is to actively exit the thread through the function pthread_exit().

    • The third is passive exit: another thread calls pthread_cancel() on it.

    Regarding resource recycling after thread exit:

    Threads in a process share its data segments. If a thread is joinable (non-detached), the resources it occupies are not released when it terminates. Another thread, usually the main thread, must call pthread_join/pthread_tryjoin_np after the thread has ended to reclaim those resources and to obtain the data the thread returned. If a thread is unjoinable (detached), its resources are reclaimed automatically when it exits, and there is no need to call pthread_join/pthread_tryjoin_np in the main thread; in that case, however, the thread cannot pass out a return value. Whether a thread is joinable or unjoinable can be set, which is discussed in the thread attributes section later.

    Regarding the exit of the main thread/process:

    • If main returns or exit() is called in the main thread, the main thread exits and the entire process terminates, along with every thread in it, so avoid letting the main function finish prematurely.

    • Calling the exit() function in any thread will cause the process to end. Once the process ends, all threads in the process will end.

    The following describes the exit-related thread APIs: pthread_exit / pthread_cancel and pthread_join / pthread_tryjoin_np.

    /*
    The thread actively exits: pthread_exit
        #include <pthread.h>
        void pthread_exit(void *retval);
        pthread_exit is the thread exit function. When exiting, the thread can pass a void* value out to whichever thread joins it; pass NULL if there is nothing to return.
        The only parameter, retval, is the thread's return value. If the second parameter of pthread_join (also called retval) is not NULL, this value is stored in the location it points to. It must not point to a local variable on the exiting thread's stack.
    The thread passively exits: pthread_cancel, which another thread calls to make a thread exit.
        #include <pthread.h>
        int pthread_cancel(pthread_t thread);
        Success: return 0
        This function takes a tid and sends a cancellation request to that thread. With the default settings the target thread exits when it next reaches a cancellation point (for example sleep or read). It returns 0 if the request was issued successfully.
    Thread resource recycling (blocks at the pthread_join call until the target thread exits)
        #include <pthread.h>
        int pthread_join(pthread_t thread, void **retval);
        This function recycles a thread. It blocks and does not return until the thread has been recycled. The first parameter is the tid of the thread to recycle; the second receives the value the thread returned, or PTHREAD_CANCELED if the thread was canceled.
        
    Thread resource recycling (non-blocking, must be polled in a loop)
        #define _GNU_SOURCE            
        #include <pthread.h>
        int pthread_tryjoin_np(pthread_t thread, void **retval);
        This is the non-blocking recycling function. Its return value tells whether the thread was recycled: 0 if recycling succeeded, EBUSY if the thread has not terminated yet. The parameters are the same as for pthread_join.
    The difference in usage between blocking mode pthread_join and non-blocking mode pthread_tryjoin_np:
        Recycling with the blocking pthread_join effectively fixes the order in which threads are recycled. If the first thread to be recycled has not exited, the caller stays blocked, and threads that exited earlier cannot be recycled promptly.
        With the non-blocking pthread_tryjoin_np, resources can be recycled in whatever order the threads exit.
    */
  • Thread properties related:

    References: "pthread_attr_init thread attribute" (high driver's blog, CSDN blog); "Detailed introduction to the thread attribute pthread_attr_t" (Robin Hu's column, CSDN blog).

    /* Define pthread_attr_t thread attribute variable, used to set thread attributes, mainly including scope attribute (used to distinguish user mode or kernel mode), detach (detachable/joinable) attribute, stack address, stack size, priority*/
    pthread_attr_t attr_1,attr_2_3_4[3];
    /* First call pthread_attr_init to default initialize the thread attribute variables, and then call the pthread_attr_xxx class function to change its value*/
    pthread_attr_init(&attr_1);
    /* For example (only one example; there are many APIs for setting attributes)
        pthread_attr_setdetachstate(&attr_1,PTHREAD_CREATE_DETACHED); sets the detach attribute of the thread (the pthread_detach function sets the same property on an existing thread)
        By default, threads are joinable (non-detached). In this case the system resources occupied by a thread are released only after it has terminated and the main thread's pthread_join() call has returned. If the thread is set to the unjoinable (detached) state, its resources are reclaimed automatically when it exits, and the main thread does not need to call pthread_join to wait for it.
        You can use the pthread_attr_setdetachstate function to set the thread attribute detachstate to one of the following two legal values: set to PTHREAD_CREATE_DETACHED to start the thread in a detached state (unjoinable); or set to PTHREAD_CREATE_JOINABLE to start the thread normally (joinable, the default).
        You can use the pthread_attr_getdetachstate function to obtain the current detachstate thread attribute.
        In addition, it is generally not recommended to actively change the priority of a thread.
        The reference links on thread attributes above introduce more attribute-setting APIs, including inheritance, scheduling policy (two options plus other), and scheduling parameters.
    */
    /* Here the attribute is set to request either a system-level thread or a user-level thread */
    pthread_attr_setscope(&attr_1, PTHREAD_SCOPE_SYSTEM); /* System-level thread, suitable for intensive calculations*/
    pthread_attr_setscope(&attr_2_3_4[0], PTHREAD_SCOPE_PROCESS); /* User-level thread, suitable for IO intensive; Linux does not support this scope and returns ENOTSUP */
    /* Then use this attribute to create a thread (tid, fn and arg are declared elsewhere) */
    pthread_create(&tid, &attr_1, fn, arg);
    /* Destroy the attribute object once it is no longer needed; to reuse it, call pthread_attr_init again and then set it */
    pthread_attr_destroy(&attr_1);

    Thread competition: see "Linux thread implementation & LinuxThread vs. NPTL & user-level kernel-level threads & threads and signal processing" (blcblc, Blog Park, cnblogs.com) and "Linux process analysis" (deep_explore's blog, CSDN Blog).

    System-level threads compete for time slices with the threads of all other processes, while user-level threads only compete for scheduling with the other user-level threads inside the same process.

    Since Linux 2.6, pthreads are implemented by NPTL, which supports POSIX better. NPTL uses the 1:1 model: every thread is a system-level thread backed by its own kernel scheduling entity (in a 1:n model, by contrast, n threads compete with each other inside a single process). When pthread_create() calls clone(), CLONE_VM is set, so from the kernel's perspective the result is two processes that share the same memory space. Creating a new thread in user space therefore creates a new process in the kernel.

    Therefore, Linux is a multi-tasking, multi-threaded operating system. But the threading mechanism it implements is very unique. From a kernel perspective, it does not have the concept of threads. Linux implements all threads as processes, and a thread is simply considered a process that shares certain resources with other processes. Processes and threads have their own task_struct, and there is no difference between the two in the kernel's eyes.

  • etc.

Routines using these APIs

/* File: Examples of basic thread API\routines of thread API-passive recycling under Linux.c */
#define _GNU_SOURCE 
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
/* Example description:
Create system-level thread 1, no incoming and outgoing parameters, infinite loop, exit under certain conditions, use pthread_join to recycle
Create three user-level threads 2, 3, and 4. Use an array for the thread number, pass in and out parameters, and use pthread_tryjoin_np to recycle
Compile under the gnu compiler in the Linux environment: gcc temp.c -lpthread -o temp.bin
*/
/* Used to set thread attributes, mainly including scope attribute (used to distinguish user mode or kernel mode), detach (detachable/joinable) attribute, stack address, stack size, priority*/
pthread_attr_t attr_1, attr_2_3_4[3];
/* Thread identifiers (thread numbers) used to distinguish threads. They are only valid within this process. With glibc on Linux the underlying type is unsigned long int */
pthread_t id_1, id_2_3_4[3];
/* Thread 1: no incoming or outgoing parameters, loops until a condition is met and then exits; recycled with pthread_join */
void *thread_1(void *in_arg)
{
    int i = 0;
    printf("thread_1 ID = %lu\n", (unsigned long)pthread_self());
    for(;;)
    {
        printf("thread_1 print times = %d\n", ++i);
        if(i >= 3)
        {
            /* pthread_exit() ends the thread and passes out its return value; the resources of a joinable thread are not released until it is joined */
            pthread_exit(NULL);
        }
        sleep(1); /* sleep() takes seconds; the thread sleeps for 1 second */
    }
}
/* Threads 2, 3 and 4: in_arg is a heap buffer allocated by main; each thread writes a string into it and returns it */
void *thread_2_3_4(void *in_arg)
{
    /* The returned pointer must stay valid after the thread has exited, so it must not point to an ordinary local variable of this function */
    /* in_arg points to memory allocated by main, so it can be returned directly. A shared static variable is deliberately not used: threads 2, 3 and 4 would all write to it, and we would not know which one wrote last */
    pthread_t self_id = pthread_self();
    /* pthread_t is an opaque type, so compare IDs with pthread_equal rather than == */
    if(pthread_equal(self_id, id_2_3_4[0]))
    {
        printf("thread_2 ID = %lu\n", (unsigned long)self_id);
        sprintf((char*)in_arg,"id_2 gagaga");
    }else if(pthread_equal(self_id, id_2_3_4[1]))
    {
        printf("thread_3 ID = %lu\n", (unsigned long)self_id);
        sprintf((char*)in_arg,"id_3 lalala");
    }else if(pthread_equal(self_id, id_2_3_4[2]))
    {
        printf("thread_4 ID = %lu\n", (unsigned long)self_id);
        sprintf((char*)in_arg,"id_4 hahaha");
    }else
    {
        pthread_exit(NULL);
    }
    pthread_exit((void*)in_arg);
}
int main(void)
{
    int ret = -1, i = 0, return_thread_num = 0;
    char *str_gru[3];
    void *exit_arg = NULL;
    pthread_attr_init(&attr_1);
    pthread_attr_setscope(&attr_1, PTHREAD_SCOPE_SYSTEM); /* System-level thread*/
    
    for(i = 0;i < 3;i++)
    {
        pthread_attr_init(&attr_2_3_4[i]);
        pthread_attr_setscope(&attr_2_3_4[i], PTHREAD_SCOPE_PROCESS); /* Request a user-level thread; Linux returns ENOTSUP here and the attribute keeps system scope */
    }
    /* Create thread 1 */
    ret = pthread_create(&id_1, &attr_1, thread_1, NULL);
    if(ret != 0)
    {
        /* pthread functions return the error number instead of setting errno, so copy it into errno first. perror then prints the given message followed by the description of errno to standard error stderr */
        errno = ret;
        perror("pthread1, pthread_create");
        return -1;
    }
    
    /* Create threads 2, 3, 4 */
    for(i = 0;i < 3;i++)
    {
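        str_gru[i] = (char*)malloc(sizeof(char) * 42 + i);
        if(str_gru[i] == NULL)
        {
            perror("malloc");
            return -1;
        }
        ret = pthread_create(&id_2_3_4[i], &attr_2_3_4[i], thread_2_3_4, (void *)str_gru[i]);
        if(ret != 0)
        {
            errno = ret; /* pthread_create returns the error number rather than setting errno */
            perror("pthread 2 3 4, pthread_create");
            return -1;
        }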
    }
    
    /* Wait for all threads to end: first recycle threads 2, 3 and 4 in whatever order they exit, then wait for thread 1 to exit */
    {
        int joined[3] = {0, 0, 0}; /* A thread may be joined only once; joining it again is undefined behavior */
        while(return_thread_num < 3)
        {
            for(i = 0;i < 3;i++)
            {
                /* The np in pthread_tryjoin_np means non-portable: it is a GNU extension, not part of the POSIX standard */
                if(!joined[i] && pthread_tryjoin_np(id_2_3_4[i], &exit_arg) == 0)
                {
                    printf("pthread : %lu exit with str: %s\n", (unsigned long)id_2_3_4[i], exit_arg ? (char*)exit_arg : "(null)");
                    free(str_gru[i]);
                    joined[i] = 1;
                    return_thread_num++;
                }
            }
            usleep(1000); /* Pause briefly between polls instead of spinning at full speed */
        }
    }
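    pthread_join(id_1, NULL);
    /* Release the attribute objects now that all threads have been created and recycled */
    pthread_attr_destroy(&attr_1);
    for(i = 0;i < 3;i++) pthread_attr_destroy(&attr_2_3_4[i]);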
    return 0;
}

Origin blog.csdn.net/Staokgo/article/details/132630744