Linux multi-threaded programming learning experience




Preface

Why have threads?
        Suppose you own a factory with a single production line, and demand now exceeds supply, so production has to scale up. The process approach is to build a second factory and copy the existing production line into it. The alternative is to add a few more production lines inside the original factory, which is a much smaller expansion. That second approach is the thread approach.




1. Introduction to Linux threads

1. Processes and threads

        A typical UNIX/Linux process can be viewed as having a single thread of control: the process does only one thing at a time. With multiple threads of control, a program can be designed so that the process does more than one thing at a time, with each thread handling a separate task. A process is an instance of a program in execution and is the basic unit to which system resources (CPU time, memory, and so on) are allocated. In a system designed around threads, the process is not the basic unit of execution but a container for threads. The program itself is only a description of instructions, data, and their organization; the process is the running instance of that program.

        A thread is the smallest unit that the operating system can schedule. It lives inside a process and is the unit that actually executes there. A thread is a single sequential flow of control within a process; a process can run several threads concurrently, each carrying out a different task. A thread holds the information needed to represent an execution context within the process: a thread ID that identifies it within the process, a set of register values, a stack, a scheduling priority and policy, a signal mask, an errno variable, and thread-specific data. Everything else in the process is shareable among its threads, including the text of the executable program, the program's global and heap memory, the stacks, and the file descriptors.

"Process - the smallest unit of resource allocation, thread - the smallest unit of program execution"

        A process has its own address space, so under protected mode a crash in one process does not affect other processes. Threads are just different execution paths within one process: each thread has its own stack and local variables, but no separate address space. A fatal error in one thread (a segmentation fault, for example) therefore takes down the entire process. Multi-process programs are thus more robust than multi-threaded ones, but switching between processes costs more resources and is less efficient. For concurrent operations that must run at the same time and share variables, threads are usually the more natural choice, because processes have to set up explicit shared memory or another IPC mechanism to do the same.

2. Reasons for using threads

        The differences between processes and threads are exactly the reasons we use threads. In general, a process has its own address space, while a thread does not: the threads of one process share that process's address space.

        The first reason for using multithreading is that, compared with processes, it is a very "frugal" way to multitask. On Linux, starting a new process means giving it its own address space and building numerous data tables to maintain its code segment, stack segment, and data segment, which makes it an "expensive" way to multitask. Threads running in one process share the same address space and most of its data, so starting a thread takes much less space than starting a process, and switching between threads takes much less time than switching between processes.

        The second reason is the convenient communication between threads. Different processes have separate data spaces, so data can only be passed between them through inter-process communication, which is both time-consuming and inconvenient. Threads in the same process share the data space, so data produced by one thread is directly available to the others, which is both fast and convenient. Data sharing brings problems of its own, though. Some variables must not be modified by two threads at the same time, and data declared static inside a subroutine is especially likely to cause catastrophic failures in a multi-threaded program. These are the points that need the most attention when writing multi-threaded programs.

Compared with processes, multithreading as a way of working on several tasks concurrently also has the following advantages:

  • Improve application responsiveness. This matters most for graphical programs. When an operation takes a long time, the whole program waits for it and stops responding to keyboard, mouse, and menu input. Moving the time-consuming operation into a new thread avoids this embarrassing situation.
  • Make multi-CPU systems more efficient. When the number of threads is not greater than the number of CPUs, the operating system can run different threads on different CPUs.
  • Improve program structure. A long and complex process can be divided into multiple threads and become several independent or semi-independent running parts. Such a program will be easier to understand and modify.





2. Overview of thread development API on Linux

 Multi-threaded development on Linux is supported by the mature pthread library. The most basic concepts are threads, mutexes, and condition variables. Thread operations come in three kinds: creation, exit, and waiting. A mutex has four operations: creation, destruction, locking, and unlocking. A condition variable has five: creation, destruction, signaling, broadcasting, and waiting.

Please see the table below for details:

1. API related to the thread itself 

        1. Thread creation       

#include <pthread.h>
int pthread_create(pthread_t *restrict tidp, const pthread_attr_t *restrict attr, void *(*start_rtn)(void *), void *restrict arg);
// Returns: 0 on success, an error number otherwise

        When pthread_create returns successfully, the memory location pointed to by tidp is set to the thread ID of the newly created thread. The attr parameter customizes various thread attributes; pass NULL for now to create a thread with the default attributes.

        The new thread starts running at the address of the start_rtn function, which takes a single untyped pointer argument, arg. To pass more than one argument to start_rtn, store them in a structure and pass the address of that structure as arg. The new thread can then read and modify whatever arg points to, so that memory must remain valid for as long as the thread uses it.

void *func1(void *arg);   /* thread start routine, defined elsewhere */

int param = 100;
pthread_t t1;
int ret = pthread_create(&t1,NULL,func1,(void *)&param);   /* ret is 0 on success */

        2. Thread exits

        A single thread can exit in three ways, stopping its control flow without terminating the entire process:

  1) The thread just returns from the startup routine, and the return value is the thread's exit code.

  2) Threads can be canceled by other threads in the same process.

  3) The thread calls pthread_exit.

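#include <pthread.h>
void pthread_exit(void *rval_ptr);
// rval_ptr is an untyped pointer, similar to the single argument passed to the start routine.
// Other threads in the process can access this pointer by calling pthread_join.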

        3. Thread waiting

#include <pthread.h>
int pthread_join(pthread_t thread, void **rval_ptr);
// Returns: 0 on success, an error number otherwise

        The calling thread blocks until the specified thread calls pthread_exit, returns from its start routine, or is canceled. If the thread simply returned from its start routine, rval_ptr will contain the return code. If the thread was canceled, the memory location specified by rval_ptr is set to PTHREAD_CANCELED.

        Calling pthread_join automatically places the joined thread in the detached state so that its resources can be recovered. If the thread is already detached, pthread_join can fail and return EINVAL, although this behavior is implementation-specific.

        If you are not interested in the thread's return value, you can set rval_ptr to NULL. In this case, calling the pthread_join function will wait for the specified thread to terminate, but will not obtain the thread's termination status.

  4. Thread detachment

        A thread is either joinable (the default) or detached. When a joinable thread terminates, its thread ID and exit status are retained until another thread calls pthread_join on it. Detached threads are like daemons: when they terminate, all of their resources are released, and we cannot wait for them to terminate. If one thread needs to know when another thread terminates, it is best to leave the second thread joinable.

The pthread_detach function changes the specified thread to the detached state.

#include <pthread.h>
int pthread_detach(pthread_t thread);
// Returns: 0 on success, an error number otherwise

This function is usually used by a thread that wants to detach itself, as in the following statement:

pthread_detach(pthread_self());

        5. Thread ID acquisition and comparison

#include <pthread.h>
pthread_t pthread_self(void);
// Returns: the thread ID of the calling thread

 For thread ID comparison, for portable operations, we cannot simply treat thread IDs as integers, because different systems may define thread IDs differently. We should use the following function:

#include <pthread.h>
int pthread_equal(pthread_t tid1, pthread_t tid2);
// Returns: nonzero if equal, 0 otherwise

  In multi-threaded programs we often need to synchronize the threads. Here, synchronization means allowing only one thread to access a given resource during a given period, while other threads are kept out. Access to a resource can be synchronized with mutexes, condition variables, and reader-writer locks.

2. API related to mutex lock

        A mutex is essentially a lock. It is locked before a shared resource is accessed and released after the access is complete. While the mutex is locked, any other thread that tries to lock it blocks until the current thread releases it. If several threads are blocked when the mutex is released, all of them become runnable. The first one to run locks the mutex; the others see that the mutex is still locked and go back to waiting for it to become available again. In this way, only one thread at a time can proceed.

        Mutex variables are represented by the pthread_mutex_t data type. A mutex variable must be initialized before using it. You can set it to the constant PTHREAD_MUTEX_INITIALIZER (only for statically allocated mutexes), or you can initialize it by calling the pthread_mutex_init function. If the mutex is allocated dynamically (for example, by calling the malloc function), pthread_mutex_destroy needs to be called before the memory is released.     

 pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;   /* static initialization */
 pthread_mutex_init(&mutex,NULL);                     /* or: dynamic initialization */

  1. Create and destroy mutex locks


#include <pthread.h>
int pthread_mutex_init(pthread_mutex_t *restrict mutex, const pthread_mutexattr_t *restrict attr);
int pthread_mutex_destroy(pthread_mutex_t *mutex);
// Returns: 0 on success, an error number otherwise

        To initialize a mutex with default attributes, simply set attr to NULL.

        2. Locking and unlocking

#include <pthread.h>
int pthread_mutex_lock(pthread_mutex_t *mutex);
int pthread_mutex_unlock(pthread_mutex_t *mutex);
int pthread_mutex_trylock(pthread_mutex_t *mutex);
// Returns: 0 on success, an error number otherwise

        If a thread does not want to be blocked, it can use pthread_mutex_trylock to try to lock the mutex. If the mutex is in an unlocked state when pthread_mutex_trylock is called, then pthread_mutex_trylock will lock the mutex without blocking and return 0. Otherwise, pthread_mutex_trylock will fail and cannot lock the mutex and return EBUSY.

        Using mutexes can lead to deadlock. The typical case needs at least two locks: thread A acquires lock 1, sleeps for 1 second, and then tries to acquire lock 2, while thread B acquires lock 2, sleeps for 1 second, and then tries to acquire lock 1. Each thread now waits for a lock the other holds, and the program cannot make progress. With a single lock this cross-wait cannot happen, although a thread can still deadlock itself by trying to lock the same mutex twice.

3. API related to condition variables

        

        Condition variables are another synchronization mechanism available for threads. Condition variables provide a place for multiple threads to meet. Condition variables, when used with mutexes, allow threads to wait for a specific condition to occur in a race-free manner.

  The condition itself is protected by a mutex. The thread must first lock the mutex before changing the condition state. Other threads will not notice this change until they obtain the mutex, because the mutex must be locked before the condition can be evaluated.

  Condition variables must be initialized before use. A condition variable is represented by the pthread_cond_t data type and can be initialized in two ways. The constant PTHREAD_COND_INITIALIZER can be assigned to a statically allocated condition variable; a dynamically allocated one must be initialized with pthread_cond_init. pthread_cond_destroy deinitializes a condition variable before its memory is freed.

        1. Create and destroy condition variables

#include <pthread.h>
int pthread_cond_init(pthread_cond_t *restrict cond, const pthread_condattr_t *restrict attr);
int pthread_cond_destroy(pthread_cond_t *cond);
// Returns: 0 on success, an error number otherwise

Unless you need a condition variable with non-default attributes, the attr parameter of pthread_cond_init can be set to NULL.

        2. Wait

#include <pthread.h>
int pthread_cond_wait(pthread_cond_t *restrict cond, pthread_mutex_t *restrict mutex);
int pthread_cond_timedwait(pthread_cond_t *restrict cond, pthread_mutex_t *restrict mutex, const struct timespec *restrict timeout);
// Returns: 0 on success, an error number otherwise

  pthread_cond_wait waits for a condition to become true. Its variant, pthread_cond_timedwait, returns an error code if the condition is not met within the given time. The mutex passed to pthread_cond_wait protects the condition, and the caller passes it to the function already locked. The function puts the calling thread on the list of threads waiting for the condition and unlocks the mutex, and it does both as a single atomic step. This closes the window between checking the condition and going to sleep, so the thread cannot miss a change in the condition. When pthread_cond_wait returns, the mutex is locked again.

  pthread_cond_timedwait works like pthread_cond_wait but takes an additional timeout argument. The timeout is given as a timespec structure and is an absolute time rather than a relative one: to wait for 3 seconds, add 3 seconds to the current time. If the timeout expires before the condition occurs, the function reacquires the mutex and returns ETIMEDOUT.

        3. Trigger

#include <pthread.h>
int pthread_cond_signal(pthread_cond_t *cond);
int pthread_cond_broadcast(pthread_cond_t *cond);
// Returns: 0 on success, an error number otherwise

        These two functions notify threads that a condition has been met. pthread_cond_signal wakes up at least one thread waiting on the condition, and pthread_cond_broadcast wakes up all threads waiting on it.

  Note that the signal must be sent after the condition state has been changed.

3. Example

Example 1: Create a simple thread : The following program creates a simple thread t1, which is responsible for printing the variable param in the thread.

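#include <stdio.h>
#include <pthread.h>

/* Thread start routine: prints its own thread ID and the int passed via arg.
   Casting pthread_t to unsigned long works on Linux but is not portable. */
void *func1(void *arg)
{
        printf("t1:%lu pthread create success\n",(unsigned long)pthread_self());
        printf("t1:param is %d\n",*((int *)arg));

        return NULL;
}

int main()
{
        int ret;
        int param = 100;
        pthread_t t1;

        ret = pthread_create(&t1,NULL,func1,(void *)&param);
        if(ret == 0){
                printf("main:create t1 success\n");
        }
        printf("main:%lu\n",(unsigned long)pthread_self());

        /* main returns without waiting for t1, so t1 may never get to run */
        return 0;
}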

Program running results:

The output of t1 never appears: the process ended before thread t1 ran.

  The reason is a race between the main thread and the new thread. The main thread has to wait for the new thread to finish before it exits. If it does not wait, it may return from main first, which terminates the entire process before the new thread has had a chance to run. Whether this happens depends on the operating system's thread implementation and scheduling.

Example 2: To solve the problem above, call pthread_join(t1,NULL) in the main thread. The main thread blocks at that call until the new thread has finished executing.

#include <stdio.h>
#include <pthread.h>

void *func1(void *arg)
{
        printf("t1:%lu pthread create success\n",(unsigned long)pthread_self());
        printf("t1:param is %d\n",*((int *)arg));

        return NULL;
}

int main()
{
        int ret;
        int param = 100;
        pthread_t t1;

        ret = pthread_create(&t1,NULL,func1,(void *)&param);
        if(ret == 0){
                printf("main:create t1 success\n");
        }
        printf("main:%lu\n",(unsigned long)pthread_self());

        /* block here until t1 has finished */
        pthread_join(t1,NULL);

        return 0;
}
 





The results of running the program are as follows: 

 Example 3: Create a thread that returns a value by calling pthread_exit((void *)&ret);

#include <stdio.h>
#include <pthread.h>

void *func1(void *arg)
{
        static int ret = 10;    /* static, so it is still valid after the thread exits */
        printf("t1:%lu pthread create success\n",(unsigned long)pthread_self());
        printf("t1:param is %d\n",*((int *)arg));

        pthread_exit((void *)&ret);
}

int main()
{
        int *pret;
        int ret;
        int param = 100;
        pthread_t t1;

        ret = pthread_create(&t1,NULL,func1,(void *)&param);
        if(ret == 0){
                printf("main:create t1 success\n");
        }
        printf("main:%lu\n",(unsigned long)pthread_self());

        pthread_join(t1,(void **)&pret);    /* pret now points at func1's static ret */
        printf("main:t1 quit:%d\n",*pret);
        return 0;
}
   

Running result:

 The return value can also be a string.

Example 4: Create multiple threads

#include <stdio.h>
#include <pthread.h>

void *func1(void *arg)
{
        static int ret = 10;
        printf("t1:%lu pthread create success\n",(unsigned long)pthread_self());
        printf("t1:param is %d\n",*((int *)arg));

        pthread_exit((void *)&ret);
}

void *func2(void *arg)
{
        static int ret = 10;
        printf("t2:%lu pthread create success\n",(unsigned long)pthread_self());
        printf("t2:param is %d\n",*((int *)arg));

        pthread_exit((void *)&ret);
}

int main()
{
        int *pret;
        int ret;
        int param = 100;
        pthread_t t1;
        pthread_t t2;

        ret = pthread_create(&t1,NULL,func1,(void *)&param);
        if(ret == 0){
                printf("main:create t1 success\n");
        }
        ret = pthread_create(&t2,NULL,func2,(void *)&param);
        if(ret == 0){
                printf("main:create t2 success\n");
        }
        printf("main:%lu\n",(unsigned long)pthread_self());

        /* print each exit value right after its join, before pret is overwritten */
        pthread_join(t1,(void **)&pret);
        printf("main:t1 quit:%d\n",*pret);
        pthread_join(t2,(void **)&pret);
        printf("main:t2 quit:%d\n",*pret);
        return 0;
}

Running result:

 The output differs between two runs, which shows that the two threads race with each other: we cannot control which thread runs first. One way to deal with this is to introduce a mutex.

Example 5: Thread 1 counts to 3 and then exits. A mutex is best used together with a condition variable, so the following mutex-only example is not ideal.

#include <stdio.h>
#include <unistd.h>     /* sleep */
#include <pthread.h>

pthread_mutex_t mutex;
int g_data = 0;

/* t1 holds the mutex for its whole counting loop and exits once g_data reaches 3 */
void *func1(void *arg)
{
        printf("t1:%lu pthread create success\n",(unsigned long)pthread_self());
        printf("t1:param is %d\n",*((int *)arg));

        pthread_mutex_lock(&mutex);
        while(1){
                printf("t1:%d\n",g_data++);
                sleep(1);
                if(g_data >= 3){        /* >= in case t2 already counted before t1 got the lock */
                        pthread_mutex_unlock(&mutex);
                        pthread_exit(NULL);
                }
        }
}

/* t2 increments g_data once per second, taking the mutex for each access */
void *func2(void *arg)
{
        printf("t2:%lu pthread create success\n",(unsigned long)pthread_self());
        printf("t2:param is %d\n",*((int *)arg));

        while(1){
                pthread_mutex_lock(&mutex);
                printf("t2:%d\n",g_data);
                g_data++;
                pthread_mutex_unlock(&mutex);
                sleep(1);
        }
}

int main()
{
        int ret;
        int param = 100;
        pthread_t t1;
        pthread_t t2;

        pthread_mutex_init(&mutex,NULL);
        ret = pthread_create(&t1,NULL,func1,(void *)&param);
        if(ret == 0){
                printf("main:create t1 success\n");
        }
        ret = pthread_create(&t2,NULL,func2,(void *)&param);
        if(ret == 0){
                printf("main:create t2 success\n");
        }
        printf("main:%lu\n",(unsigned long)pthread_self());

        /* t2 and this loop run until the process is interrupted (Ctrl+C) */
        while(1){
                pthread_mutex_lock(&mutex);
                printf("main:%d\n",g_data);
                pthread_mutex_unlock(&mutex);
                sleep(1);
        }

        /* never reached while the loop above runs forever */
        pthread_join(t1,NULL);
        pthread_join(t2,NULL);
        pthread_mutex_destroy(&mutex);
        return 0;
}





Summary

Reference book "Advanced Programming in UNIX Environment"

Reference: A preliminary study on Linux multi-thread programming - Fengzi_Looking up to the sunshine - Blog Garden (cnblogs.com)

The above is what I have learned about Linux threads. I am still a beginner, so please bear with any mistakes. If it helped you, please like and support it.


Origin blog.csdn.net/qq_44848795/article/details/122012253