Multi-thread control explanation and code implementation

multi-threaded control

Review the concept of threads

A thread is the basic unit of CPU scheduling, while a process is the basic unit of system resource allocation. Linux does not define a dedicated data structure for threads; it reuses the PCB structure (task_struct) directly. For each new thread, a pointer in its task_struct{} refers to the same virtual memory descriptor (mm_struct) as the rest of the process, so the thread shares the process's code and holds part of the process's resources.

In Linux, every execution flow is treated as a lightweight process, so a user-level native thread library is needed to present a "thread" interface to users (from the OS's point of view these threads are lightweight processes).

Seeing the Robustness of Threads from Signals, Exceptions and Resources

An exception occurs in one thread, which will affect other threads

#include <iostream>
#include <string>
#include <pthread.h>
#include <unistd.h>

using namespace std;

void* start_routime(void* args)
{
    string name = static_cast<const char*>(args);// safe explicit type conversion
    // a single execution flow could not run two infinite loops at the same time
    int count = 0;
    while(1)
    {
        cout << "new thread is created! name: " << name << endl;
        sleep(1);
        count++;

        if(count == 5)
        {
            int* p = nullptr;
            *p = 10;// deliberately dereference a null pointer; this triggers a segmentation fault
        }
    }
}

int main()
{
    pthread_t thread;
    pthread_create(&thread, nullptr, start_routime, (void*)"thread new");

    while(1)
    {
        cout << "new thread is created! name: main" << endl;
        sleep(1);
    }

    return 0;
}

Command-line output

[yyq@VM-8-13-centos 2023_03_18_multiThread]$ ./mythread 
new thread is created! name: main
new thread is created! name: thread new
new thread is created! name: main
new thread is created! name: thread new
new thread is created! name: main
new thread is created! name: thread new
new thread is created! name: main
new thread is created! name: thread new
new thread is created! name: main
new thread is created! name: thread new
new thread is created! name: main
Segmentation fault

This shows that when one thread faults, the other threads in the process are directly affected. Signals are handled at the level of the process as a whole: all threads of a process share the same pid and the same signal dispositions. The faulting thread triggers SIGSEGV, and the default action of that signal terminates the entire process, so every thread exits with it.

From another perspective, the resources each thread depends on belong to the process. When a thread faults and the process is terminated by the signal, the OS reclaims the resources of the entire process, so none of its threads can keep running.

The above is to look at the issue of thread robustness from the perspective of signal + exception + resource.

errno of the POSIX thread library

What we are studying is the POSIX thread library, which has the following characteristics

1. Thread-related functions form a complete family, and most of their names start with pthread_;

2. To use these functions, include the header <pthread.h>;

3. When linking against the thread library, pass the -lpthread option to the compiler.

When a pthread function in the user-level thread library fails, it does not set errno (unlike most other POSIX functions); instead it returns the error code as its return value. Threads share the process's global data, and an error indicator shared by every thread would be overwritten without any access control. (In a multi-threaded program errno itself is per-thread, but the pthread functions still do not use it.) Therefore, check for errors in pthread functions through their return values.

Simple understanding of clone

clone lets the user create a process or a lightweight process; fork() and vfork() are implemented by calling clone.

#include <sched.h>
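Purpose: create a process / lightweight process

Prototype
	int clone(int (*fn)(void *), void *child_stack, int flags, void *arg, .../* pid_t *ptid, struct user_desc *tls, pid_t *ctid */ );

Parameters
    child_stack: the child's stack (user stack)
    flags:
Return value
    On failure, returns -1; on success, returns the thread ID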

/* Prototype for the raw system call */

long clone(unsigned long flags, void *child_stack, void *ptid, void *ctid, struct pt_regs *regs);

Create multiple threads

#include <cstdio>   // snprintf
#include <iostream>
#include <string>
#include <vector>
#include <pthread.h>
#include <unistd.h>

using namespace std;

void* start_routime(void* args)
{
    string name = static_cast<const char*>(args);// safe explicit type conversion
    while(1)
    {
        cout << "new thread is created! name: " << name << endl;
        sleep(1);
    }
}

int main()
{
    vector<pthread_t> threadIDs (10, pthread_t());
    for(size_t i = 0; i < 10; i++)
    {
        pthread_t tid;
        char nameBuffer[64];
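        snprintf(nameBuffer, sizeof(nameBuffer), "%s : %zu", "thread", i);// %zu matches size_t
        pthread_create(&tid, nullptr, start_routime, (void*)nameBuffer);// passes the buffer's starting address (the same one for every thread); the order in which the new threads run is not guaranteed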
        threadIDs[i] = tid;
        sleep(1);
    }

    for(auto e : threadIDs)
    {
        cout << e << endl;
    }

    while(1)
    {
        cout << "new thread is created! name: main" << endl;
        sleep(1);
    }
    return 0;
}

Phenomenon: if the sleep statement is removed from the loop that creates the threads, the output may show the same thread name for every thread.

Analysis: each new thread is an independent execution flow. First, when several threads are created, which one runs first is not determined. Second, nameBuffer is shared by all threads: the main thread passes the same buffer address to each of them and keeps overwriting the buffer inside the loop, so every thread reads whatever name the main thread wrote most recently.

Private data for each thread

So if we want each thread to work on its own data, the approach above is wrong. How do we pass the right structure to each thread? Since nameBuffer is one variable at one address, the fix is to pass a different address each time.

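// Used as a plain struct
class ThreadData
{
public:
    pthread_t tid;
    char nameBuffer[64];
};
// The thread function that operates on it
void* start_routime(void* args)// args is itself a copy of the address; a copy is made whether an argument is passed by value or by reference
{
    ThreadData* td = static_cast<ThreadData*>(args);// safe explicit type conversion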
    int cnt = 10;
    while(cnt)
    {
        cout << "new thread is created! name: " << td->nameBuffer << " loop count cnt: " << cnt << endl;
        cnt--;
        sleep(1);
    }
    delete td;
    return nullptr;
}
int main()
{
    vector<ThreadData*> threads;
    for(size_t i = 0; i < 10; i++)
    {
        // td is a pointer; each thread receives a different td, which makes its data private
        ThreadData* td = new ThreadData();
        snprintf(td->nameBuffer, sizeof(td->nameBuffer), "%s : %zu", "thread", i + 1);// %zu matches size_t
        pthread_create(&td->tid, nullptr, start_routime, (void*)td);
        threads.push_back(td);
    }

    for(auto& e : threads)
    {
        cout << "create thread name: " << e->nameBuffer << " tid:" << e->tid << endl;
    }

    int cnt = 7;
    while(cnt)
    {
        cout << "new thread is created! name: main" << endl;
        cnt--;
        sleep(1);
    }
    return 0;
}

By passing each thread a pointer to its own structure allocated with new, every thread gets private data.

Reentrancy

The start_routime() function is executed by 10 threads at the same time, so it is being re-entered while the program runs. From the perspective of variables, the function accesses only local variables and no global variables, so it is a reentrant function. [Strictly speaking it is not reentrant, because cout writes to a file, and there is only one monitor, so output to the monitor may get mixed up.]

A function whose only access to global variables is through atomic operations can also be treated as reentrant in this sense.

independent stack space

Each thread has its own independent stack space

thread ID

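The thread ID (pthread_t) handed out by the native thread library is an address: it points into the memory region where the library places that thread's control structure and independent stack space.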

Thread waiting: pthread_join

join is a blocking wait. Threads must also be waited for; otherwise a problem similar to a zombie process arises: a memory leak. Waiting serves two purposes: 1. obtain the thread's exit information; 2. reclaim the thread's resources. Unlike with a process, there is no exit signal to collect from a thread, because once a thread faults and a signal arrives, the entire process exits.

pthread_join does not deal with abnormal termination; when a thread faults, the fault is handled at the process level.

The return type of start_routime is void*, and the retval parameter of pthread_join() has type void**. The two are connected: pthread_join stores the void* returned by the thread function into the variable that retval points to.

#include <pthread.h>
int pthread_join(pthread_t thread, void **retval);
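Parameters
    thread: thread ID
    retval (output parameter): receives the exit result returned when the thread function ends
Return value
    0 on success; an error code on failure
        
Usage:
void* retval = nullptr;// the pointer value returned by start_routime is stored into retval (a pointer variable)
int n = pthread_join(tid, &retval);
assert(n == 0);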

When the thread terminates, it can return a pointer (such as the address of the heap space, the address of the object, etc.), and can be fetched by the main thread, so that information interaction can be completed.

By comparison, a process can be waited for in blocking or non-blocking mode, and can set signal(SIGCHLD, SIG_IGN); so that no waiting is needed at all, whereas a thread has no non-blocking wait.

Thread detachment pthread_detach

By default, a newly created thread is joinable. After it exits, pthread_join must be called on it; otherwise its resources are not released, which leaks resources. If you do not care about the thread's return value, joining is a burden. In that case we can tell the system to release the thread's resources automatically when the thread exits.

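Purpose: detach a thread; detached and joinable are mutually exclusive
#include <pthread.h>
Prototype:
	int pthread_detach(pthread_t thread);
Return value:
    0 on success; an error code on failure (returned directly, errno is not set)
// Usage 1: a thread detaches itself
pthread_detach(pthread_self());
// Usage 2: the main thread detaches another thread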

When a thread has detached itself and the main thread then calls pthread_join() on it [the main thread must be made to join only after the detach has executed], the join call returns 22, meaning Invalid argument.

Why must detach execute first? Because the order in which the new thread and the main thread run is not determined. If the new thread has not yet reached its detach call when the main thread is already blocked in join, the detach has no effect on that join.

Purpose: get the ID of the calling thread
#include <pthread.h>
pthread_t pthread_self(void);

Thread termination

Thread exit return/pthread_exit

  1. return nullptr; Returning from the thread function terminates the thread.

  2. pthread_exit(nullptr); pthread_exit() is the dedicated function for thread exit.

    #include <pthread.h>
    void pthread_exit(void *retval);

exit is used to terminate a process, not a thread. Calling exit from any execution flow causes the entire process to exit.

Notice that both return and pthread_exit take a value (nullptr here). This return value is stored in the pthread library.

When another thread later waits with pthread_join, it fetches this value from the pthread library.

Thread cancellation pthread_cancel

A thread can be canceled only if it is already running; the main thread sends a cancel request to the target thread. The exit code received in retval is -1, which is actually a macro: #define PTHREAD_CANCELED ((void*) -1).

Native thread library pthread

Looking at the native thread library from an upper-level perspective (from the language level):

On Linux, a language that wants to provide multithreading generally builds on the pthread library. What about multithreading in C++11? On Linux, C++11 multithreading is essentially a wrapper around the pthread library.

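#include <iostream>
#include <thread>
#include <unistd.h>
using namespace std;

void thread_run()
{
    int cnt = 5;
    while(cnt)
    {
        cout << "I am the new thread" << endl;
        cnt--; // without this decrement the loop would never end
        sleep(1);
    }
}

int main()
{
    thread t1(thread_run);

    int cnt = 5;
    while(cnt) // bounded loop so that t1.join() below is actually reached
    {
        cout << "I am the main thread" << endl;
        cnt--;
        sleep(1);
    }

    t1.join();
    return 0;
}
// Build this with g++; on toolchains where libpthread is a separate library, omitting -lpthread causes a link error, which shows that C++ threads wrap the native thread library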

Code written directly against the native thread library is not cross-platform, but it has no wrapper layer in between; code written with the C++ thread library is portable across platforms, but it goes through that extra layer and may be somewhat less efficient.

The native thread library is a shared library that can be used by many programs at the same time. So how are the threads created by users managed?

The solution on Linux is for the native thread library itself to manage the threads users create. The library only needs to keep a small set of per-thread attributes (the thread ID and so on), for which there is a structure such as union pthread_attr_t{};. Each library-level thread corresponds one-to-one to a lightweight process in the kernel, and the kernel schedules the execution flows. Linux user-level threads : kernel lightweight processes = 1:1.

Thread local storage __thread

Global variables are stored in the initialized data area of the process address space, while a global variable declared with __thread is stored in thread-local storage (inside the thread's structure in the shared region). Each thread gets its own copy of such a variable: it is a storage scheme that sits between global variables and local variables.

Origin blog.csdn.net/m0_61780496/article/details/129766470